Chinese Sci-Tech Core Journal
(China Science and Technology Paper Statistical Source Journal)
  Indexed in Scopus

Petroleum Science Bulletin ›› 2025, Vol. 10 ›› Issue (5): 1056-1068. doi: 10.3969/j.issn.2096-1693.2025.02.027


A forecasting method for gas well production based on large language model (LLM)

CONG Mengze1,2, XUE Liang1,3,*, HAN Jiangxia1,3, MIAO Deyu1,3, LIU Yuetian1,3

  1 State Key Laboratory of Petroleum Resources and Engineering, China University of Petroleum, Beijing 102249, China
    2 College of Artificial Intelligence, China University of Petroleum, Beijing 102249, China
    3 College of Petroleum Engineering, China University of Petroleum, Beijing 102249, China
  • Received: 2025-03-25  Revised: 2025-05-23  Online: 2025-10-15  Published: 2025-10-21
  • Corresponding author: *XUE Liang (b. 1983), PhD, professor and doctoral supervisor; research focus: intelligent reservoir flow theory and oil/gas field development technology; xueliang@dgqinyehang.com
  • First author: CONG Mengze (b. 2001), master's student; research focus: applications of artificial intelligence in oil and gas reservoir development; dreamze_c@student.dgqinyehang.com
  • Funding: National Natural Science Foundation of China (52274048); Beijing Natural Science Foundation (3222037)


Abstract:

Accurate and reliable production forecasting is a critical component for the efficient development of oil and gas fields and supports informed scientific decision-making. Although machine learning methods have achieved significant progress in this domain, existing models are typically trained from scratch using limited historical production data, making it difficult to effectively capture the complex nonlinear dynamics, long-term temporal dependencies, and high-dimensional interactions among variables inherent in production time series. This often leads to insufficient generalization capacity and limited predictive robustness. To address these challenges, this study proposes a novel gas well production forecasting method based on large language models (LLMs). The approach builds upon a pre-trained GPT-2 architecture and incorporates several key adaptations to enable effective time-series prediction. First, the input data—including daily gas production rate, tubing pressure, casing pressure, and production time—are subjected to instance normalization to facilitate knowledge transfer. Second, a trainable embedding layer is designed to map numerical time-series data into the semantic embedding space of the LLM, thereby achieving cross-modal alignment between continuous signals and the discrete representation format required by the model. Third, a parameter-efficient transfer learning strategy combining freezing and fine-tuning is implemented: the core self-attention and feed-forward network layers of the LLM are frozen to preserve general-purpose knowledge acquired during pre-training, while the positional encoding and layer normalization modules are selectively fine-tuned to enhance the model’s ability to characterize temporal patterns specific to production dynamics. The resulting model, termed GPT4TS, is systematically evaluated on real-world production data from a marine carbonate gas reservoir in the Sichuan Basin. 
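The three adaptations described above can be illustrated with a minimal, dependency-free sketch. The helper names (`instance_normalize`, `linear_embed`, `is_trainable`) and the GPT-2 parameter-name patterns (`wpe`, `ln_1`, `attn`, `mlp`, following Hugging Face's naming convention) are assumptions for illustration, not the authors' implementation:

```python
import math

def instance_normalize(window):
    """Step 1: normalize one input window to zero mean / unit variance.
    The (mean, std) statistics are returned so the forecast can later be
    de-normalized back to physical units (e.g. daily gas rate)."""
    mean = sum(window) / len(window)
    var = sum((x - mean) ** 2 for x in window) / len(window)
    std = math.sqrt(var) + 1e-8  # epsilon guards against constant windows
    return [(x - mean) / std for x in window], (mean, std)

def denormalize(window, stats):
    """Invert instance normalization using the saved statistics."""
    mean, std = stats
    return [x * std + mean for x in window]

def linear_embed(patch, weight, bias):
    """Step 2: a trainable linear map taking a patch of numeric
    time-series values into the LLM's d-dimensional embedding space."""
    return [sum(w * x for w, x in zip(row, patch)) + b
            for row, b in zip(weight, bias)]

def is_trainable(param_name):
    """Step 3: the freeze/fine-tune split from the abstract.
    Self-attention and feed-forward weights stay frozen to preserve
    pre-trained knowledge; positional embeddings and layer norms
    are fine-tuned to capture production-specific temporal patterns."""
    if ".attn." in param_name or ".mlp." in param_name:
        return False                            # frozen core blocks
    return (param_name.startswith("wpe")        # positional embeddings
            or ".ln_" in param_name             # per-block layer norms
            or param_name.startswith("ln_f"))   # final layer norm
```

With these rules, a window [10, 12, 14] normalizes to roughly [-1.22, 0, 1.22], `h.0.attn.c_attn.weight` stays frozen, and `wpe.weight` and `h.0.ln_1.weight` remain trainable.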
Experimental results show that for wells with long production histories, GPT4TS significantly outperforms the conventional LSTM model. Under univariate input, the mean absolute percentage error (MAPE) is reduced by 18.573% on average; under multivariate input, the MAPE reduction reaches 35.610%, demonstrating its superior capability in modeling complex trends and leveraging multi-variable synergies. However, for newly commissioned wells with short production histories, insufficient data hinders effective fine-tuning, leading to lower prediction accuracy compared to LSTM. This study not only validates the potential of large language models in petroleum production forecasting but also highlights their strong dependence on historical data length, providing both theoretical insights and practical guidance for model selection in real-world engineering applications.
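The error metric behind the reported numbers can be reproduced in a few lines. `mape` is the standard definition; `relative_reduction` shows one common reading of a percentage "reduction" in MAPE relative to the baseline LSTM. The sample values below are illustrative, not the paper's data:

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.
    Assumes no zeros in `actual` (daily gas rates are positive)."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n

def relative_reduction(baseline, improved):
    """Percentage drop of an error metric relative to a baseline model."""
    return 100.0 * (baseline - improved) / baseline
```

For example, a baseline MAPE of 10% improved to 8% corresponds to a 20% relative reduction.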

Key words: large language model, production prediction, machine learning, time series data, model fine-tuning

CLC number: