
Leveraging AI for Oil & Gas

September 24, 2025
Michael Hirsch
[Image: The Terminator in a pivotal movie scene, all formidable presence and advanced technology.]
No AIs or em dashes were leveraged in the writing of this article.

AI has been a hot topic in Oil & Gas for years now, and this blog is far from the first written about it.  I’m coming at this from the perspective of someone who’s spent significant time trying to extract value from AI in our industry.

In the hype cycle of new technologies, I think the industry is firmly atop the Peak of Inflated Expectations, though I personally am just past the Trough of Disillusionment.  There’s reason for excitement, but that excitement needs to be tempered by a clear view of what AI is good at and what it is not.

Stop saying AI when you mean LLM

I’ll use AI and LLM interchangeably in this article because the Artificial Intelligences that most civilians have access to are Large Language Models.  Technically LLMs are a subset of Deep Learning, which is a subset of Machine Learning, which all fall under the umbrella of AI, but LLMs are far and away the most discussed these days.  What should that mean to you?  If you strip away all the hype and marketing around AI, what you’re left with is a piece of software that puts words in order.  That’s it!  They can’t think; they just have mountains of training text, written mostly by human beings, that allow them to predict what word should come next in a sentence.  Does that dim your hype some?  It should.  But it should also help you form a mental model of how to extract value from this novel technology.
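To make that mental model concrete, here’s a toy next-word predictor built from simple word-pair counts.  Everything in it is made up for illustration (the training sentence especially); real LLMs use neural networks trained on billions of examples, but the core loop of “look at what came before, pick a likely next word” is the same.

```python
# A toy "language model": count which word follows which in some
# training text, then generate by always picking the most likely
# next word.
from collections import Counter, defaultdict

# Made-up training sentence, purely for illustration
training_text = "the well was drilled and the well was completed and the well produced"

follows = defaultdict(Counter)
words = training_text.split()
for current, nxt in zip(words, words[1:]):
    follows[current][nxt] += 1  # e.g. follows["the"]["well"] == 3

# Generate text one word at a time
word, output = "the", ["the"]
for _ in range(6):
    if not follows[word]:
        break  # dead end: nothing ever followed this word in training
    word = follows[word].most_common(1)[0][0]
    output.append(word)

print(" ".join(output))  # -> "the well was drilled and the well"
```

Swap the word counts for a neural network and the training sentence for a few hundred gigabytes of text, and you have the rough shape of ChatGPT.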

[Image: a stylized, geometric representation of a bicycle]

ChatGPT’s idea of how a bicycle works.  *This does not count as using AI to write this article.

AI doesn’t know what it doesn’t know

Artificial Intelligence.  Artificial, yes.  Intelligent, no.  LLMs do not know whether they’re right or wrong, but don’t worry, either way they will provide an answer 100% of the time.  They can be convinced to ask follow-up questions to fill in holes in their understanding, but that requires careful prompting and guidance.  They’re much more comfortable making stuff up, so responses require fact-checking if you don’t want to get caught out.  That’s not such a downside if you’re coding in Python; it’s easy to try, fail, and iterate from there.  But you may not be aware of the assumptions that went into an answer unless you challenge the LLM to explicitly state those assumptions.
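One partial mitigation is to bake that challenge into the prompt itself.  Here’s a minimal sketch using the OpenAI Python SDK; the model name, the instructions, and the question are placeholders I made up, not a vetted recipe.

```python
# Minimal sketch: force the model to surface its assumptions before
# answering.  Model name, instructions, and the question are all
# placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {
            "role": "system",
            "content": (
                "Before answering, list every assumption you are making "
                "as bullet points.  If a required input is missing, ask "
                "a follow-up question instead of guessing."
            ),
        },
        {"role": "user", "content": "Estimate EUR for this well: ..."},
    ],
)
print(response.choices[0].message.content)
```

It helps, but it doesn’t cure the underlying problem: the model still has no idea whether it’s right.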

In a moment of frustration with ChatGPT, I instructed it to ‘always provide a % confidence’ that its answers were correct.  Just to double-check, I asked for a % confidence on the prior three answers, and it returned 95%, 98%, and 100%.  Pretty good, right?  Except all three answers were dead wrong AND I HAD TOLD IT SO.

AI is only as good as the training data

LLMs require training data to function, and lots of it.  Typical models are quoted as training on hundreds of gigabytes of data: millions of books, tens of millions of GitHub posts.  This gives them a wide range of knowledge, but exactly what data goes into them is a trade secret.  A training data set must be sufficiently large, relevant, and accurate to be effective, and different models have different strengths and weaknesses depending on that training data.  I’ve found that they’re very good at Python code (there’s tons of public data to train on), but pretty lousy at DAX (which doesn’t get published as much on forums).

[Image: a human walking through a warehouse between stacks of boxes]

Almost enough information to train an LLM

Microsoft has an opportunity to change the game here, and recent Power BI additions like DAX Query and TMDL seem focused on streamlining how LLMs can learn about and help with data modeling and data analytics.  Their private knowledge of your data hosted in Azure may be used to train Copilot to recreate your successes for other Microsoft customers.  I’m sure no drama will ensue, and given Microsoft’s track record, the courts will definitely not be involved.

Local AI’s usefulness is limited, but Public AI might not be secure

Some of our larger clients have already stood up internal, general-purpose LLMs within Teams or on a website, trained on proprietary data.  Those efforts do not appear to have been effective beyond lining the pockets of whoever sold the implementation.  The goal, at least in part, is to provide a secure LLM environment for staff to leverage, one that doesn’t transmit corporate data to the OpenAIs and Groks of the world, but the current offerings don’t appear robust enough to be worth pursuing… yet.

However, I do believe their concerns are valid.  While it’s possible to add your own data sets to fine-tune public models, doing so opens your data to the owners of those models.  And given the piracy lawsuits leveled against Meta, Anthropic, OpenAI, and others, their claims that your input data will remain private and will not be used to train future LLMs feel hollow.  My own experiments with standing up LLMs for local use have yielded some success, but the return on that time investment remains to be seen.  For example, I fed 200+ scopes of work into a SOW-writing LLM, which now drafts perfectly reasonable scopes of work for that tightly defined use case.  But to keep client data secure it runs locally on my machine, which makes response times slow, and it still requires coaching and input to yield usable results.
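For a sense of what “runs locally” looks like in practice, here’s a minimal sketch of querying a locally hosted model through Ollama’s REST API.  To be clear, this is an illustration and not my actual SOW setup; the model name and prompt are placeholders.

```python
# Minimal sketch of querying a locally hosted LLM through Ollama's
# REST API.  Nothing leaves the machine.  The model name and prompt
# are placeholders.
import requests

prompt = "Draft a scope of work for a three-well workover program: ..."

response = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3", "prompt": prompt, "stream": False},
    timeout=300,  # local inference is slow, as noted above
)
print(response.json()["response"])
```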

AI could soon spin straw into gold

While I’ve mostly tempered expectations in this article, I do see value in LLMs both as they currently exist and in potential offerings in the near future.  Stunn.ai has done good work on check stub digitization by converting PDF files to usable data, and ThoughtTrace’s technology for machine reading of land documents was recently acquired by Thomson Reuters.  These wins are on specific use cases, but a locally trainable LLM for unstructured data seems likely to be on the horizon.  That file folder full of LOS statements from virtual data rooms could soon feed a local model that you could ask about LOE trends across specific basins.  That would be a game changer for populating economic models for deal flow, kind of like a search function on steroids.  But the repeatability and effort required for an initiative like that leave something to be desired today.  That said, as you read this, LLMs are the worst they will ever be at what they do.  It’s only up from here.
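To gesture at what that might look like today, here’s a rough sketch under heavy assumptions: pypdf to pull text from the PDFs, a naive keyword match standing in for real retrieval, and the same local Ollama endpoint from the earlier sketch.  The folder name, keyword, model, and question are all hypothetical.

```python
# Rough sketch of the "search function on steroids" idea: pull text
# from a folder of LOS PDFs, keep the pages that mention a cost
# category, and hand them to a local model to summarize.  Folder,
# keyword, model, and question are all hypothetical.
from pathlib import Path

import requests
from pypdf import PdfReader

def relevant_pages(folder: str, keyword: str) -> list[str]:
    """Return the text of every PDF page in `folder` mentioning `keyword`."""
    hits = []
    for pdf in Path(folder).glob("*.pdf"):
        for page in PdfReader(pdf).pages:
            text = page.extract_text() or ""
            if keyword.lower() in text.lower():
                hits.append(f"[{pdf.name}] {text}")
    return hits

context = "\n\n".join(relevant_pages("los_statements", "water hauling"))
question = "Summarize the trend in water hauling costs across these statements."

answer = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3", "prompt": f"{context}\n\n{question}", "stream": False},
    timeout=600,
).json()["response"]
print(answer)
```

A production version would need real document parsing, embeddings-based retrieval, and a lot of validation, which is exactly the repeatability and effort problem.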

Have you or your company spun up any internal LLMs?  Have you found them useful?  Does your company have a security policy that prohibits the use of public-facing AI?  Am I way off base when it comes to the usefulness of this technology?  We’d love to hear about it!  Sound off in the comments.


Michael Hirsch is a reservoir engineer focused on project management, training, and novel technologies for Velocity Insight.  He’d be stoked to automate his job and spend more time mountain biking with his kids.
