Marriage of Analytics, Engineering First Principles Needed in Oil and Gas


The growing amount of data generated in oil and gas operations, particularly shale operations, has prompted interest among some in the industry in Big Data, or data-driven analytics, solutions to enhance efficiency and productivity. Oil and gas companies not only have to contend with the volumes of data coming from shale activity; the variability in shale play production also means that strategies and completion techniques that work on one well may not work on a second well only 200 yards away.

The desire of CEOs to quickly access data and make decisions within the next month, not the next year, is also creating the need for companies to harness Big Data solutions to improve decision-making, said Keith Holdaway, upstream domain expert for the SAS Global Oil & Gas business unit and author of “Harness Oil and Gas Big Data with Analytics: Optimize Exploration and Production with Data-Driven Models”.

“The volumes have always been there, but the variety of structured and unstructured data coming from the growing number of sensors and valves on intelligent wells, and the velocity of that data, are very much differentiators in today’s industry.” When Holdaway was working in the oil and gas industry, geophysicists might have received 3D data sets every six months. Now, it’s every two months.

Companies are becoming more receptive to the data-driven approach, but a reluctance still exists among some about whether the data coming out of these models should be trusted. This is particularly true among workers steeped in the first principles who have been in the industry a number of years.  

“They’re very tactile, visual people,” said Holdaway in an interview with Rigzone.

Holdaway worked in the oil and gas industry for 15 years as a geophysicist. Like many oil and gas professionals, Holdaway was educated in the first principles engineering concept. But he also taught himself computer programming for use in niche projects. After working in the Middle East and briefly in Houston, he moved to North Carolina, where he landed a job as a software developer in SAS’ research and development group.

“A lot of them are put off by the black box situation – which is not really a black box,” said Holdaway, who added that, if he was still in the industry, he also might be reluctant to trust results from data-driven models. But he believes that people can be won over if they are shown the algorithms, data mining and artificial intelligence techniques in action, and allowed to compare them to traditional visualization techniques.

In the past, companies tried to model raw data without exploring it, and often without doing data quality control. Much of the data was real-time, batch, ratty or unstructured data coming from sensors, and companies hadn’t been doing enough to build analytical data warehouses, said Holdaway.

In his book, Holdaway said that the oil and gas industry needs a marriage of the first principles and data-driven models. “If you can take data and apply data mining techniques, you may find patterns that you haven’t thought about,” said Holdaway. Instead of a deterministic model, you may have a probabilistic model. By using a data-driven approach instead of just modelling raw data, companies can estimate the likely success rate of different strategies for reengineering a well. By using data-driven modelling, the industry is now exploring data and turning raw data into actionable knowledge.

“We propose a different approach,” said Holdaway. “We’re not saying don’t make a $100 million decision but at least constrain data-driven models with first principle models.” Using this approach means not only that new questions will be asked, but new correlations and new factors will come into play for analysis.
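Holdaway’s idea of constraining a data-driven model with first principles can be sketched in code. The example below is purely illustrative – the Arps exponential decline curve, the rates and the 50 percent tolerance band are assumptions for the sketch, not anything from SAS – but it shows the basic mechanic: a machine-learned forecast is clipped so it can refine, but never contradict, a physics-based decline model.

```python
import math

def arps_exponential(t, q_i, d):
    """First-principles Arps exponential decline: rate = q_i * e^(-d*t)."""
    return q_i * math.exp(-d * t)

def constrained_forecast(t_values, ml_preds, q_i, d, tolerance=0.5):
    """Clip a data-driven prediction to within +/- tolerance of the
    physics-based curve, so it can refine it but not contradict it."""
    out = []
    for t, pred in zip(t_values, ml_preds):
        physics = arps_exponential(t, q_i, d)
        lo, hi = physics * (1 - tolerance), physics * (1 + tolerance)
        out.append(min(max(pred, lo), hi))
    return out

months = list(range(12))
q_i, d = 1000.0, 0.1  # assumed initial rate and decline constant
# A hypothetical data-driven model that drifts unphysically high late in life:
ml_preds = [arps_exponential(t, q_i, d) + 60.0 * t for t in months]
blended = constrained_forecast(months, ml_preds, q_i, d)
```

The choice of a hard clip is just one way to impose the constraint; a penalty term in the model’s objective would be a softer alternative.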

“The model is living, not stagnant, and as you bring in additional and real-time data, the data feeds through and updates the models,” Holdaway commented. The term “Big Data” is not one to which a lot of people in the industry gravitate, but they understand they have lots of data from different sources. While the technology and processes are available, the willingness of people to adapt data-driven models is a major roadblock, said Holdaway, noting that he’s seen the same topics discussed several years in a row at an intelligent energy conference, but no progress has been made.  

Fiber optics and wireless networks can send data quickly from offshore rigs to an onshore collaborative center, but workers still exhibit a reluctance to break from traditional ways of interacting. The industry is still working in silos, with drilling engineers not talking to reservoir engineers, and efforts to collaborate are merely glorified meet and greets. Many IT staff are “very protective” of data, and some workers are reluctant to introduce new technology or software into their operations.

Some in the industry still prefer to stick with the first principles – noting that they have worked for conventionals for the past century – but Holdaway points out that the techniques used by the industry for conventional hydrocarbons have recovered only 32 percent of the resources available globally.

However, interest in using data mining, neural networks and artificial intelligence in oil and gas has grown as people from Generations X and Y train as data scientists and are open to applying different techniques, said Holdaway. The Society of Petroleum Engineers – of which Holdaway is a member – is a major advocate of data-driven models and has established a group, TED2A, or the Petroleum Data-Driven Analytics Group, to further advance data-driven models in the industry. Some of these people are “way out there” – some even advocating a departure from the first principles and “to heck with what Newton or other great physicists have said”.

Burgeoning demand in emerging economies, rising competition for leasing rights, financial and intellectual capital, growing public scrutiny, and regulatory demands for transparency are a few of the challenges facing the global energy industry, according to an SAS white paper “Analytic Innovations Address New Challenges in the Oil and Gas Industry”.

Analytics can optimize the activities associated with exploration and production, including oilfield production forecasting, predictive asset maintenance, reservoir characterization and analytics for unconventional resource recovery. Supporting these activities through data-driven integrated planning is a proven way to deliver significant efficiency gains, according to SAS.

“All the new and improved measuring and monitoring devices – and all of the new information technology systems companies have invested in over the last 20 years – are capable of providing the data that can create new vistas of business acumen and operational effectiveness,” said SAS in the paper. “With the strategic use of analytics, oil and gas companies can achieve the high degrees of confidence needed to make groundbreaking, profitable decisions.”

Unconventional resources in particular could benefit from analytics. Analytics capabilities work together to match high-tech drilling processes to subterranean landscapes, decreasing uncertainty about what is beneath the surface and shifting the mindset toward a manufacturing model, SAS said.

One example of the application of analytics to unconventional resources is clustering, a data mining technique that groups and analyzes data points sharing similar attributes. Clustering aids well analysis by dividing fields into selected areas and grouping the most similar wells into a first set of clusters. From there, the clusters’ averages are compared to the remaining wells to form a second set; this process is repeated until enough subsets exist to provide meaningful insights about the entire well portfolio, according to the SAS paper.
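The iterative grouping SAS describes is similar in spirit to centroid-based clustering such as k-means. The sketch below is a minimal illustration under assumed inputs – the well attributes (porosity fraction, net pay in feet) and the fixed two-cluster setup are invented for the example – of how wells with similar attributes end up in the same group.

```python
def kmeans(points, k, iters=20):
    """Minimal k-means: assign each point to its nearest centroid,
    then recompute each centroid as its cluster's average, and repeat."""
    centroids = [list(p) for p in points[:k]]  # naive initialization
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids

# Invented wells described by (porosity fraction, net pay in feet).
wells = [(0.05, 10), (0.06, 12), (0.04, 9),    # tight, thin wells
         (0.20, 50), (0.22, 55), (0.21, 52)]   # better-quality wells
labels, centroids = kmeans(wells, k=2)
```

In practice the attributes would be standardized first, so that a feature measured in feet doesn’t dominate one measured as a fraction.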

Clustering is particularly useful in the selection and use of proppants for hydraulic fracturing.

“The value of analytics to unconventional resource recovery may become the proverbial silver bullet, as there are so many opportunities to apply analytics to workflows, data and conventional processes,” said SAS.


Gaining the trust of people within its business units was the biggest challenge faced by Devon Energy’s data science team when it sought to establish its credibility within the company, said Beau Rollins, data scientist with Devon, at the SAS Day in Houston earlier this month on Big Data.

Several years ago, Devon realized that it was data-rich but knowledge-poor: the company had massive amounts of data that weren’t being used in a standard approach to decision-making. The relationships of permeability and porosity in conventional reservoirs are well understood and based on the first principles of physics and mathematics, but applying principles used for conventional reservoirs to unconventional reservoirs has met with varying degrees of success. Sometimes the rock behaves as it should according to the principles; other times, it doesn’t, said Rollins.

To change this approach, the company’s five-member data management and analytics group was established. This activity, which started two to three years ago, was initially focused on unconventional oil and gas activity, but is broadening to other areas of Devon’s operations, Rollins told Rigzone.

Rollins outlined for event attendees the process that Devon’s data science team uses and the lessons the team has learned in using data-driven analysis models. These models, which include machine-learning models and neural networks, don’t rely on knowledge of the physics-based relationships between the inputs.

Given the wide swath of oil and gas projects within the company, there was no way for everyone on the five-person team to be an expert in all these arenas; hence the need to collaborate with subject matter experts within a business unit. Being able to communicate how these models worked – and to make people within Devon realize that their work wasn’t being replaced by these models – was critical.

“If all you do is learn the stats, you can build an accurate, sophisticated model, but if you can’t tell people how to use it, they won’t trust you.”

“We want to leverage these technologies to serve professionals, not to be their masters,” Rollins said. Data-driven algorithms can identify relationships in data that the team couldn’t find on its own, but just because a relationship exists doesn’t mean that it’s meaningful.

“Even though a model gave us a list of variables, the subject matter experts we consulted still had to buy into the data and see what makes sense and what doesn’t.”

In the team’s experience, one of the hardest things in the process is nailing down what a team wants to predict.

“Everyone wants to predict everything, and it’s not possible,” said Rollins. “Maintaining the original scope of a project can be difficult once everyone can see what can be done.”

The process is more sophisticated than just loading variables into an equation – the user also must understand the algorithm’s limitations. One of these limitations is that it can’t tell which variable to pick when collinearity is present.

“In a modelling scenario, you have two models – one that you’re trying to understand, the other that is supposed to explain what is causing the variance seen in the target model data,” Rollins explained. “If some variables are highly correlated with one another – such as one explanatory variable that is strongly associated with a second explanatory variable – that can lead to collinearity, which confuses models in their selection of variables.”
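A simple pre-modelling check can surface the problem Rollins describes. The sketch below – with made-up completion variables, not Devon’s data – flags pairs of candidate predictors whose Pearson correlation exceeds a threshold: the near-duplicate inputs that confuse a model’s variable selection.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def flag_collinear(features, threshold=0.9):
    """Return pairs of feature names whose |correlation| >= threshold."""
    names = list(features)
    flags = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = pearson(features[names[i]], features[names[j]])
            if abs(r) >= threshold:
                flags.append((names[i], names[j], round(r, 3)))
    return flags

# Invented completion data: proppant mass tracks fluid volume almost exactly.
features = {
    "proppant_mass": [100, 200, 300, 400, 500],
    "fluid_volume":  [210, 395, 610, 805, 1000],
    "stage_count":   [10, 30, 20, 50, 15],
}
flags = flag_collinear(features)
```

When a pair is flagged, a common remedy is to keep only one of the two variables, or to combine them into a single derived feature, before running variable selection.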

Rollins likened testing different models to gladiators battling in an arena. Each gladiator – a different type of predictive model – has its own strengths and advantages, and the models compete with one another to see which is the best predictor. Once the champion model is selected, a scoring set is used to validate it.
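The gladiator analogy is essentially champion-challenger selection on held-out data. The toy sketch below – two invented candidate models and synthetic data, not Devon’s actual workflow – trains each candidate on the same training set and crowns the one with the lowest validation error.

```python
def fit_mean(train):
    """Baseline gladiator: always predict the training mean."""
    m = sum(y for _, y in train) / len(train)
    return lambda x: m

def fit_linear(train):
    """Second gladiator: ordinary least-squares line in one variable."""
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    sxx = sum((x - mx) ** 2 for x, _ in train)
    sxy = sum((x - mx) * (y - my) for x, y in train)
    b = sxy / sxx
    a = my - b * mx
    return lambda x: a + b * x

def mse(model, data):
    """Mean squared error of a model over a data set."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

# Each candidate is trained on the same data; the champion is
# chosen on a held-out validation set it never saw during training.
train = [(1, 3.1), (2, 5.9), (3, 9.2), (4, 11.8)]
valid = [(5, 15.1), (6, 17.9)]
candidates = {"mean": fit_mean(train), "linear": fit_linear(train)}
champion = min(candidates, key=lambda name: mse(candidates[name], valid))
```

As the article notes, a final scoring set – separate from the validation set used to pick the champion – would then confirm the winner generalizes.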

Using the most accurate model doesn’t mean that every well will be good, but the model serves as a tool to minimize the ratio of bad wells to good ones. Even with software, one must understand the assumptions and how to build and validate models. If not, the result can be a seemingly great model that is very, very risky, which can destroy credibility, said Rollins.

“We have to manage our expectations on the analytics side,” said Rollins. “We can’t just say that because we made a model, we’ll make hundreds of millions of dollars based on a model.”

The algorithms themselves are very sensitive to the numeric values in data. In its experience using data-driven analytics, Devon encountered calibration issues and log responses that it wouldn’t have caught by eye. Because the algorithm treats numbers strictly as numbers, it allowed the team to find bad data.
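The kind of bad-data catch Rollins mentions can come from a very simple numeric screen. The sketch below is an assumed illustration – the porosity values are invented, and -999.25 is used as a null placeholder of the sort commonly left in well-log files – flagging samples that fall outside physical bounds or far from the rest of the distribution.

```python
def screen_log(values, lo, hi, z_thresh=3.0):
    """Flag samples outside physical bounds, then any remaining
    statistical outliers (> z_thresh standard deviations from the mean)."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    bad = []
    for i, v in enumerate(values):
        if not (lo <= v <= hi):
            bad.append((i, v, "out of physical range"))
        elif std > 0 and abs(v - mean) / std > z_thresh:
            bad.append((i, v, "statistical outlier"))
    return bad

# Invented porosity log with a null placeholder left in the data.
porosity = [0.12, 0.14, 0.13, -999.25, 0.15, 0.11]
bad = screen_log(porosity, lo=0.0, hi=0.4)
```

A real workflow would recompute the statistics after dropping the flagged nulls, since a single placeholder value badly skews the mean and standard deviation.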

For a project involving a relatively new play, the Devon team had a limited amount of production data. The team agreed to build a model – although it wouldn’t be able to validate it – and suggested an integrated data architecture that would keep the data maintained and ready for study at all times. Typically, an oil and gas industry employee might work at their regular job, spend three months collecting data and doing research for a project, then spend another year working at their normal job again. As a result, it can be easy to lose the discipline of keeping data clean.

“Integrated data solutions allow us to always have a data set that’s always ready for analysis,” said Rollins, adding that Devon has a sister group in the data science department that handles data management issues.

This allows for workflows that will capture production stream data, hydraulic fracturing designs and regional geoscience data, and that can be integrated and restructured in a variety of ways with core visualization tools.

“Visualizing data and having one spot for all the disciplines’ technical data reduces the time to insight in the field and captures knowledge that would otherwise be lost as people rotate out and retire,” said Rollins. “This integration allows a running record to be kept of the relationships between the variables and which properties are important.”

In the past, the team never said no to a project because it was trying to build its reputation. Now, the team is getting to the point where it has to make choices and prioritize. Along the way, it has learned in which areas its services fit, and where it is more readily accepted.

“We’re starting to find our place after a lot of learning and trial and error,” said Rollins.


