GSH Gets Down to Business: CEGAL: Machine Learning Assisted Lithofacies Workflows: ... - Sep 21st

Complete Title:  Machine learning assisted lithofacies workflows: using model performance metrics and Shapley values to guide geoscientists during model training.


Online Presentation - you must register to receive access information

Presented By: 
Thomas Bartholomew Grant, Cegal AS
Co-Author: Hilde Tveit Håland, Cegal AS

Identification of subsurface lithologies from well logs (geochemical, petrophysical or geophysical) is critical during exploration for hydrocarbons and geothermal reservoirs, and during mineral/metal prospecting. Traditionally, subsurface lithologies are characterized through time-consuming manual processing of log data that requires substantial geoscience knowledge and experience. Machine learning is gaining acceptance as a viable way of rapidly producing accurate lithofacies interpretations from large amounts of data. However, there is often a knowledge gap between experienced geoscientists and data scientists, as well as a lack of transparency in machine learning predictions, and both inhibit more widespread use of these techniques. This work focuses on two machine learning assisted workflows that geoscientists can use to generate lithofacies logs for hydrocarbon exploration. The machine learning algorithms speed up the generation of accurate lithofacies logs, and a set of metrics is explored for guiding a geoscientist (rather than a data scientist) during model training. The aim is to develop machine learning assisted workflows that are reliable, understandable and easily deployable using open source frameworks, so that a domain expert without expertise in machine learning or data science can easily run, re-run and train a model for a specific problem or area.

Workflow
The first workflow uses K-means (KM) clustering to derive a new lithofacies log from a set of well logs. In areas with few wells or with limited, missing or poor-quality log data, the workflow can use the well with the best-quality or most complete data set to build a “reference” lithofacies log. The second workflow applies an existing lithofacies log (manually created, or machine learning derived via workflow 1) to newly drilled wells so that the new data can be rapidly integrated into existing geomodels, or to wells with poor or missing data, using the logs they have in common with the well that holds the reference lithofacies log. We used Random Forest (RF), Support Vector Machine (SVM) or Multi-Layer Perceptron (MLP) classifiers to apply an existing lithofacies log to other wells. All three algorithms have been used effectively in several case studies (e.g. Bhattacharya et al. 2016, Merembayev et al. 2021).
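
A minimal sketch of the clustering step in the first workflow, assuming two input logs (gamma ray and bulk density) and made-up sample values. The abstract does not say which library or preprocessing was used. The pure-Python Lloyd's iteration, the z-score scaling, and the names `standardize` and `kmeans` are illustrative assumptions; in practice an open source implementation such as scikit-learn's `KMeans` would be the likelier choice.

```python
import random
import statistics


def standardize(rows):
    """Z-score each log (column) so logs with large units do not dominate."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def kmeans(samples, k, n_iter=100, seed=0):
    """Lloyd's algorithm: returns one cluster code per sample plus centroids."""
    rng = random.Random(seed)
    centroids = [list(s) for s in rng.sample(samples, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each depth sample joins its nearest centroid.
        new_labels = [
            min(range(k),
                key=lambda c: sum((v - m) ** 2 for v, m in zip(s, centroids[c])))
            for s in samples
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the clustering converged
        labels = new_labels
        # Update step: move each centroid to the mean of its member samples.
        for c in range(k):
            members = [s for s, lab in zip(samples, labels) if lab == c]
            if members:
                centroids[c] = [statistics.fmean(col) for col in zip(*members)]
    return labels, centroids


# Hypothetical depth samples: (gamma ray in API units, bulk density in g/cm3).
logs = [
    (32.0, 2.21), (35.0, 2.25), (38.0, 2.23), (30.0, 2.28),
    (110.0, 2.55), (118.0, 2.58), (105.0, 2.52), (121.0, 2.60),
]

facies, centroids = kmeans(standardize(logs), k=2)
print(facies)  # one integer cluster code per depth sample
```

The cluster codes carry no geological meaning by themselves; a geoscientist would still need to inspect each cluster against the input logs before naming it as a lithofacies.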

Classification model outputs include the predicted lithofacies class as well as the probability of each class for each sample. Plots of the input log traces, the predicted lithofacies log and the probability logs can be used intuitively to determine which model provides the most geologically reasonable predictions.
 
The train and test accuracy scores, log loss, and confusion matrix are used as metrics to evaluate model performance and to compare different models (RF, SVM or MLP). The confusion matrix was found to be particularly useful for showing which lithofacies were or were not discernible. For example, it may be beneficial to use a model that is effective at distinguishing between one or more reservoir lithofacies but less effective for non-reservoir lithofacies. SHapley Additive exPlanations (SHAP; Lundberg and Lee 2017) are used to determine the contributions of different features (logs) to the prediction, and the selection of input logs is changed accordingly. As the workflows are fast to implement, a geoscientist can quickly iterate through the workflow(s) using different input logs or depth intervals to train the models and see the impact on model performance and outcome.
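
The three performance metrics named above can be sketched in a few lines. This is an illustration only: the class names, true labels, and class probabilities below are invented, and the helper names are hypothetical. Libraries such as scikit-learn provide equivalent functions, and the abstract does not state which implementation was used.

```python
import math


def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def log_loss(y_true, proba, classes, eps=1e-15):
    """Mean negative log of the probability assigned to the true class."""
    index = {c: i for i, c in enumerate(classes)}
    total = 0.0
    for t, row in zip(y_true, proba):
        p = min(max(row[index[t]], eps), 1 - eps)  # clip to avoid log(0)
        total -= math.log(p)
    return total / len(y_true)


# Hypothetical lithofacies classes and per-sample class probabilities.
classes = ["sandstone", "siltstone", "shale"]
y_true = ["sandstone", "sandstone", "siltstone", "siltstone", "shale", "shale"]
proba = [
    [0.80, 0.15, 0.05],
    [0.60, 0.30, 0.10],
    [0.20, 0.50, 0.30],
    [0.10, 0.40, 0.50],
    [0.05, 0.25, 0.70],
    [0.10, 0.50, 0.40],
]

# The predicted class is the one with the highest probability.
y_pred = [classes[max(range(len(row)), key=row.__getitem__)] for row in proba]

matrix = confusion_matrix(y_true, y_pred, classes)
acc = accuracy(y_true, y_pred)
loss = log_loss(y_true, proba, classes)

for name, row in zip(classes, matrix):
    print(f"{name:>10}: {row}")
print(f"accuracy={acc:.3f}  log_loss={loss:.3f}")
```

In this made-up case the accuracy alone looks mediocre, while the confusion matrix shows that every sandstone sample is classified correctly and the errors are confined to siltstone versus shale, which is the kind of distinction the confusion matrix is used for in the text.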

Results
In case studies from the North Sea and Norwegian Sea, all three classification algorithms tended to have comparable accuracies and log losses, but may differ slightly in which input logs drive each model (from SHAP values) and which lithofacies are distinguishable (confusion matrix). The work highlights the need for transparent machine learning solutions for displaying and evaluating model predictions in ways that are intuitive and understandable for domain experts, in this case geoscientists. Prediction probabilities are particularly useful in the geosciences and as input to other workflows such as geomodelling and reserve volume estimation.
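
The comparison of which input logs drive a model rests on Shapley values. The sketch below computes them exactly by enumerating every subset of input logs, which is only practical for a handful of features; the SHAP library estimates the same quantities with more efficient methods. The scoring function, the log names, and all numbers are hypothetical stand-ins for a trained classifier and real well data.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, background):
    """Exact Shapley values of model(x) relative to a background data set."""
    n = len(x)

    def value(subset):
        # Mean prediction when the logs in `subset` take the sample's values
        # and every other log is taken from each background row in turn.
        total = 0.0
        for row in background:
            mixed = [x[j] if j in subset else row[j] for j in range(n)]
            total += model(mixed)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = set(subset)
                phi[i] += weight * (value(without_i | {i}) - value(without_i))
    return phi


def toy_sand_score(logs):
    """Hypothetical score; it deliberately ignores the third log (NPHI)."""
    gr, rhob, _nphi = logs
    return 1.0 - 0.006 * gr - 0.5 * (rhob - 2.0)


feature_names = ["GR", "RHOB", "NPHI"]
background = [
    [30.0, 2.20, 0.25],
    [60.0, 2.35, 0.20],
    [100.0, 2.50, 0.30],
    [120.0, 2.60, 0.35],
]
sample = [40.0, 2.25, 0.22]

phi = shapley_values(toy_sand_score, sample, background)
for name, contribution in zip(feature_names, phi):
    print(f"{name}: {contribution:+.4f}")
```

A log whose contribution stays at zero across samples, like NPHI in this toy model, is a candidate for removal from the input set. The contributions also sum to the difference between the sample's prediction and the mean background prediction, which gives a quick sanity check.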

Conclusions
This work highlights the need for transparent machine learning solutions that emulate traditional workflows, enabling experts to think like geoscientists while gaining the speed and accuracy of machine learning. We found that the workflows were able to significantly speed up the development of lithofacies logs and reduce time to first oil.

References
Bhattacharya, S., Carr, T. R., & Pal, M. (2016). Comparison of supervised and unsupervised approaches for mudstone lithofacies classification: Case studies from the Bakken and Mahantango-Marcellus Shale, USA. Journal of Natural Gas Science and Engineering, 33, 1119-1133.

Lundberg, S. M., & Lee, S. I. (2017). A Unified Approach to Interpreting Model Predictions. Advances in Neural Information Processing Systems, 30, 4765-4774.

Merembayev, T., Kurmangaliyev, D., Bekbauov, B., & Amanbek, Y. (2021). A Comparison of Machine Learning Algorithms in Predicting Lithofacies: Case Studies from Norway and Kazakhstan. Energies, 14(7), 1896.

Presenter Biography: Thomas Grant, Cegal AS
Thomas Grant is a Senior Data & Digitalization Consultant at the Stavanger, Norway office of Cegal AS, a leading provider of cloud, consulting, and software products for the Oil & Gas industry. He holds a Ph.D. (2014) and a Master's (2010) in geology and geochemistry from the Free University of Berlin and the University of Bristol, respectively, after which he worked as a postdoctoral research fellow at the Norwegian University of Science and Technology (NTNU). For the last four years, he has been working as a consultant on digitalization and data science projects spanning the oil & gas, mineral prospecting, finance, and utility industries. He specializes in Python programming, scientific computing, and machine learning solutions to facilitate data-driven decisions. Thomas has wide interests in science, openness in AI, and new cloud technologies, and has continuing ties to research projects at NTNU and the Geological Survey of Norway.

**This presentation will be in a fully commercial format, to deliver information on the presenting company’s service or product.  The GSH is not responsible for the content presented, nor does it endorse, warrant, or otherwise validate the material/service or product presented.

When
9/21/2021 12:00 PM - 1:00 PM
Central Daylight Time
