11 minutes to petrelpy

This is a short, tutorial-style introduction to petrelpy. It uses Petrel-style import and export files and some simple dataframe manipulation to show how the library extracts data from Petrel export formats.

Installing as a library

From your favorite terminal (and probably virtual environment), run

pip install git+https://github.com/frank1010111/petrelpy.git

Loading well connection data

Two useful Petrel exports are gslib files (full 3D property grids) and well connection files. Let’s start with a well connection file. Funnily enough, thanks to how Eclipse feels (or doesn’t feel) about unique well identifiers, you first need to extract the API number and well name from the Petrel well worksheet, and the measured depth of the heel while you’re at it.

import pandas as pd
from petrelpy.wellconnection import (
    process_well_connection_file,
    get_trajectory_geomodel_columns,
    COL_NAMES_TRAJECTORY,
)

# wcf_file = "https://raw.githubusercontent.com/frank1010111/petrelpy/master/tests/data/test_wcf.wcf"
# heel_file = "https://raw.githubusercontent.com/frank1010111/petrelpy/master/tests/data/test_heels.csv"
wcf_file = "../../tests/data/test_wcf.wcf"
heel_file = "../../tests/data/test_heels.csv"

well_heels = pd.read_csv(heel_file, index_col=0)
well_heels
              UWI          Name  Depth_heel
0  42056000000200  ALPHA UNIT 2        7000
1  42113000001201       BRAVO 1        6600
2  42532000004000     CHARLIE 2        8000
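
If you don’t have the CSV handy, a frame with the same columns can be built by hand. This is only a sketch: the values are copied from the table above, the name well_heels_manual is made up for illustration, and whether petrelpy prefers integer or string UWIs is not confirmed here.

```python
import pandas as pd

# Values copied from the heel table above; in practice they come from
# the Petrel well worksheet. `well_heels_manual` is an illustrative name.
well_heels_manual = pd.DataFrame(
    {
        "UWI": [42056000000200, 42113000001201, 42532000004000],
        "Name": ["ALPHA UNIT 2", "BRAVO 1", "CHARLIE 2"],
        "Depth_heel": [7000, 6600, 8000],
    }
)

# Sanity check: every UWI in this example is a 14-digit identifier
uwi_lengths = well_heels_manual["UWI"].astype(str).str.len()
print(well_heels_manual)
print(uwi_lengths.tolist())
```

Checking the UWI length up front is a cheap way to catch identifiers that were truncated or reformatted by a spreadsheet on the way out of Petrel.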

Now that we have that, let’s get to those well connection files.

geomodel_cols = get_trajectory_geomodel_columns(wcf_file)
all_cols = COL_NAMES_TRAJECTORY + geomodel_cols
well_properties = (
    process_well_connection_file(wcf_file, well_heels, col_names=all_cols)
    .dropna(subset="GRID_I")
    .drop(columns=["MD_ENTRY", "GRID_I", "GRID_J", "GRID_K"])
)
well_properties
                Bulk volume  Facies  Fraction    HCPV oil  Porosity - total  Water saturation  Zones (hierarchy)
42056000000200   27216432.0     2.0   0.00000      0.0000          0.051647          0.000000               15.0
42113000001201   19656406.0     5.0   0.00071  17384.1934          0.058731          0.073408               15.0
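
The dropna and drop steps in that chain can be tried on a toy frame. This is a sketch with made-up values and made-up well identifiers; the assumption, not confirmed here, is that a well with no matching grid cell carries a missing GRID_I.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the frame returned by process_well_connection_file
toy = pd.DataFrame(
    {
        "MD_ENTRY": [7100.0, 6700.0, 8100.0],
        "GRID_I": [12.0, 30.0, np.nan],  # NaN: assumed to mean "no grid cell found"
        "GRID_J": [4.0, 9.0, np.nan],
        "GRID_K": [2.0, 2.0, np.nan],
        "Porosity - total": [0.05, 0.06, np.nan],
    },
    index=[101, 102, 103],  # made-up well identifiers
)

# Keep only rows with a grid cell, then drop the bookkeeping columns
cleaned = toy.dropna(subset=["GRID_I"]).drop(
    columns=["MD_ENTRY", "GRID_I", "GRID_J", "GRID_K"]
)
print(cleaned)
```

Only the geomodel property columns survive, indexed by well, which is the shape the table above has.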

Processing well data

And now you have a pandas dataframe with all the power therein. You could, for example, get the hydrocarbon-filled porosity (HCFP) for the wells: total porosity times one minus water saturation. And then get average total porosity and HCFP for each dominant facies.

(
    well_properties.assign(
        **{
            "hydrocarbon-filled porosity": lambda props: props["Porosity - total"]
            * (1 - props["Water saturation"])
        }
    )
    .groupby("Facies")[["Porosity - total", "hydrocarbon-filled porosity"]]
    .mean()
)
        Porosity - total  hydrocarbon-filled porosity
Facies
2.0             0.051647                     0.051647
5.0             0.058731                     0.054420
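
To check those numbers by hand, the same arithmetic can be run on the two wells’ values typed in directly. This is a sketch: the figures are copied from the well_properties output above, and the names props and hcfp are made up for illustration.

```python
import pandas as pd

# Porosity and water saturation copied from the well_properties table above
props = pd.DataFrame(
    {
        "Facies": [2.0, 5.0],
        "Porosity - total": [0.051647, 0.058731],
        "Water saturation": [0.000000, 0.073408],
    },
    index=[42056000000200, 42113000001201],
)

# Hydrocarbon-filled porosity = total porosity * (1 - water saturation)
hcfp = props["Porosity - total"] * (1 - props["Water saturation"])
print(hcfp.round(6))
```

Because each facies has a single well in this test data, the per-facies mean is just that well’s value. The first well has zero water saturation, so its hydrocarbon-filled porosity equals its total porosity.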

Okay, that was more like 5 minutes. I guess that means we should do something else!

Loading gslib data

GSLIB is a tabular format for holding geomodel property extracts. Since geomodels tend to have millions or billions of cells, the extracts are initially loaded into dask dataframes.

from petrelpy.gslib import load_from_petrel

gslib_file = "../../tests/data/test_geomodel.gslib"
geomodel_properties = load_from_petrel(gslib_file)
geomodel_properties
Dask DataFrame Structure:
               i_index j_index k_index  x_coord  y_coord  z_coord Porosity
npartitions=60
                 int64   int64   int64  float64  float64  float64  float64
                   ...     ...     ...      ...      ...      ...      ...
...                ...     ...     ...      ...      ...      ...      ...
                   ...     ...     ...      ...      ...      ...      ...
                   ...     ...     ...      ...      ...      ...      ...
Dask Name: repartition, 2 expressions

The API for dask is pretty close to pandas, except it’s lazy: nothing is evaluated until you call the compute method at the end, which turns the result into a pandas object. Let’s calculate the average porosity for each j index.

geomodel_properties.groupby("j_index")["Porosity"].mean().compute()
j_index
2    0.058748
3    0.058637
1    0.059007
4    0.055443
Name: Porosity, dtype: float64