{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "86274938-6380-4dce-9c1f-6b3c596e0d03",
   "metadata": {
    "tags": []
   },
   "source": [
    "# 11 minutes to petrelpy\n",
    "\n",
    "This is a short, tutorial-style introduction to `petrelpy`, using Petrel-esque\n",
    "import and export files and some simple dataframe manipulation to show the uses\n",
    "of the library for extracting data from Petrel export formats.\n",
    "\n",
    "## Installing as a library\n",
    "\n",
    "From your favorite terminal (and probably \n",
    "[virtual environment](https://docs.python.org/3/library/venv.html)), run\n",
    "\n",
    "```bash\n",
    "pip install git+https://github.com/frank1010111/petrelpy.git\n",
    "```\n",
    "\n",
    "## Loading well connection data\n",
    "\n",
    "A few useful Petrel exports are gslib (Full 3D property grids) and well\n",
    "connection files. Let's start with a well connection file. Funnily enough,\n",
    "thanks to how Eclipse feels (or doesn't feel) about unique well identifiers, you\n",
    "first need to extract the API number and well name from the Petrel well\n",
    "worksheet - and the measured depth of the heel, while you're at it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "9ca26cd4-9a13-4ca5-a205-18224b90e8c4",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>UWI</th>\n",
       "      <th>Name</th>\n",
       "      <th>Depth_heel</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>42056000000200</td>\n",
       "      <td>ALPHA UNIT 2</td>\n",
       "      <td>7000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>42113000001201</td>\n",
       "      <td>BRAVO 1</td>\n",
       "      <td>6600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>42532000004000</td>\n",
       "      <td>CHARLIE 2</td>\n",
       "      <td>8000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "              UWI          Name  Depth_heel\n",
       "0  42056000000200  ALPHA UNIT 2        7000\n",
       "1  42113000001201       BRAVO 1        6600\n",
       "2  42532000004000     CHARLIE 2        8000"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "from petrelpy.wellconnection import (\n",
    "    process_well_connection_file,\n",
    "    get_trajectory_geomodel_columns,\n",
    "    COL_NAMES_TRAJECTORY,\n",
    ")\n",
    "\n",
    "# wcf_file = \"https://raw.githubusercontent.com/frank1010111/petrelpy/master/tests/data/test_wcf.wcf\"\n",
    "# heel_file = \"https://raw.githubusercontent.com/frank1010111/petrelpy/master/tests/data/test_heels.csv\"\n",
    "wcf_file = \"../../tests/data/test_wcf.wcf\"\n",
    "heel_file = \"../../tests/data/test_heels.csv\"\n",
    "\n",
    "well_heels = pd.read_csv(heel_file, index_col=0)\n",
    "well_heels"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b516feb4-aae5-4635-9c0f-da8d0bf5ab58",
   "metadata": {},
   "source": [
    "Now that we have that, let's get to those well connection files"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "dcca5a21-9406-4cfb-ac1e-b42755c126fb",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Bulk volume</th>\n",
       "      <th>Facies</th>\n",
       "      <th>Fraction</th>\n",
       "      <th>HCPV oil</th>\n",
       "      <th>Porosity - total</th>\n",
       "      <th>Water saturation</th>\n",
       "      <th>Zones (hierarchy)</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>42056000000200</th>\n",
       "      <td>27216432.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>0.00000</td>\n",
       "      <td>0.0000</td>\n",
       "      <td>0.051647</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>15.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>42113000001201</th>\n",
       "      <td>19656406.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>0.00071</td>\n",
       "      <td>17384.1934</td>\n",
       "      <td>0.058731</td>\n",
       "      <td>0.073408</td>\n",
       "      <td>15.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                Bulk volume  Facies  Fraction    HCPV oil  Porosity - total  \\\n",
       "42056000000200   27216432.0     2.0   0.00000      0.0000          0.051647   \n",
       "42113000001201   19656406.0     5.0   0.00071  17384.1934          0.058731   \n",
       "\n",
       "                Water saturation  Zones (hierarchy)  \n",
       "42056000000200          0.000000               15.0  \n",
       "42113000001201          0.073408               15.0  "
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "geomodel_cols = get_trajectory_geomodel_columns(wcf_file)\n",
    "all_cols = COL_NAMES_TRAJECTORY + geomodel_cols\n",
    "well_properties = (\n",
    "    process_well_connection_file(wcf_file, well_heels, col_names=all_cols)\n",
    "    .dropna(subset=\"GRID_I\")\n",
    "    .drop(columns=[\"MD_ENTRY\", \"GRID_I\", \"GRID_J\", \"GRID_K\"])\n",
    ")\n",
    "well_properties"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "14b441d9-9660-4d2c-865a-8efe8ef2d1c8",
   "metadata": {},
   "source": [
    "## Processing well data\n",
    "\n",
    "And now you have a pandas dataframe with all the power therein. You could, umm,\n",
    "get the hydrocarbon-filled porosity for the wells. And then get average total\n",
    "porosity and HCFP for each dominant facies."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "87cd6622-8b25-4a8e-9e00-80222c9552ff",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Porosity - total</th>\n",
       "      <th>hydrocarbon-filled porosity</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Facies</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2.0</th>\n",
       "      <td>0.051647</td>\n",
       "      <td>0.051647</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5.0</th>\n",
       "      <td>0.058731</td>\n",
       "      <td>0.054420</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        Porosity - total  hydrocarbon-filled porosity\n",
       "Facies                                               \n",
       "2.0             0.051647                     0.051647\n",
       "5.0             0.058731                     0.054420"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "(\n",
    "    well_properties.assign(\n",
    "        **{\n",
    "            \"hydrocarbon-filled porosity\": lambda props: props[\"Porosity - total\"]\n",
    "            * (1 - props[\"Water saturation\"])\n",
    "        }\n",
    "    )\n",
    "    .groupby(\"Facies\")[[\"Porosity - total\", \"hydrocarbon-filled porosity\"]]\n",
    "    .mean()\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9630adc0-0343-44cd-b47b-cabfd7d60b1f",
   "metadata": {
    "tags": []
   },
   "source": [
    "Okay, that was more like 5 minutes. I guess that means we should do something\n",
    "else!\n",
    "\n",
    "## Loading gslib data\n",
    "\n",
    "GSLIB is a tabular format for holding geomodel property extracts. Since these\n",
    "tend to be millions or billions of cells, the extracts are loaded into\n",
    "[dask dataframes](https://docs.dask.org/en/stable/dataframe.html) initially."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "b745d7c3-f9f6-4812-bfba-142bd3863fb5",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div><strong>Dask DataFrame Structure:</strong></div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>i_index</th>\n",
       "      <th>j_index</th>\n",
       "      <th>k_index</th>\n",
       "      <th>x_coord</th>\n",
       "      <th>y_coord</th>\n",
       "      <th>z_coord</th>\n",
       "      <th>Porosity</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>npartitions=60</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <td>int64</td>\n",
       "      <td>int64</td>\n",
       "      <td>int64</td>\n",
       "      <td>float64</td>\n",
       "      <td>float64</td>\n",
       "      <td>float64</td>\n",
       "      <td>float64</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<div>Dask Name: repartition, 2 expressions</div>"
      ],
      "text/plain": [
       "Dask DataFrame Structure:\n",
       "               i_index j_index k_index  x_coord  y_coord  z_coord Porosity\n",
       "npartitions=60                                                            \n",
       "                 int64   int64   int64  float64  float64  float64  float64\n",
       "                   ...     ...     ...      ...      ...      ...      ...\n",
       "...                ...     ...     ...      ...      ...      ...      ...\n",
       "                   ...     ...     ...      ...      ...      ...      ...\n",
       "                   ...     ...     ...      ...      ...      ...      ...\n",
       "Dask Name: repartition, 2 expressions\n",
       "Expr=Repartition(frame=ReadCSV(52327f0), new_partitions=60)"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from petrelpy.gslib import load_from_petrel\n",
    "\n",
    "gslib_file = \"../../tests/data/test_geomodel.gslib\"\n",
    "geomodel_properties = load_from_petrel(gslib_file)\n",
    "geomodel_properties"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ceab68bd-730e-4787-a157-687c6a13e4d8",
   "metadata": {},
   "source": [
    "The API for dask is pretty close to pandas, except it's lazy, and at the end you\n",
    "call the `compute` method to make it into a pandas dataframe. Let's calculate\n",
    "the average porosity for each $j$ index."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "7ca98af7-7467-4e5d-906c-56b9fbd00138",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "j_index\n",
       "2    0.058748\n",
       "3    0.058637\n",
       "1    0.059007\n",
       "4    0.055443\n",
       "Name: Porosity, dtype: float64"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "geomodel_properties.groupby(\"j_index\")[\"Porosity\"].mean().compute()"
   ]
  }
 ],
 "metadata": {
  "jupytext": {
   "formats": "ipynb,md:myst"
  },
  "kernelspec": {
   "display_name": "petrelpy",
   "language": "python",
   "name": "petrelpy"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}