variography¶
Variography¶
%matplotlib inline
# loading packages
import pandas as pd
import numpy as np
geolime package can be imported as such:
import geolime as glm
Data¶
A widely distributed dataset is available for testing. Here we load some data and display the first five rows.
walker_lake_path = "../../tests/data/walker_dataset/walker.csv"
data = pd.read_csv(walker_lake_path, sep= ' ')
data.head()
id | x | y | u | v | t | |
---|---|---|---|---|---|---|
0 | 1 | 11 | 8 | 0.0 | 1.000000e+31 | 2 |
1 | 2 | 8 | 30 | 0.0 | 1.000000e+31 | 2 |
2 | 3 | 9 | 48 | 224.4 | 1.000000e+31 | 2 |
3 | 4 | 8 | 69 | 434.4 | 1.000000e+31 | 2 |
4 | 5 | 9 | 90 | 412.1 | 1.000000e+31 | 2 |
Experimental semivariogram¶
Prepare arrays to be used as input for experimental semivariogram computation.
# extract coordinates and values
coords_df = data[['x', 'y']]
coords = coords_df.to_numpy()
values_df = data[['u']]
values = values_df.to_numpy()
Define input lags and compute the experimental semivariogram.
# define input lags
lags, tol = glm.generate_lags(lag=13.34, plag=50, nlags=10)
# compute variogram
vario_exp_az_north = glm.variogram(coords=coords, values=values, lags=lags,
tol=tol, geographic_azimuth=0, dip=0, pitch=0,
atol=45, slice_width=np.nan, slice_height=np.nan,
method="semivariogram")
# compute variogram
vario_exp_az_east = glm.variogram(coords=coords, values=values, lags=lags,
tol=tol, geographic_azimuth=90, dip=0, pitch=0,
atol=45, slice_width=np.nan, slice_height=np.nan,
method="semivariogram")
The resulting dataframe contains the following columns
vario_exp_az_north.head()
rank | npairs | lag | avgdist | vario | indices | |
---|---|---|---|---|---|---|
0 | 0 | 9 | 0.00 | 3.54258 | 38432.551667 | [[37, 354], [54, 371], [63, 375], [234, 420], ... |
1 | 1 | 1247 | 13.34 | 13.4394 | 56585.788464 | [[1, 2], [2, 202], [3, 200], [3, 401], [4, 200... |
2 | 2 | 2319 | 26.68 | 26.7378 | 71857.304101 | [[0, 1], [0, 16], [1, 17], [1, 202], [1, 400],... |
3 | 3 | 2594 | 40.02 | 40.6949 | 85625.567637 | [[0, 2], [0, 17], [1, 3], [1, 18], [1, 196], [... |
4 | 4 | 3132 | 53.36 | 53.557 | 91548.275431 | [[0, 32], [0, 198], [0, 201], [0, 202], [0, 39... |
vario_exp_az_east.head()
rank | npairs | lag | avgdist | vario | indices | |
---|---|---|---|---|---|---|
0 | 0 | 186 | 0.00 | 4.67994 | 38188.530618 | [[4, 403], [18, 345], [19, 347], [19, 348], [2... |
1 | 1 | 1253 | 13.34 | 14.6557 | 71891.404984 | [[2, 17], [2, 400], [3, 196], [3, 202], [3, 34... |
2 | 2 | 2005 | 26.68 | 27.0867 | 94645.106242 | [[0, 15], [0, 209], [0, 212], [1, 15], [1, 16]... |
3 | 3 | 2217 | 40.02 | 40.6145 | 94929.345555 | [[0, 30], [0, 31], [0, 210], [0, 349], [0, 350... |
4 | 4 | 2410 | 53.36 | 53.6488 | 98562.561701 | [[0, 32], [0, 45], [0, 208], [0, 211], [0, 250... |
vario_exp = [vario_exp_az_north, vario_exp_az_east]
It also contains useful metadata
vario_exp_az_north.attrs
{'method': 'semivariogram', 'geographic_azimuth': 0, 'dip': 0, 'pitch': 0, 'tol': 6.67, 'atol': 45.0, 'trend': 90.0, 'plunge': 0.0, 'variable_mean': 435.2987234042554, 'variable_var': 89738.0559132639, 'number_of_lags': 10, 'input_lags': array([ 0. , 13.34, 26.68, 40.02, 53.36, 66.7 , 80.04, 93.38, 106.72, 120.06]), 'slice_width': nan, 'slice_height': nan, 'dim': 2, 'directional_vector': array([6.123234e-17, 1.000000e+00])}
We can plot the experimental semivariogram dataframe with the following command:
glm.plot_semivariogram(variograms=vario_exp, model=None, display_npairs=True)
(<Figure size 432x288 with 1 Axes>, <AxesSubplot:xlabel='Average distance', ylabel='Semivariance'>)
VarioMap and VarioContour¶
A VarioMap and a VarioContour can also be achieved.
glm.vario_map(coords = coords,
values=values,
lags=np.arange(start=0, stop=150, step=15),
tol=50,
n_az=20,
atol=20)
glm.vario_contour(coords = coords,
values=values,
lags=np.arange(start=0, stop=150, step=15),
tol=50,
n_az=20,
atol=20)
Variogram model¶
Experimental semivariogram can be fitted the following way. We first need to define a target grid.
# create a grid
target_grid = glm.create_grid_2d(origin=[0, 0], n_dim=[30, 30], cell_size=[5, 5], angle=[0.0, 0.0, 0.0])
We also need to define a covariance model for fitting
# define a covariance model
cov = glm.Nugget() + 0.5 * glm.Spherical(2)
print(cov)
Model with 2 components Component 1 : Variance : 1 Covariance type : Nugget Component 2 : Variance : 0.5 Covariance type : Spherical Scales : [1. 1.] Angles : 0.0 Rotation matrix [[ 6.123234e-17 -1.000000e+00] [ 1.000000e+00 6.123234e-17]] Total sill = 1.5
We can fit the experimental variogram
# fit model
cov_model = glm.model_fit(variograms=[vario_exp_az_north, vario_exp_az_east], cov=cov, constraints=None)
print(cov_model)
Model with 2 components Component 1 : Variance : 38554.06094994866 Covariance type : Nugget Component 2 : Variance : 56089.38951761728 Covariance type : Spherical Scales : [66.4834731 30.59535845] Angles : 10.424245153273477 Rotation matrix [[ 0.18093533 -0.98349499] [ 0.98349499 0.18093533]] Total sill = 94643.45046756594
glm.plot_semivariogram([vario_exp_az_north, vario_exp_az_east],
model=cov_model,
model_angles=[{"azi":0.,"dip":0.,"pitch":0.,"label":"N00"}],
display_npairs=True)
(<Figure size 432x288 with 1 Axes>, <AxesSubplot:xlabel='Average distance', ylabel='Semivariance'>)
print(cov_model)
Model with 2 components Component 1 : Variance : 38554.06094994866 Covariance type : Nugget Component 2 : Variance : 56089.38951761728 Covariance type : Spherical Scales : [66.4834731 30.59535845] Angles : 10.424245153273477 Rotation matrix [[ 0.18093533 -0.98349499] [ 0.98349499 0.18093533]] Total sill = 94643.45046756594
We apply model regularization with the following call:
pattern_generator = glm.GridPatternGenerator(target_grid, [5, 5])
mr = glm.model_regularize(vario_exp_az_north, cov, pattern_generator, nlag=10, border_ratio=1, normalize=True)
mr
lag | vario | avgdist | npairs | |
---|---|---|---|---|
0 | 0.00 | 0.000000 | 0.00 | 1.0 |
1 | 13.34 | 0.251511 | 13.34 | 1.0 |
2 | 26.68 | 0.559844 | 26.68 | 1.0 |
3 | 40.02 | 0.810363 | 40.02 | 1.0 |
4 | 53.36 | 0.963194 | 53.36 | 1.0 |
5 | 66.70 | 0.999988 | 66.70 | 1.0 |
6 | 80.04 | 1.000000 | 80.04 | 1.0 |
7 | 93.38 | 1.000000 | 93.38 | 1.0 |
8 | 106.72 | 1.000000 | 106.72 | 1.0 |
9 | 120.06 | 1.000000 | 120.06 | 1.0 |
Finally, the regularized experimental variogram can be plotted with the following call:
glm.plot_semivariogram([mr])
(<Figure size 432x288 with 1 Axes>, <AxesSubplot:xlabel='Average distance', ylabel='Semivariance'>)
Weighted variography¶
Declustering weights can be applied for variography. Some preprocessing is needed before applying weights.
weights_path = "../../tests/data/variography/weights_dataset.csv"
# load dataset
data = pd.read_csv(weights_path)
data = data[['x', 'y', 'z', 'weights', 'n']]
data = data.dropna(axis=0)
data.head()
x | y | z | weights | n | |
---|---|---|---|---|---|
0 | 324227.80 | 313205.87 | 693.82 | 0.60 | 0.44 |
1 | 324252.41 | 313302.14 | 721.86 | 2.38 | 0.55 |
2 | 324292.15 | 313258.03 | 721.27 | 2.38 | 0.56 |
3 | 324302.00 | 313328.62 | 712.68 | 2.38 | 0.81 |
4 | 324311.92 | 313321.73 | 721.44 | 2.38 | 0.55 |
Weights can be extracted from the input dataframe and given as parameter for experimental variogram computation.
coords = data[['x', 'y', 'z']].to_numpy()
values = data[['n']].to_numpy()
weights = data[['weights']].to_numpy()
lags, tol = glm.generate_lags(lag=2.0, plag=50, nlags=10)
# compute weighted variogram
vario_weighted = glm.variogram(coords=coords,
values=values,
lags=lags,
tol=tol,
geographic_azimuth=0,
dip=0,
pitch=0,
atol=90,
slice_width=np.nan,
slice_height=np.nan,
method="semivariogram",
weights=weights)
# compute unweighted variogram
vario_unweighted = glm.variogram(coords=coords,
values=values,
lags=lags,
tol=tol,
geographic_azimuth=0,
dip=0,
pitch=0,
atol=90,
slice_width=np.nan,
slice_height=np.nan,
method="semivariogram",
weights=None)
We notice from the following experimental variogram outputs that the calculus retains the same number of pairs. But the weights give different experimental variogram data.
vario_weighted
rank | npairs | lag | avgdist | vario | indices | |
---|---|---|---|---|---|---|
0 | 1 | 121 | 2.0 | 2.45876 | 0.014602 | [[6, 7], [7, 8], [8, 9], [9, 10], [10, 11], [1... |
1 | 2 | 25 | 4.0 | 4.2656 | 0.040026 | [[11, 13], [19, 21], [25, 27], [31, 33], [36, ... |
2 | 3 | 82 | 6.0 | 5.19623 | 0.033027 | [[5, 78], [5, 79], [5, 80], [6, 8], [7, 9], [8... |
3 | 4 | 82 | 8.0 | 7.5396 | 0.041521 | [[5, 81], [6, 9], [7, 10], [8, 11], [9, 12], [... |
4 | 5 | 59 | 10.0 | 10.0227 | 0.037291 | [[5, 82], [6, 10], [7, 11], [8, 12], [9, 13], ... |
5 | 6 | 48 | 12.0 | 12.3549 | 0.025459 | [[0, 58], [0, 59], [0, 60], [0, 61], [0, 62], ... |
6 | 7 | 32 | 14.0 | 14.093 | 0.035080 | [[0, 56], [0, 57], [0, 63], [3, 4], [7, 13], [... |
7 | 8 | 45 | 16.0 | 15.5143 | 0.052427 | [[0, 55], [0, 64], [6, 12], [21, 23], [41, 92]... |
8 | 9 | 47 | 18.0 | 17.7994 | 0.037545 | [[0, 54], [1, 15], [5, 83], [5, 84], [6, 13], ... |
vario_unweighted
rank | npairs | lag | avgdist | vario | indices | |
---|---|---|---|---|---|---|
0 | 1 | 121 | 2.0 | 2.45876 | 0.014835 | [[6, 7], [7, 8], [8, 9], [9, 10], [10, 11], [1... |
1 | 2 | 25 | 4.0 | 4.2656 | 0.035550 | [[11, 13], [19, 21], [25, 27], [31, 33], [36, ... |
2 | 3 | 82 | 6.0 | 5.19623 | 0.022781 | [[5, 78], [5, 79], [5, 80], [6, 8], [7, 9], [8... |
3 | 4 | 82 | 8.0 | 7.5396 | 0.029737 | [[5, 81], [6, 9], [7, 10], [8, 11], [9, 12], [... |
4 | 5 | 59 | 10.0 | 10.0227 | 0.026775 | [[5, 82], [6, 10], [7, 11], [8, 12], [9, 13], ... |
5 | 6 | 48 | 12.0 | 12.3549 | 0.023192 | [[0, 58], [0, 59], [0, 60], [0, 61], [0, 62], ... |
6 | 7 | 32 | 14.0 | 14.093 | 0.034855 | [[0, 56], [0, 57], [0, 63], [3, 4], [7, 13], [... |
7 | 8 | 45 | 16.0 | 15.5143 | 0.035893 | [[0, 55], [0, 64], [6, 12], [21, 23], [41, 92]... |
8 | 9 | 47 | 18.0 | 17.7994 | 0.038454 | [[0, 54], [1, 15], [5, 83], [5, 84], [6, 13], ... |
Both experimental variograms can be plotted with the same previously seen functions.
vario_exp = [vario_weighted, vario_unweighted]
glm.plot_semivariogram(variograms=vario_exp, model=None)
(<Figure size 432x288 with 1 Axes>, <AxesSubplot:xlabel='Average distance', ylabel='Semivariance'>)