variography¶

Variography¶

Data¶

Experimental semivariogram¶

Variogram model¶

Weighted variography¶

VarioMap and VarioContour¶

variography

In [1]:

%matplotlib inline

In [2]:

# loading packages
import pandas as pd
import numpy as np

geolime package can be imported as such:

In [3]:

import geolime as glm

A widely distributed dataset is available for testing. Here we load some data and display the first five rows.

In [4]:

walker_lake_path = "../../tests/data/walker_dataset/walker.csv"
data = pd.read_csv(walker_lake_path, sep= ' ')
data.head()

Out[4]:

Prepare arrays to be used as input for experimental semivariogram computation.

In [5]:

# extract coordinates and values
coords_df = data[['x', 'y']]
coords = coords_df.to_numpy()
values_df = data[['u']]
values = values_df.to_numpy()

Define input lags and compute the experimental semivariogram.

In [6]:

# define input lags
lags, tol = glm.generate_lags(lag=13.34, plag=50, nlags=10)

# compute variogram
vario_exp_az_north = glm.variogram(coords=coords, values=values, lags=lags,
                      tol=tol, geographic_azimuth=0, dip=0, pitch=0,
                      atol=45, slice_width=np.nan, slice_height=np.nan,
                      method="semivariogram")

# compute variogram
vario_exp_az_east = glm.variogram(coords=coords, values=values, lags=lags,
                      tol=tol, geographic_azimuth=90, dip=0, pitch=0,
                      atol=45, slice_width=np.nan, slice_height=np.nan,
                      method="semivariogram")

The resulting dataframe contains the following columns

In [7]:

vario_exp_az_north.head()

Out[7]:

	rank	npairs	lag	avgdist	vario	indices
0	0	9	0.00	3.54258	38432.551667	[[37, 354], [54, 371], [63, 375], [234, 420], ...
1	1	1247	13.34	13.4394	56585.788464	[[1, 2], [2, 202], [3, 200], [3, 401], [4, 200...
2	2	2319	26.68	26.7378	71857.304101	[[0, 1], [0, 16], [1, 17], [1, 202], [1, 400],...
3	3	2594	40.02	40.6949	85625.567637	[[0, 2], [0, 17], [1, 3], [1, 18], [1, 196], [...
4	4	3132	53.36	53.557	91548.275431	[[0, 32], [0, 198], [0, 201], [0, 202], [0, 39...

In [8]:

vario_exp_az_east.head()

Out[8]:

	rank	npairs	lag	avgdist	vario	indices
0	0	186	0.00	4.67994	38188.530618	[[4, 403], [18, 345], [19, 347], [19, 348], [2...
1	1	1253	13.34	14.6557	71891.404984	[[2, 17], [2, 400], [3, 196], [3, 202], [3, 34...
2	2	2005	26.68	27.0867	94645.106242	[[0, 15], [0, 209], [0, 212], [1, 15], [1, 16]...
3	3	2217	40.02	40.6145	94929.345555	[[0, 30], [0, 31], [0, 210], [0, 349], [0, 350...
4	4	2410	53.36	53.6488	98562.561701	[[0, 32], [0, 45], [0, 208], [0, 211], [0, 250...

In [9]:

vario_exp = [vario_exp_az_north, vario_exp_az_east]

It also contains useful metadata

In [10]:

vario_exp_az_north.attrs

Out[10]:

{'method': 'semivariogram',
 'geographic_azimuth': 0,
 'dip': 0,
 'pitch': 0,
 'tol': 6.67,
 'atol': 45.0,
 'trend': 90.0,
 'plunge': 0.0,
 'variable_mean': 435.2987234042554,
 'variable_var': 89738.0559132639,
 'number_of_lags': 10,
 'input_lags': array([  0.  ,  13.34,  26.68,  40.02,  53.36,  66.7 ,  80.04,  93.38,
        106.72, 120.06]),
 'slice_width': nan,
 'slice_height': nan,
 'dim': 2,
 'directional_vector': array([6.123234e-17, 1.000000e+00])}

We can plot the experimental semivariogram dataframe with the following command:

In [11]:

glm.plot_semivariogram(variograms=vario_exp, model=None, display_npairs=True)

Out[11]:

(<Figure size 432x288 with 1 Axes>,
 <AxesSubplot:xlabel='Average distance', ylabel='Semivariance'>)

A VarioMap and a VarioContour can also be achieved.

In [12]:

glm.vario_map(coords = coords,
                  values=values,
                  lags=np.arange(start=0, stop=150, step=15),
                  tol=50,
                  n_az=20,
                  atol=20)

In [13]:

glm.vario_contour(coords = coords,
                  values=values,
                  lags=np.arange(start=0, stop=150, step=15),
                  tol=50,
                  n_az=20,
                  atol=20)

Experimental semivariogram can be fitted the following way. We first need to define a target grid.

In [14]:

# create a grid
target_grid = glm.create_grid_2d(origin=[0, 0], n_dim=[30, 30], cell_size=[5, 5], angle=[0.0, 0.0, 0.0])

We also need to define a covariance model for fitting

In [15]:

# define a covariance model
cov = glm.Nugget() + 0.5 * glm.Spherical(2)

In [16]:

print(cov)

Model with 2 components

Component 1 :
Variance : 1
Covariance type : Nugget

Component 2 :
Variance : 0.5
Covariance type : Spherical
Scales : [1. 1.]
Angles : 0.0
Rotation matrix
 [[ 6.123234e-17 -1.000000e+00]
 [ 1.000000e+00  6.123234e-17]]

Total sill = 1.5

We can fit the experimental variogram

In [17]:

# fit model
cov_model = glm.model_fit(variograms=[vario_exp_az_north, vario_exp_az_east], cov=cov, constraints=None)

In [18]:

print(cov_model)

Model with 2 components

Component 1 :
Variance : 38554.06094994866
Covariance type : Nugget

Component 2 :
Variance : 56089.38951761728
Covariance type : Spherical
Scales : [66.4834731  30.59535845]
Angles : 10.424245153273477
Rotation matrix
 [[ 0.18093533 -0.98349499]
 [ 0.98349499  0.18093533]]

Total sill = 94643.45046756594

In [19]:

glm.plot_semivariogram([vario_exp_az_north, vario_exp_az_east],
                       model=cov_model,
                       model_angles=[{"azi":0.,"dip":0.,"pitch":0.,"label":"N00"}],
                       display_npairs=True)

Out[19]:

(<Figure size 432x288 with 1 Axes>,
 <AxesSubplot:xlabel='Average distance', ylabel='Semivariance'>)

In [20]:

print(cov_model)

Model with 2 components

Component 1 :
Variance : 38554.06094994866
Covariance type : Nugget

Component 2 :
Variance : 56089.38951761728
Covariance type : Spherical
Scales : [66.4834731  30.59535845]
Angles : 10.424245153273477
Rotation matrix
 [[ 0.18093533 -0.98349499]
 [ 0.98349499  0.18093533]]

Total sill = 94643.45046756594

We apply model regularization with the following call:

In [21]:

pattern_generator = glm.GridPatternGenerator(target_grid, [5, 5])
mr = glm.model_regularize(vario_exp_az_north, cov, pattern_generator, nlag=10, border_ratio=1, normalize=True)
mr

Out[21]:

	lag	vario	avgdist	npairs
0	0.00	0.000000	0.00	1.0
1	13.34	0.251511	13.34	1.0
2	26.68	0.559844	26.68	1.0
3	40.02	0.810363	40.02	1.0
4	53.36	0.963194	53.36	1.0
5	66.70	0.999988	66.70	1.0
6	80.04	1.000000	80.04	1.0
7	93.38	1.000000	93.38	1.0
8	106.72	1.000000	106.72	1.0
9	120.06	1.000000	120.06	1.0

Finally, the regularized experimental variogram can be plotted with the following call:

In [22]:

glm.plot_semivariogram([mr])

Out[22]:

(<Figure size 432x288 with 1 Axes>,
 <AxesSubplot:xlabel='Average distance', ylabel='Semivariance'>)

1.000000e+31

1.000000e+31

1.000000e+31

1.000000e+31

1.000000e+31

Declustering weights can be applied for variography. Some preprocessing is needed before applying weights.

In [23]:

weights_path = "../../tests/data/variography/weights_dataset.csv"
# load dataset
data = pd.read_csv(weights_path)
data = data[['x', 'y', 'z', 'weights', 'n']]
data = data.dropna(axis=0)
data.head()

Out[23]:

	x	y	z	weights	n
0	324227.80	313205.87	693.82	0.60	0.44
1	324252.41	313302.14	721.86	2.38	0.55
2	324292.15	313258.03	721.27	2.38	0.56
3	324302.00	313328.62	712.68	2.38	0.81
4	324311.92	313321.73	721.44	2.38	0.55

Weights can be extracted from the input dataframe and given as parameter for experimental variogram computation.

In [24]:

coords  = data[['x', 'y', 'z']].to_numpy()
values  = data[['n']].to_numpy()
weights = data[['weights']].to_numpy()

lags, tol = glm.generate_lags(lag=2.0, plag=50, nlags=10)

# compute weighted variogram
vario_weighted = glm.variogram(coords=coords,
                               values=values,
                               lags=lags,
                               tol=tol,
                               geographic_azimuth=0,
                               dip=0,
                               pitch=0,
                               atol=90,
                               slice_width=np.nan,
                               slice_height=np.nan,
                               method="semivariogram",
                               weights=weights)

# compute unweighted variogram
vario_unweighted = glm.variogram(coords=coords,
                                 values=values,
                                 lags=lags,
                                 tol=tol,
                                 geographic_azimuth=0,
                                 dip=0,
                                 pitch=0,
                                 atol=90,
                                 slice_width=np.nan,
                                 slice_height=np.nan,
                                 method="semivariogram",
                                 weights=None)

We notice from the following experimental variogram outputs that the calculus retains the same number of pairs. But the weights give different experimental variogram data.

In [25]:

vario_weighted

Out[25]:

	rank	npairs	lag	avgdist	vario	indices
0	1	121	2.0	2.45876	0.014602	[[6, 7], [7, 8], [8, 9], [9, 10], [10, 11], [1...
1	2	25	4.0	4.2656	0.040026	[[11, 13], [19, 21], [25, 27], [31, 33], [36, ...
2	3	82	6.0	5.19623	0.033027	[[5, 78], [5, 79], [5, 80], [6, 8], [7, 9], [8...
3	4	82	8.0	7.5396	0.041521	[[5, 81], [6, 9], [7, 10], [8, 11], [9, 12], [...
4	5	59	10.0	10.0227	0.037291	[[5, 82], [6, 10], [7, 11], [8, 12], [9, 13], ...
5	6	48	12.0	12.3549	0.025459	[[0, 58], [0, 59], [0, 60], [0, 61], [0, 62], ...
6	7	32	14.0	14.093	0.035080	[[0, 56], [0, 57], [0, 63], [3, 4], [7, 13], [...
7	8	45	16.0	15.5143	0.052427	[[0, 55], [0, 64], [6, 12], [21, 23], [41, 92]...
8	9	47	18.0	17.7994	0.037545	[[0, 54], [1, 15], [5, 83], [5, 84], [6, 13], ...

In [26]:

vario_unweighted

Out[26]:

	rank	npairs	lag	avgdist	vario	indices
0	1	121	2.0	2.45876	0.014835	[[6, 7], [7, 8], [8, 9], [9, 10], [10, 11], [1...
1	2	25	4.0	4.2656	0.035550	[[11, 13], [19, 21], [25, 27], [31, 33], [36, ...
2	3	82	6.0	5.19623	0.022781	[[5, 78], [5, 79], [5, 80], [6, 8], [7, 9], [8...
3	4	82	8.0	7.5396	0.029737	[[5, 81], [6, 9], [7, 10], [8, 11], [9, 12], [...
4	5	59	10.0	10.0227	0.026775	[[5, 82], [6, 10], [7, 11], [8, 12], [9, 13], ...
5	6	48	12.0	12.3549	0.023192	[[0, 58], [0, 59], [0, 60], [0, 61], [0, 62], ...
6	7	32	14.0	14.093	0.034855	[[0, 56], [0, 57], [0, 63], [3, 4], [7, 13], [...
7	8	45	16.0	15.5143	0.035893	[[0, 55], [0, 64], [6, 12], [21, 23], [41, 92]...
8	9	47	18.0	17.7994	0.038454	[[0, 54], [1, 15], [5, 83], [5, 84], [6, 13], ...

Both experimental variograms can be plotted with the same previously seen functions.

In [27]:

vario_exp = [vario_weighted, vario_unweighted]
glm.plot_semivariogram(variograms=vario_exp, model=None)

Out[27]:

(<Figure size 432x288 with 1 Axes>,
 <AxesSubplot:xlabel='Average distance', ylabel='Semivariance'>)