simulations¶
Simulations¶
Overview¶
The purpose of this notebook is to show how the geolime library can be used for simulating a random function Z. A regionalized random variable z is viewed as one realization of the random function Z(x). Simulation is introduced as a realization of a probabilistic model. A conditional simulation is a simulation that honors the available data. A non conditional simulation does infer directly on data, but on its covariance model. Both approaches will be covered in the following sections.
Dataset¶
The 2D Walker Lake dataset will be used for demonstration purposes.
Required libraries¶
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import geolime as glm
Dataset preprocessing¶
walker_lake_path = "../../tests/data/walker_dataset/walker.csv"
data = pd.read_csv(walker_lake_path, sep= ' ')
data.loc[data['v'] > 1e10, 'v'] = 0
data.head()
id | x | y | u | v | t | |
---|---|---|---|---|---|---|
0 | 1 | 11 | 8 | 0.0 | 0.0 | 2 |
1 | 2 | 8 | 30 | 0.0 | 0.0 | 2 |
2 | 3 | 9 | 48 | 224.4 | 0.0 | 2 |
3 | 4 | 8 | 69 | 434.4 | 0.0 | 2 |
4 | 5 | 9 | 90 | 412.1 | 0.0 | 2 |
sample_coords = data[['x', 'y']].to_numpy()
sample_u = data[['u']].to_numpy()
Objects declaration¶
cov_model = 29535 * glm.Nugget() + 31156 * glm.Spherical(2,angles=339,scales=[30,37]) + 35487 * glm.Spherical(2,angles=160,scales=[30,90])
target_grid = glm.create_grid_2d(origin=[0, 0], cell_size=[10, 10], n_dim=[26, 30], angle=[0, 0, 0])
pattern_smu = glm.GridPatternGenerator(target_grid, [5, 5])
Variography¶
The experimental variogram help us analyze the spatial structure of a regionalized variable z(x). It will be computed and fitted with a covariance model, which is the structure function of the random function to be analyzed.
As the samples are not Gaussian, it is required to retrieve the variogram model in the Gaussian space using an anamorphosis. This operation can be done using the following class object.
anam = glm.Anamorphosis(sample_u.flatten())
coeffs_point = anam.hermite_coefficients(500)
Experimental variogram is the computed along two azimuth directions with the following calls.
lags, tol = glm.generate_lags(lag=10, plag=50, nlags=10)
vario_gauss_exp_160 = glm.variogram(coords=sample_coords, values=anam.gauss_scores.reshape(470,1), lags=lags,
tol=tol, geographic_azimuth=160, dip=0, pitch=0,
atol=40, slice_width=np.nan, slice_height=np.nan,
method="semivariogram")
vario_gauss_exp_60 = glm.variogram(coords=sample_coords, values=anam.gauss_scores.reshape(470,1), lags=lags,
tol=tol, geographic_azimuth=60, dip=0, pitch=0,
atol=40, slice_width=np.nan, slice_height=np.nan,
method="semivariogram")
glm.plot_semivariogram(variograms=[vario_gauss_exp_160,vario_gauss_exp_60], model=None)
plt.xlabel('Lag (m)')
plt.ylabel('Variance (ppm^2))')
plt.title('Experimental variogram in Gaussian space');
Text(0.5, 1.0, 'Experimental variogram in Gaussian space')
A covariance model can now be fitted on both directions. Here we define the covariance model of choice.
cov_gauss = glm.Nugget() + 0.5 * glm.Spherical(2) + 0.5 * glm.Spherical(2)
cov_gauss_model = glm.model_fit(variograms=[vario_gauss_exp_160, vario_gauss_exp_60], cov=cov_gauss, constraints=None)
glm.plot_semivariogram(variograms=[vario_gauss_exp_160, vario_gauss_exp_60], model= cov_gauss_model);
(<Figure size 432x288 with 1 Axes>, <AxesSubplot:xlabel='Average distance', ylabel='Semivariance'>)
print(cov_gauss_model)
Model with 3 components Component 1 : Variance : 0.24281509823351183 Covariance type : Nugget Component 2 : Variance : 0.24939042694795258 Covariance type : Spherical Scales : [33.03373991 17.31740456] Angles : 35.4243588578317 Rotation matrix [[ 0.57962767 -0.81488145] [ 0.81488145 0.57962767]] Component 3 : Variance : 0.577185656821309 Covariance type : Spherical Scales : [89.99999453 40.29051323] Angles : 20.00192782357825 Rotation matrix [[ 0.34205176 -0.93968111] [ 0.93968111 0.34205176]] Total sill = 1.0693911820027733
Change of support coefficient is required so as to help the operator to evaluate the simulation on a block. Its computation relies on the previously adjusted covariance model in Gaussian space.
r = np.sqrt(glm.cvv(cov_gauss_model, pattern_smu))
Block anamorphosis coefficients can the be obtained by applying the following formula.
coeffs_block = coeffs_point * np.power(r, range(0, len(coeffs_point)))
Model regularization is now required so as to take into account the support effect (transition from point to block).
varioreg_160 = glm.model_regularize(vario_gauss_exp_160, cov_gauss_model, pattern_smu, nlag=11, normalize=True)
varioreg_60 = glm.model_regularize(vario_gauss_exp_60, cov_gauss_model, pattern_smu, nlag=11, normalize=True)
cov_gauss_block = glm.model_fit([varioreg_160, varioreg_60], glm.Spherical(2), constraints=[{'sill_fixed': 1.}])
neigh=glm.Neighborhood()
Once again, a model can be adjusted versus a Spherical covariance structure, and plotted along with the two directions of the regularized models.
glm.plot_semivariogram(variograms=[varioreg_160,varioreg_60], model = cov_gauss_block);
(<Figure size 432x288 with 1 Axes>, <AxesSubplot:xlabel='Average distance', ylabel='Semivariance'>)
Non conditional simulations¶
As stated in the overview, non conditional simulation doesn't require conditional assumptions on the observed data. It only requires a covariance model in order to be operated. geolime implements Turning Bands simulation technique, which generates unconditional simulation realizations based on isotropic covariance or variogram models. This method starts with the principle that it is easier to simulate along a line where the points are sequenced, rather than directly in 2D or 3D.
Turning Bands method is applied in two steps. Firstly, it assumes a Spherical covariance model as input and computes the spectral measure, derived from the Bochner theorem. Then, at each point in the dimensional space defined by the input grid, all the values from the bands will be summed to obtain a realization for each point.
Zs = glm.simu_grid_init(cov_gauss_block, target_grid, nbsimus=1, nbands=1000)
a = glm.simu_grid(Zs, target_grid)
glm.display_grid(target_grid, np.mean(a,axis=0), sample_coords, z=None, title="Non conditional simulation on a grid")
(<Figure size 432x288 with 2 Axes>, <AxesSubplot:title={'center':'Non conditional simulation on a grid'}>)
Unconditional simulations is generally considered to be limited, as the previous visual result does not provide a convincing output. The following section introduces kriging for conditional simulations on the observed data.
Conditional simulations¶
Conditional simulation of turning bands method generally requires post-processing by kriging. It is once again applied in two steps. Firstly the block simulation, that applies unconditional simulations in a similar fashion as stated in the previous section. But this steps also requires observed values as input in order to prepare effective kriging by grouping the data in common blocks. The following chunk of code prepares 100 simulations.
ZSimu, XcoordsC, nPointsInBlocks, SimuRes = glm.direct_block_simu_init(
target_grid, sample_coords, anam.gauss_scores.flatten(),
cov_gauss_block, r, nbands=1000, nbsimus=100)
Once this preliminary step has been cleared, simple kriging is then applied on the unconditional simulations. This process is effective in the Gaussian space.
Result = glm.direct_block_simulations_on_a_set(target_grid, neigh, XcoordsC,
cov_gauss_block, ZSimu, r,
SimuRes, nPointsInBlocks,
coeffs_block, indices=None)
The following plot displays the first simulation result.
ind = 0
glm.display_grid(target_grid, Result[ind, :], sample_coords, z=None,
title="Simu number " + str(ind));
(<Figure size 432x288 with 2 Axes>, <AxesSubplot:title={'center':'Simu number 0'}>)
Next plot shows the average of all the 100 computed simulations.
glm.display_grid(target_grid, np.mean(Result, 0), sample_coords,
title="Average over " + str(Result.shape[0]) + " simulations");
(<Figure size 432x288 with 2 Axes>, <AxesSubplot:title={'center':'Average over 100 simulations'}>)
We can also display the standard deviation against the previous mean.
glm.display_grid(target_grid, np.sqrt(np.var(Result, 0)), sample_coords, title="St-dev over " + str(Result.shape[0]) + " simulations");
Simulation results can be averaged against a cutoff value. The following chunk of code computes the probability of having a simulation result greater than a defined cutoff value.
cutoff = 600.
proba = np.mean(Result > cutoff, 0)
glm.display_grid(target_grid, proba, sample_coords,
title="Probability to be greater than " + str(cutoff) +
" (" + str(Result.shape[0]) + " simulations)");
(<Figure size 432x288 with 2 Axes>, <AxesSubplot:title={'center':'Probability to be greater than 600.0 (100 simulations)'}>)
This feature comes handy when estimating a proportion of metal tailored with an operator cutoff value.
glm.display_grid(target_grid, np.mean((Result > cutoff) * Result, 0),
sample_coords, title="Metal over " + str(cutoff) +
" (" + str(Result.shape[0]) + " simulation)");
(<Figure size 432x288 with 2 Axes>, <AxesSubplot:title={'center':'Metal over 600.0 (100 simulation)'}>)
glm.display_grid(target_grid, np.sqrt(proba * (1. - proba)), sample_coords,
title="Conditional standard deviation of indicator " + str(cutoff) +
" (" + str(Result.shape[0]) + " simulation)");
(<Figure size 432x288 with 2 Axes>, <AxesSubplot:title={'center':'Conditional standard deviation of indicator 600.0 (100 simulation)'}>)