Simple kriging and ordinary kriging#

Here we implement simple kriging and ordinary kriging for a case study using variogram parameters from the Variogram Model example.

Kriging is a deterministic interpolation algorithm, where the goal is to minimize estimation variance, or optimize accuracy. Each interpolated value is the weighted sum of neighboring measurements. These weights are determined using the variogram so that the spatial structure of the data is accounted for. Each interpolated value \(Z^*\) at a location \(u\) is the weighted sum of neighboring measurements:

\[ Z^*(u_0) = \sum_{\alpha}^{N} \lambda_{\alpha} Z(u_{\alpha}), \]

where \(\lambda_{\alpha}\) are the weights on the \(N\) data points. These weights account for the variability of the measurements, their proximity to each other and the node being estimated, and the redundancy between nearby measurements. The variogram is also used to compute the uncertainty, or variance, at each location.

\[ \sigma_E^2(u_0) = C(0) - \sum_{\alpha}^{N} \lambda_{\alpha} C(u_0-u_{\alpha}). \]
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import make_axes_locatable
from matplotlib.colors import LightSource
from sklearn.preprocessing import QuantileTransformer
import skgstat as skg
from skgstat import models
import gstatsim as gs

Load and plot data#

df_bed = pd.read_csv('data/greenland_test_data.csv')

# remove erroneously high values due to bad bed picks
df_bed = df_bed[df_bed["Bed"] <= 700]  

# plot data
fig = plt.figure(figsize = (5,5))
ax = plt.gca()
im = ax.scatter(df_bed['X'], df_bed['Y'], c=df_bed['Bed'], vmin=-400, vmax=600, 
                     marker='.', s=0.5, cmap='gist_earth')
plt.title('Subglacial topography data')
plt.xlabel('X [m]'); plt.ylabel('Y [m]')
plt.locator_params(nbins=5)
plt.axis('scaled')

# make colorbar
divider = make_axes_locatable(ax)
cax = divider.append_axes('right', size='5%', pad=0.1)
cbar = plt.colorbar(im, ticks=np.linspace(-400, 600, 11), cax=cax)
cbar.set_label("Bed elevation [m]", rotation=270, labelpad=15)
plt.show()
_images/3_Simple_kriging_and_ordinary_kriging_3_0.png

Grid and transform data. Get variogram parameters#

See variogram tutorials for details

# grid data to 100 m resolution and remove coordinates with NaNs
res = 1000
df_grid, grid_matrix, rows, cols = gs.Gridding.grid_data(df_bed, 'X', 'Y', 'Bed', res)
df_grid = df_grid[df_grid["Z"].isnull() == False]
df_grid = df_grid.rename(columns = {"Z": "Bed"})

# normal score transformation
data = df_grid['Bed'].values.reshape(-1,1)
nst_trans = QuantileTransformer(n_quantiles=500, output_distribution="normal").fit(data)
df_grid['Nbed'] = nst_trans.transform(data) 

# compute experimental (isotropic) variogram
coords = df_grid[['X','Y']].values
values = df_grid['Nbed']

maxlag = 50000             # maximum range distance
n_lags = 70                # num of bins

V1 = skg.Variogram(coords, values, bin_func='even', n_lags=n_lags, 
                   maxlag=maxlag, normalize=False)

# use exponential variogram model
V1.model = 'exponential'
V1.parameters
[31852.813443811916, 0.7027482245687205, 0]

These outputs are the variogram range, sill, and nugget, respectively. We will need this information for kriging.

Initialize grid#

First we need to define a grid to interpolate. This stores an array of coordinates for the simulation.

# define coordinate grid
xmin = np.min(df_grid['X']); xmax = np.max(df_grid['X'])     # min and max x values
ymin = np.min(df_grid['Y']); ymax = np.max(df_grid['Y'])     # min and max y values

Pred_grid_xy = gs.Gridding.prediction_grid(xmin, xmax, ymin, ymax, res)

Simple kriging#

Simple kriging assumes that the mean is stationary and known. The data is treated as a residual from the mean, which is computed as the average of conditioning points. Sometimes this is a good assumption, but in some cases it is not. We will apply it here to see what happens.

# set variogram parameters
azimuth = 0
nugget = V1.parameters[2]

# the major and minor ranges are the same in this example because it is isotropic
major_range = V1.parameters[0]
minor_range = V1.parameters[0]
sill = V1.parameters[1]
vtype = 'Exponential'

# save variogram parameters as a list
vario = [azimuth, nugget, major_range, minor_range, sill, vtype]


k = 100         # number of neighboring data points used to estimate a given point
rad = 50000     # 50 km search radius

# est_SK is the estimate and var_SK is the variance
est_SK, var_SK = gs.Interpolation.skrige(Pred_grid_xy, df_grid, 'X', 'Y', 'Nbed', k, vario, rad)
  0%|                                                 | 0/22500 [00:00<?, ?it/s]
  0%|                                       | 17/22500 [00:00<02:12, 169.66it/s]
  0%|                                       | 34/22500 [00:00<02:19, 160.60it/s]
  0%|                                       | 51/22500 [00:00<02:18, 162.07it/s]
  0%|                                       | 68/22500 [00:00<02:24, 155.30it/s]
  0%|▏                                      | 85/22500 [00:00<02:20, 159.60it/s]
  0%|▏                                     | 105/22500 [00:00<02:10, 171.13it/s]
  1%|▏                                     | 123/22500 [00:00<02:15, 165.23it/s]
  1%|▏                                     | 140/22500 [00:00<02:17, 163.03it/s]
  1%|▎                                     | 161/22500 [00:00<02:08, 174.01it/s]
  1%|▎                                     | 180/22500 [00:01<02:05, 178.03it/s]
  1%|▎                                     | 198/22500 [00:01<02:14, 165.70it/s]
  1%|▎                                     | 215/22500 [00:01<02:24, 154.43it/s]
  1%|▍                                     | 231/22500 [00:01<02:23, 155.43it/s]
  1%|▍                                     | 248/22500 [00:01<02:22, 156.68it/s]
  1%|▍                                     | 264/22500 [00:01<02:25, 153.12it/s]
  1%|▍                                     | 280/22500 [00:01<02:28, 149.54it/s]
  1%|▍                                     | 296/22500 [00:01<02:28, 149.15it/s]
  1%|▌                                     | 313/22500 [00:01<02:24, 153.83it/s]
  1%|▌                                     | 329/22500 [00:02<02:22, 155.55it/s]
  2%|▌                                     | 345/22500 [00:02<02:27, 150.42it/s]
  2%|▌                                     | 361/22500 [00:02<02:34, 143.43it/s]
  2%|▋                                     | 376/22500 [00:02<02:37, 140.24it/s]
  2%|▋                                     | 392/22500 [00:02<02:34, 142.86it/s]
  2%|▋                                     | 409/22500 [00:02<02:26, 150.30it/s]
  2%|▋                                     | 425/22500 [00:02<02:33, 143.79it/s]
  2%|▋                                     | 440/22500 [00:02<02:35, 141.67it/s]
  2%|▊                                     | 456/22500 [00:02<02:30, 146.41it/s]
  2%|▊                                     | 473/22500 [00:03<02:26, 149.90it/s]
  2%|▊                                     | 489/22500 [00:03<02:27, 148.98it/s]
  2%|▊                                     | 504/22500 [00:03<02:33, 143.00it/s]
  2%|▉                                     | 519/22500 [00:03<02:38, 138.71it/s]
  2%|▉                                     | 534/22500 [00:03<02:35, 141.20it/s]
  2%|▉                                     | 549/22500 [00:03<02:32, 143.57it/s]
  3%|▉                                     | 564/22500 [00:03<02:32, 143.52it/s]
  3%|▉                                     | 579/22500 [00:03<02:36, 140.39it/s]
  3%|█                                     | 594/22500 [00:03<02:38, 137.98it/s]
  3%|█                                     | 610/22500 [00:04<02:35, 141.19it/s]
  3%|█                                     | 625/22500 [00:04<02:37, 139.30it/s]
  3%|█                                     | 640/22500 [00:04<02:34, 141.73it/s]
  3%|█                                     | 655/22500 [00:04<02:39, 137.38it/s]
  3%|█▏                                    | 669/22500 [00:04<02:41, 135.08it/s]
  3%|█▏                                    | 684/22500 [00:04<02:37, 138.84it/s]
  3%|█▏                                    | 698/22500 [00:04<02:37, 138.50it/s]
  3%|█▏                                    | 714/22500 [00:04<02:33, 141.87it/s]
  3%|█▏                                    | 729/22500 [00:04<02:35, 140.08it/s]
  3%|█▎                                    | 744/22500 [00:05<02:35, 139.48it/s]
  3%|█▎                                    | 759/22500 [00:05<02:32, 142.38it/s]
  3%|█▎                                    | 774/22500 [00:05<02:32, 142.87it/s]
  4%|█▎                                    | 789/22500 [00:05<02:31, 143.76it/s]
  4%|█▎                                    | 804/22500 [00:05<02:36, 139.02it/s]
  4%|█▍                                    | 818/22500 [00:05<02:40, 135.43it/s]
  4%|█▍                                    | 833/22500 [00:05<02:38, 137.09it/s]
  4%|█▍                                    | 847/22500 [00:05<02:42, 133.26it/s]
  4%|█▍                                    | 864/22500 [00:05<02:31, 143.09it/s]
  4%|█▍                                    | 879/22500 [00:05<02:34, 139.82it/s]
  4%|█▌                                    | 894/22500 [00:06<02:36, 137.68it/s]
  4%|█▌                                    | 911/22500 [00:06<02:30, 143.60it/s]
  4%|█▌                                    | 928/22500 [00:06<02:25, 148.39it/s]
  4%|█▌                                    | 943/22500 [00:06<02:30, 143.55it/s]
  4%|█▌                                    | 958/22500 [00:06<02:34, 139.06it/s]
  4%|█▋                                    | 972/22500 [00:06<02:38, 135.88it/s]
  4%|█▋                                    | 986/22500 [00:06<02:38, 135.80it/s]
  4%|█▋                                   | 1001/22500 [00:06<02:36, 136.98it/s]
  5%|█▋                                   | 1015/22500 [00:06<02:37, 136.74it/s]
  5%|█▋                                   | 1029/22500 [00:07<02:40, 134.19it/s]
  5%|█▋                                   | 1043/22500 [00:07<02:40, 133.44it/s]
  5%|█▋                                   | 1060/22500 [00:07<02:29, 143.51it/s]
  5%|█▊                                   | 1075/22500 [00:07<02:33, 139.16it/s]
  5%|█▊                                   | 1089/22500 [00:07<02:35, 137.71it/s]
  5%|█▊                                   | 1103/22500 [00:07<02:39, 134.32it/s]
  5%|█▊                                   | 1117/22500 [00:07<02:45, 129.39it/s]
  5%|█▊                                   | 1132/22500 [00:07<02:40, 132.79it/s]
  5%|█▉                                   | 1146/22500 [00:07<02:45, 129.19it/s]
  5%|█▉                                   | 1163/22500 [00:08<02:32, 139.63it/s]
  5%|█▉                                   | 1178/22500 [00:08<02:38, 134.26it/s]
  5%|█▉                                   | 1192/22500 [00:08<02:41, 131.56it/s]
  5%|█▉                                   | 1210/22500 [00:08<02:27, 144.57it/s]
  5%|██                                   | 1225/22500 [00:08<02:34, 137.46it/s]
  6%|██                                   | 1239/22500 [00:08<02:36, 135.59it/s]
  6%|██                                   | 1253/22500 [00:08<02:42, 130.45it/s]
  6%|██                                   | 1267/22500 [00:08<02:45, 128.38it/s]
  6%|██                                   | 1282/22500 [00:08<02:41, 131.51it/s]
  6%|██▏                                  | 1296/22500 [00:09<02:47, 126.75it/s]
  6%|██▏                                  | 1310/22500 [00:09<02:46, 127.65it/s]
  6%|██▏                                  | 1323/22500 [00:09<02:51, 123.51it/s]
  6%|██▏                                  | 1336/22500 [00:09<02:54, 120.98it/s]
  6%|██▏                                  | 1354/22500 [00:09<02:34, 136.60it/s]
  6%|██▏                                  | 1368/22500 [00:09<02:34, 136.67it/s]
  6%|██▎                                  | 1382/22500 [00:09<02:45, 127.53it/s]
  6%|██▎                                  | 1395/22500 [00:09<02:49, 124.70it/s]
  6%|██▎                                  | 1408/22500 [00:09<02:47, 125.71it/s]
  6%|██▎                                  | 1422/22500 [00:10<02:44, 127.89it/s]
  6%|██▎                                  | 1435/22500 [00:10<02:44, 127.80it/s]
  6%|██▍                                  | 1448/22500 [00:10<02:52, 122.19it/s]
  6%|██▍                                  | 1462/22500 [00:10<02:47, 125.52it/s]
  7%|██▍                                  | 1475/22500 [00:10<03:01, 115.57it/s]
  7%|██▍                                  | 1488/22500 [00:10<02:59, 117.13it/s]
  7%|██▍                                  | 1507/22500 [00:10<02:35, 135.39it/s]
  7%|██▌                                  | 1521/22500 [00:10<02:39, 131.46it/s]
  7%|██▌                                  | 1535/22500 [00:10<02:50, 123.23it/s]
  7%|██▌                                  | 1549/22500 [00:11<02:46, 125.85it/s]
  7%|██▌                                  | 1562/22500 [00:11<02:47, 124.87it/s]
  7%|██▌                                  | 1575/22500 [00:11<02:49, 123.09it/s]
  7%|██▌                                  | 1588/22500 [00:11<02:48, 123.93it/s]
  7%|██▋                                  | 1601/22500 [00:11<02:54, 120.03it/s]
  7%|██▋                                  | 1614/22500 [00:11<02:51, 121.91it/s]
  7%|██▋                                  | 1627/22500 [00:11<02:57, 117.34it/s]
  7%|██▋                                  | 1642/22500 [00:11<02:45, 125.99it/s]
  7%|██▋                                  | 1657/22500 [00:11<02:39, 130.81it/s]
  7%|██▋                                  | 1671/22500 [00:12<02:50, 122.21it/s]
  7%|██▊                                  | 1684/22500 [00:12<02:58, 116.38it/s]
  8%|██▊                                  | 1698/22500 [00:12<02:50, 122.35it/s]
  8%|██▊                                  | 1711/22500 [00:12<02:56, 117.91it/s]
  8%|██▊                                  | 1723/22500 [00:12<02:58, 116.42it/s]
  8%|██▊                                  | 1736/22500 [00:12<02:56, 117.52it/s]
  8%|██▊                                  | 1748/22500 [00:12<03:04, 112.35it/s]
  8%|██▉                                  | 1760/22500 [00:12<03:05, 111.78it/s]
  8%|██▉                                  | 1772/22500 [00:12<03:04, 112.63it/s]
  8%|██▉                                  | 1788/22500 [00:13<02:47, 123.73it/s]
  8%|██▉                                  | 1803/22500 [00:13<02:38, 130.95it/s]
  8%|██▉                                  | 1817/22500 [00:13<02:36, 132.31it/s]
  8%|███                                  | 1831/22500 [00:13<02:49, 122.05it/s]
  8%|███                                  | 1844/22500 [00:13<02:54, 118.59it/s]
  8%|███                                  | 1857/22500 [00:13<02:56, 117.07it/s]
  8%|███                                  | 1869/22500 [00:13<02:59, 115.09it/s]
  8%|███                                  | 1882/22500 [00:13<02:56, 116.92it/s]
  8%|███                                  | 1894/22500 [00:13<03:04, 111.42it/s]
  8%|███▏                                 | 1908/22500 [00:14<02:57, 116.09it/s]
  9%|███▏                                 | 1920/22500 [00:14<03:00, 114.20it/s]
  9%|███▏                                 | 1932/22500 [00:14<02:59, 114.63it/s]
  9%|███▏                                 | 1947/22500 [00:14<02:45, 124.54it/s]
  9%|███▏                                 | 1961/22500 [00:14<02:40, 128.20it/s]
  9%|███▏                                 | 1974/22500 [00:14<02:47, 122.73it/s]
  9%|███▎                                 | 1987/22500 [00:14<02:52, 119.03it/s]
  9%|███▎                                 | 1999/22500 [00:14<02:56, 116.41it/s]
  9%|███▎                                 | 2011/22500 [00:14<02:57, 115.25it/s]
  9%|███▎                                 | 2023/22500 [00:15<03:03, 111.46it/s]
  9%|███▎                                 | 2035/22500 [00:15<03:02, 112.11it/s]
  9%|███▎                                 | 2047/22500 [00:15<03:11, 106.78it/s]
  9%|███▍                                 | 2060/22500 [00:15<03:03, 111.31it/s]
  9%|███▍                                 | 2072/22500 [00:15<03:03, 111.33it/s]
  9%|███▍                                 | 2088/22500 [00:15<02:46, 122.77it/s]
  9%|███▍                                 | 2103/22500 [00:15<02:38, 128.59it/s]
  9%|███▍                                 | 2116/22500 [00:15<02:41, 126.58it/s]
  9%|███▌                                 | 2129/22500 [00:15<02:48, 120.73it/s]
 10%|███▌                                 | 2142/22500 [00:16<02:55, 116.29it/s]
 10%|███▌                                 | 2154/22500 [00:16<03:00, 112.85it/s]
 10%|███▌                                 | 2166/22500 [00:16<03:02, 111.40it/s]
 10%|███▌                                 | 2178/22500 [00:16<03:09, 107.35it/s]
 10%|███▌                                 | 2189/22500 [00:16<03:10, 106.90it/s]
 10%|███▌                                 | 2200/22500 [00:16<03:13, 105.04it/s]
 10%|███▋                                 | 2213/22500 [00:16<03:01, 111.89it/s]
 10%|███▋                                 | 2225/22500 [00:16<03:00, 112.09it/s]
 10%|███▋                                 | 2239/22500 [00:16<02:51, 118.35it/s]
 10%|███▋                                 | 2255/22500 [00:17<02:37, 128.47it/s]
 10%|███▋                                 | 2268/22500 [00:17<02:39, 126.64it/s]
 10%|███▊                                 | 2281/22500 [00:17<02:52, 117.01it/s]
 10%|███▊                                 | 2293/22500 [00:17<02:56, 114.60it/s]
 10%|███▊                                 | 2305/22500 [00:17<03:00, 111.73it/s]
 10%|███▊                                 | 2317/22500 [00:17<03:04, 109.44it/s]
 10%|███▊                                 | 2328/22500 [00:17<03:08, 106.86it/s]
 10%|███▊                                 | 2339/22500 [00:17<03:10, 106.07it/s]
 10%|███▊                                 | 2350/22500 [00:17<03:12, 104.52it/s]
 11%|███▉                                 | 2365/22500 [00:18<02:54, 115.21it/s]
 11%|███▉                                 | 2380/22500 [00:18<02:41, 124.90it/s]
 11%|███▉                                 | 2393/22500 [00:18<02:47, 120.23it/s]
 11%|███▉                                 | 2411/22500 [00:18<02:27, 136.38it/s]
 11%|███▉                                 | 2425/22500 [00:18<02:40, 124.98it/s]
 11%|████                                 | 2438/22500 [00:18<02:42, 123.64it/s]
 11%|████                                 | 2451/22500 [00:18<02:53, 115.42it/s]
 11%|████                                 | 2463/22500 [00:18<02:58, 112.00it/s]
 11%|████                                 | 2475/22500 [00:18<03:04, 108.57it/s]
 11%|████                                 | 2487/22500 [00:19<03:04, 108.61it/s]
 11%|████                                 | 2498/22500 [00:19<03:09, 105.50it/s]
 11%|████▏                                | 2510/22500 [00:19<03:04, 108.53it/s]
 11%|████▏                                | 2521/22500 [00:19<03:03, 108.65it/s]
 11%|████▏                                | 2535/22500 [00:19<02:50, 117.06it/s]
 11%|████▏                                | 2552/22500 [00:19<02:33, 130.30it/s]
 11%|████▏                                | 2566/22500 [00:19<02:39, 124.74it/s]
 11%|████▏                                | 2579/22500 [00:19<02:47, 119.06it/s]
 12%|████▎                                | 2592/22500 [00:19<02:57, 112.02it/s]
 12%|████▎                                | 2604/22500 [00:20<03:02, 108.98it/s]
 12%|████▎                                | 2615/22500 [00:20<03:07, 106.04it/s]
 12%|████▎                                | 2626/22500 [00:20<03:10, 104.36it/s]
 12%|████▎                                | 2638/22500 [00:20<03:08, 105.59it/s]
 12%|████▎                                | 2649/22500 [00:20<03:14, 102.25it/s]
 12%|████▍                                | 2663/22500 [00:20<02:58, 111.19it/s]
 12%|████▍                                | 2677/22500 [00:20<02:46, 119.02it/s]
 12%|████▍                                | 2690/22500 [00:20<02:52, 115.10it/s]
 12%|████▍                                | 2709/22500 [00:20<02:26, 134.76it/s]
 12%|████▍                                | 2723/22500 [00:21<02:44, 120.38it/s]
 12%|████▍                                | 2736/22500 [00:21<02:50, 115.97it/s]
 12%|████▌                                | 2748/22500 [00:21<02:58, 110.51it/s]
 12%|████▌                                | 2760/22500 [00:21<03:03, 107.33it/s]
 12%|████▌                                | 2771/22500 [00:21<03:07, 105.10it/s]
 12%|████▌                                | 2782/22500 [00:21<03:05, 106.38it/s]
 12%|████▌                                | 2793/22500 [00:21<03:08, 104.41it/s]
 12%|████▋                                 | 2804/22500 [00:21<03:18, 99.37it/s]
 13%|████▋                                | 2821/22500 [00:22<02:50, 115.70it/s]
 13%|████▋                                | 2833/22500 [00:22<02:55, 111.97it/s]
 13%|████▋                                | 2849/22500 [00:22<02:38, 123.88it/s]
 13%|████▋                                | 2863/22500 [00:22<02:38, 124.12it/s]
 13%|████▋                                | 2876/22500 [00:22<02:40, 122.17it/s]
 13%|████▊                                | 2889/22500 [00:22<02:43, 119.67it/s]
 13%|████▊                                | 2902/22500 [00:22<02:53, 113.02it/s]
 13%|████▊                                | 2914/22500 [00:22<03:01, 107.64it/s]
 13%|████▊                                | 2925/22500 [00:22<03:05, 105.35it/s]
 13%|████▊                                | 2936/22500 [00:23<03:03, 106.50it/s]
 13%|████▊                                | 2947/22500 [00:23<03:11, 102.23it/s]
 13%|████▊                                | 2958/22500 [00:23<03:13, 101.19it/s]
 13%|████▉                                | 2974/22500 [00:23<02:49, 115.42it/s]
 13%|████▉                                | 2986/22500 [00:23<02:48, 115.47it/s]
 13%|████▉                                | 3003/22500 [00:23<02:29, 130.77it/s]
 13%|████▉                                | 3017/22500 [00:23<02:31, 128.37it/s]
 13%|████▉                                | 3030/22500 [00:23<02:47, 116.57it/s]
 14%|█████                                | 3042/22500 [00:23<02:51, 113.76it/s]
 14%|█████                                | 3054/22500 [00:24<02:59, 108.45it/s]
 14%|█████                                | 3065/22500 [00:24<03:06, 104.21it/s]
 14%|█████                                | 3076/22500 [00:24<03:12, 101.10it/s]
 14%|█████                                | 3087/22500 [00:24<03:09, 102.53it/s]
 14%|█████                                | 3098/22500 [00:24<03:11, 101.29it/s]
 14%|█████                                | 3110/22500 [00:24<03:06, 104.11it/s]
 14%|█████▏                               | 3124/22500 [00:24<02:50, 113.34it/s]
 14%|█████▏                               | 3137/22500 [00:24<02:44, 117.86it/s]
 14%|█████▏                               | 3150/22500 [00:24<02:40, 120.24it/s]
 14%|█████▏                               | 3167/22500 [00:25<02:25, 132.78it/s]
 14%|█████▏                               | 3181/22500 [00:25<02:43, 118.26it/s]
 14%|█████▎                               | 3194/22500 [00:25<02:48, 114.92it/s]
 14%|█████▎                               | 3206/22500 [00:25<02:53, 111.13it/s]
 14%|█████▎                               | 3218/22500 [00:25<02:54, 110.46it/s]
 14%|█████▎                               | 3230/22500 [00:25<02:54, 110.19it/s]
 14%|█████▎                               | 3242/22500 [00:25<02:56, 108.94it/s]
 14%|█████▎                               | 3253/22500 [00:25<03:06, 103.22it/s]
 15%|█████▎                               | 3267/22500 [00:26<02:50, 112.72it/s]
 15%|█████▍                               | 3279/22500 [00:26<02:55, 109.67it/s]
 15%|█████▍                               | 3295/22500 [00:26<02:38, 121.17it/s]
 15%|█████▍                               | 3309/22500 [00:26<02:34, 124.40it/s]
 15%|█████▍                               | 3323/22500 [00:26<02:32, 125.45it/s]
 15%|█████▍                               | 3336/22500 [00:26<02:43, 117.40it/s]
 15%|█████▌                               | 3348/22500 [00:26<02:48, 113.61it/s]
 15%|█████▌                               | 3360/22500 [00:26<02:57, 107.85it/s]
 15%|█████▌                               | 3372/22500 [00:26<02:56, 108.47it/s]
 15%|█████▌                               | 3383/22500 [00:27<02:57, 107.78it/s]
 15%|█████▌                               | 3394/22500 [00:27<03:02, 104.50it/s]
 15%|█████▊                                | 3405/22500 [00:27<03:11, 99.89it/s]
 15%|█████▌                               | 3420/22500 [00:27<02:53, 110.20it/s]
 15%|█████▋                               | 3434/22500 [00:27<02:42, 117.04it/s]
 15%|█████▋                               | 3446/22500 [00:27<02:42, 117.23it/s]
 15%|█████▋                               | 3469/22500 [00:27<02:11, 145.16it/s]
 15%|█████▋                               | 3484/22500 [00:27<02:30, 126.20it/s]
 16%|█████▊                               | 3498/22500 [00:27<02:35, 122.45it/s]
 16%|█████▊                               | 3511/22500 [00:28<02:48, 112.66it/s]
 16%|█████▊                               | 3523/22500 [00:28<02:55, 108.04it/s]
 16%|█████▊                               | 3535/22500 [00:28<02:54, 108.84it/s]
 16%|█████▊                               | 3547/22500 [00:28<02:59, 105.34it/s]
 16%|█████▊                               | 3563/22500 [00:28<02:38, 119.55it/s]
 16%|█████▉                               | 3576/22500 [00:28<02:43, 115.66it/s]
 16%|█████▉                               | 3590/22500 [00:28<02:35, 121.37it/s]
 16%|█████▉                               | 3604/22500 [00:28<02:30, 125.62it/s]
 16%|█████▉                               | 3617/22500 [00:29<02:42, 116.30it/s]
 16%|█████▉                               | 3629/22500 [00:29<02:50, 110.99it/s]
 16%|█████▉                               | 3641/22500 [00:29<02:53, 108.65it/s]
 16%|██████                               | 3652/22500 [00:29<02:59, 105.11it/s]
 16%|██████▏                               | 3663/22500 [00:29<03:08, 99.73it/s]
 16%|██████                               | 3670/22500 [00:29<02:31, 123.98it/s]

---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
/var/folders/3d/s79lt6q910z0whw_7c_z48lc0000gp/T/ipykernel_19426/1881718386.py in <module>
     17 
     18 # est_SK is the estimate and var_SK is the variance
---> 19 est_SK, var_SK = gs.Interpolation.skrige(Pred_grid_xy, df_grid, 'X', 'Y', 'Nbed', k, vario, rad)

~/opt/anaconda3/lib/python3.9/site-packages/gstatsim.py in skrige(prediction_grid, df, xx, yy, zz, num_points, vario, radius)
    747                 # covariance between data points
    748                 covariance_matrix.reshape(((new_num_pts)), ((new_num_pts)))
--> 749                 k_weights, res, rank, s = np.linalg.lstsq(covariance_matrix, 
    750                                                           covariance_array, rcond = None)
    751 

<__array_function__ internals> in lstsq(*args, **kwargs)

~/opt/anaconda3/lib/python3.9/site-packages/numpy/linalg/linalg.py in lstsq(a, b, rcond)
   2304         # lapack can't handle n_rhs = 0 - so allocate the array one larger in that axis
   2305         b = zeros(b.shape[:-2] + (m, n_rhs + 1), dtype=b.dtype)
-> 2306     x, resids, rank, s = gufunc(a, b, rcond, signature=signature, extobj=extobj)
   2307     if m == 0:
   2308         x[...] = 0

KeyboardInterrupt: 

Here, k is the number of conditioning nodes used to estimate a grid cell. This means that each estimate will be informed the by k nearby measurements within a specified search radius. The search radius should be at least as large as the largest measurement gap. If you’re getting errors, it is usually because the radius is too small. Generally speaking, increasing the search radius and number of conditioning nodes improves the simulation quality.

The simulation is applied to the transformed data, so a reverse normal score transformation must be applied to recover the original distribution.

# reverse normal score transformation

var_SK[var_SK < 0] = 0      # make sure variances are non-negative
std_SK = np.sqrt(var_SK)    # convert to standard deviation before back transforming

# reshape
est = est_SK.reshape(-1,1)
std = std_SK.reshape(-1,1)

# back transformation
spred_trans = nst_trans.inverse_transform(est)
sstd_trans = nst_trans.inverse_transform(std)
sstd_trans = sstd_trans - np.min(sstd_trans)
# plot simple kriging and standard deviation
fig = plt.figure()
plt.subplot(121) 
im = plt.scatter(Pred_grid_xy[:,0],Pred_grid_xy[:,1], c = spred_trans, vmin = -400, vmax = 600, marker=".", s = 50,
                cmap = 'gist_earth')       # scatter plot for location map
plt.title('Simple kriging')                                    # add plot title
plt.xlabel('X [m]'); plt.ylabel('Y [m]')                          # set axis labels
plt.locator_params(nbins=5)# set axis labels
cbar = plt.colorbar(im, orientation="vertical", ticks=np.linspace(-400, 600, 11)) # add vertical color bar
cbar.set_label("Bed elevation [m]", rotation=270, labelpad=20)      # add labels to the color bar
plt.subplots_adjust(left=0.0, bottom=0.0, right=1.0, top=1.0) # adjust the plot size
plt.axis('scaled')

plt.subplot(122) 
im = plt.scatter(Pred_grid_xy[:,0],Pred_grid_xy[:,1], c = sstd_trans, vmin = 0, vmax = 150, marker=".", s = 50,
                cmap = 'magma')       # scatter plot for location map
plt.title('Standard deviation')                                    # add plot title
plt.xlabel('X [m]'); plt.ylabel('Y [m]')                          # set axis labels
plt.locator_params(nbins=5)# set axis labels
cbar = plt.colorbar(im, orientation="vertical", ticks=np.linspace(0, 250, 6)) # add vertical color bar
cbar.set_label("Standard deviation [m]", rotation=270, labelpad=20)      # add labels to the color bar
plt.subplots_adjust(left=0.0, bottom=0.0, right=1.0, top=1.0) # adjust the plot size
plt.axis('scaled')

plt.subplots_adjust(left=0.0, bottom=0.0, right=2.0, top=1.0, wspace=0.2, hspace=0.3)
plt.show()
_images/3_Simple_kriging_and_ordinary_kriging_13_0.png
# make hillshade for visualizing
vmin = -400; vmax = 600
x_mat = Pred_grid_xy[:,0].reshape((rows, cols))
y_mat = Pred_grid_xy[:,1].reshape((rows, cols))
mat = spred_trans.reshape((rows, cols))
xmin = Pred_grid_xy[:,0].min(); xmax = Pred_grid_xy[:,0].max()
ymin = Pred_grid_xy[:,1].min(); ymax = Pred_grid_xy[:,1].max()
    
cmap=plt.get_cmap('gist_earth')
fig, ax = plt.subplots(1, figsize=(5,5))
im = ax.pcolormesh(x_mat, y_mat, mat, vmin=vmin, vmax=vmax, cmap=cmap)
    
# Shade from the northeast, with the sun 45 degrees from horizontal
ls = LightSource(azdeg=45, altdeg=45)
        
# leaving the dx and dy as 1 means a vertical exageration equal to dx/dy
hillshade = ls.hillshade(mat, vert_exag=1, dx=1, dy=1, fraction=1.0)
plt.pcolormesh(x_mat, y_mat, hillshade, cmap='gray', alpha=0.1)
plt.title('Simple kriging')
plt.xlabel('X [m]'); plt.ylabel('Y [m]')
plt.xticks(np.linspace(xmin, xmax, 5))
plt.yticks(np.linspace(ymin, ymax, 5))

# make colorbar
divider = make_axes_locatable(ax)
cax = divider.append_axes('right', size='5%', pad=0.1)
cbar = plt.colorbar(im, ticks=np.linspace(-400, 600, 11), cax=cax)
cbar.set_label("Bed elevation [m]", rotation=270, labelpad=15)
ax.axis('scaled')
plt.show()
_images/3_Simple_kriging_and_ordinary_kriging_14_0.png

The downside of simple kriging is that it assumes that all the data points are a residual from a constant mean across the area. This could give us a biased estimate, especially if there are differences in data density at different elevations. Let’s try ordinary kriging, where the mean is unknown.

Ordinary kriging#

Ordinary kriging (OK) uses a locally varying mean. This makes OK more robust to trends.

k = 100
est_OK, var_OK = gs.Interpolation.okrige(Pred_grid_xy, df_grid, 'X', 'Y', 'Nbed', k, vario, rad) # estimation and variance

# reverse normal score transformation
var_OK[var_OK < 0] = 0; # make sure variances are non-negative
std_OK = np.sqrt(var_OK) # convert to standard deviation (this should be done before back transforming!!!)

# reshape
est = est_OK.reshape(-1,1)
std = std_OK.reshape(-1,1)

# back transformation
pred_trans = nst_trans.inverse_transform(est)
std_trans = nst_trans.inverse_transform(std)
std_trans = std_trans - np.min(std_trans)
100%|████████████████████████████████████| 22500/22500 [03:00<00:00, 124.71it/s]
# plot ordinary kriging and stnadard deviation
fig = plt.figure()
plt.subplot(121) 
im = plt.scatter(Pred_grid_xy[:,0],Pred_grid_xy[:,1], c = pred_trans, vmin = -400, vmax = 600, marker=".", s = 50,
                cmap = 'gist_earth')       # scatter plot for location map
plt.title('Ordinary kriging')                                    # add plot title
plt.xlabel('X [m]'); plt.ylabel('Y [m]')                          # set axis labels
plt.locator_params(nbins=5)# set axis labels
cbar = plt.colorbar(im, orientation="vertical", ticks=np.linspace(-400, 600, 11)) # add vertical color bar
cbar.set_label("Bed elevation [m]", rotation=270, labelpad=20)      # add labels to the color bar
plt.subplots_adjust(left=0.0, bottom=0.0, right=1.0, top=1.0) # adjust the plot size
plt.axis('scaled')

plt.subplot(122) 
im = plt.scatter(Pred_grid_xy[:,0],Pred_grid_xy[:,1], c = std_trans, vmin = 0, vmax = 150, marker=".", s = 50,
                cmap = 'magma')       # scatter plot for location map
plt.title('Standard deviation')                                    # add plot title
plt.xlabel('X [m]'); plt.ylabel('Y [m]')                          # set axis labels
plt.locator_params(nbins=5)# set axis labels
cbar = plt.colorbar(im, orientation="vertical", ticks=np.linspace(0, 250, 6)) # add vertical color bar
cbar.set_label("Standard deviation [m]", rotation=270, labelpad=20)      # add labels to the color bar
plt.subplots_adjust(left=0.0, bottom=0.0, right=1.0, top=1.0) # adjust the plot size
plt.axis('scaled')

plt.subplots_adjust(left=0.0, bottom=0.0, right=2.0, top=1.0, wspace=0.2, hspace=0.3)
plt.show()
_images/3_Simple_kriging_and_ordinary_kriging_17_0.png
# make hillshade for visualizing
vmin = -400; vmax = 600
x_mat = Pred_grid_xy[:,0].reshape((rows, cols))
y_mat = Pred_grid_xy[:,1].reshape((rows, cols))
mat = pred_trans.reshape((rows, cols))
xmin = Pred_grid_xy[:,0].min(); xmax = Pred_grid_xy[:,0].max()
ymin = Pred_grid_xy[:,1].min(); ymax = Pred_grid_xy[:,1].max()
    
cmap=plt.get_cmap('gist_earth')
fig, ax = plt.subplots(1, figsize=(5,5))
im = ax.pcolormesh(x_mat, y_mat, mat, vmin=vmin, vmax=vmax, cmap=cmap)
    
# Shade from the northeast, with the sun 45 degrees from horizontal
ls = LightSource(azdeg=45, altdeg=45)
        
# leaving the dx and dy as 1 means a vertical exageration equal to dx/dy
hillshade = ls.hillshade(mat, vert_exag=1, dx=1, dy=1, fraction=1.0)
plt.pcolormesh(x_mat, y_mat, hillshade, cmap='gray', alpha=0.1)
plt.title('Ordinary kriging')
plt.xlabel('X [m]'); plt.ylabel('Y [m]')
plt.xticks(np.linspace(xmin, xmax, 5))
plt.yticks(np.linspace(ymin, ymax, 5))

# make colorbar
divider = make_axes_locatable(ax)
cax = divider.append_axes('right', size='5%', pad=0.1)
cbar = plt.colorbar(im, ticks=np.linspace(-400, 600, 11), cax=cax)
cbar.set_label("Bed elevation [m]", rotation=270, labelpad=15)
ax.axis('scaled')
plt.show()
_images/3_Simple_kriging_and_ordinary_kriging_18_0.png

Here, the righthand side of the ordinary kriging output is higher elevation than the simple kriging map. This is because ordinary kriging accounts for the large-scale trend of increasing elevation from left to right. However, there appear to be some artifacts in this region as well. This is because this area does not have many measurements, making it difficult to reliably estimate the local mean. This issue could be improved by increasing the search radius.

Difference between simple kriging and ordinary kriging#

# plot difference
fig = plt.figure()
im = plt.scatter(Pred_grid_xy[:,0],Pred_grid_xy[:,1], c = spred_trans - pred_trans, vmin = -50, vmax = 50, 
                 marker=".", s = 50, cmap = 'seismic')       # scatter plot for location map
plt.title('Simple kriging minus ordinary kriging')                                    # add plot title
plt.xlabel('X [m]'); plt.ylabel('Y [m]')                          # set axis labels
plt.locator_params(nbins=5)# set axis labels
cbar = plt.colorbar(im, orientation="vertical", ticks=np.linspace(-50, 50, 11), extend='min') # add vertical color bar
cbar.set_label("Difference [m]", rotation=270, labelpad=20)      # add labels to the color bar
plt.subplots_adjust(left=0.0, bottom=0.0, right=1.0, top=1.0) # adjust the plot size
plt.axis('scaled')
plt.show()
_images/3_Simple_kriging_and_ordinary_kriging_20_0.png

The differences tend to be the most pronounced in areas that are not near conditioning points.

Kriging roughness#

You’ll also notice that parts of kriging outputs look quite smooth. This is because kriging performs interpolations by averaging measurements. Let’s compare the variograms of the data and the kriging results to see how they differ.

Notice that we’re using the simple kriging values prior to back transformation because the experimental variogram was computed on normalized data.

# compute simple kriging variogram
coords_s = Pred_grid_xy
values_s = est_SK

VS = skg.Variogram(coords_s, values_s, bin_func='even', n_lags=n_lags, 
                   maxlag=maxlag, normalize=False)
# experimental variogram (from beginning of script)
xe = V1.bins
ye = V1.experimental

# simple kriging variogram
xs = VS.bins
ys = VS.experimental

plt.figure(figsize=(6,4))
plt.plot(xe,ye,'og', markersize=4, label='Bed data variogram')
plt.plot(xs,ys,'ob', markersize=4, label='Simple kriging variogram')
plt.title('Variogram comparison')
plt.xlabel('Lag [m]'); plt.ylabel('Semivariance')  
plt.legend(loc='upper left')
plt.show()
_images/3_Simple_kriging_and_ordinary_kriging_23_0.png

We can see that the kriging variogram is lower variance or smoother than the data. This is one of the main drawbacks of kriging - it cannot reproduce the statistical properties of observations.

Download the tutorial here.