HER method adapted to handle categorical data and probability maps, including sequential simulation (HERs).
The code and datasets accompany the study by Thiesen and Ehret (2021):
Thiesen, S.; Ehret, U. Assessing local and spatial uncertainty with nonparametric geostatistics, Stoch. Environ. Res. Risk. Assess. [preprint], https://doi.org/10.21203/rs.3.rs-272229/v1, in review, 2021.
The HER method comes with ABSOLUTELY NO WARRANTY. You are welcome to modify and redistribute it under the terms of the license agreement. The HER method is published under the Creative Commons "CC-BY-4.0" license together with a ready-to-use sample dataset. To view the full license agreement, please visit CC-BY-4.0.
- MATLAB (tested on 2018b).
See HER.m
- HER script ... .m
- functions/ ... .m
- datasets/ ... .mat
The main script (sHER_E_0X_Y_Jura_struct.m) is divided into the following sections:
1. Load dataset
Loads the dataset.
2. Define infogram and HER3 properties
Defines the infogram properties, the aggregation method, and the z threshold (optional).
3. HER1: Spatial characterization
Extracts spatial correlation patterns (see f_her_infogram.m).
4. HER2: Weight optimization
Optimizes weights for the aggregation method based on entropy minimization (see f_her_weight.m).
5. HER3: z PMF prediction
Applies spatial characterization and optimal weights for PMF prediction (see f_her_predict.m).
6. HERs4: Sequential Simulation
Applies spatial characterization and optimal weights for PMF prediction (see f_her_predict.m).
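The weight optimization step refers to entropy. As a minimal illustration only (not the code in f_her_weight.m), the sketch below computes the Shannon entropy, in bits, of a discrete PMF. The variable names and the example PMFs are hypothetical; the point is simply that a sharper PMF has lower entropy than a uniform one.

```matlab
% Shannon entropy (in bits) of a discrete PMF.
% Zero-probability bins are skipped so that 0*log2(0) is treated as 0.
pmf_entropy = @(p) -sum(p(p > 0) .* log2(p(p > 0)));

% Hypothetical PMFs over four z bins (each sums to 1).
pmf_uniform = [0.25 0.25 0.25 0.25];   % least informative
pmf_peaked  = [0.70 0.10 0.10 0.10];   % more informative

H_uniform = pmf_entropy(pmf_uniform);  % log2(4) bits
H_peaked  = pmf_entropy(pmf_peaked);   % lower than H_uniform
```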
The sHER_E_0304_Extract_pmf_statistics_performance_and_plot.m script extracts PMF statistics, calculates performance metrics, and plots maps. It is composed of:
1. Extract PMF statistics
Obtains the mean, median, mode, and probability of a z threshold (optional) of the predicted z PMFs and plots the results.
2. Calculate performance metrics
Calculates the Root Mean Square Error (RMSE), Mean Error (ME), Mean Absolute Error (MAE), Nash-Sutcliffe model efficiency, and scoring rule (DKL) for the validation set.
3. Plot maps
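The statistics and metrics listed above can be sketched on a toy example. This is an illustration under stated assumptions, not the code of the script itself: the PMFs, bin centers, observations, and threshold are made up, the PMF mean is used as the point prediction, the error is taken as predicted minus observed, and DKL is computed by treating each observation as a PMF with all its mass in one bin (in which case DKL reduces to -log2 of the probability predicted for that bin).

```matlab
% Hypothetical predicted PMFs for 3 validation points over 5 z bins (rows sum to 1).
bin_centers = [1 2 3 4 5];           % assumed bin centers
pmf = [0.1 0.2 0.4 0.2 0.1;
       0.0 0.1 0.2 0.3 0.4;
       0.5 0.3 0.1 0.1 0.0];
z_obs    = [3; 5; 2];                % assumed observations at the validation points
bin_obs  = [3; 5; 2];                % index of the bin containing each observation
z_thresh = 3.5;                      % assumed (optional) z threshold

% 1. PMF statistics
pmf_mean = pmf * bin_centers';                       % expected value per point
[~, idx_mode] = max(pmf, [], 2);                     % most probable bin
pmf_mode = bin_centers(idx_mode)';
idx_median = sum(cumsum(pmf, 2) < 0.5, 2) + 1;       % first bin where CDF >= 0.5
pmf_median = bin_centers(idx_median)';
p_exceed = sum(pmf(:, bin_centers > z_thresh), 2);   % probability of exceeding z_thresh

% 2. Performance metrics (PMF mean used as point prediction, an assumption)
err  = pmf_mean - z_obs;                             % predicted minus observed
ME   = mean(err);
MAE  = mean(abs(err));
RMSE = sqrt(mean(err.^2));
NSE  = 1 - sum(err.^2) / sum((z_obs - mean(z_obs)).^2);

% Scoring rule: mean DKL in bits, observation treated as a one-bin PMF
n     = numel(z_obs);
p_obs = pmf(sub2ind(size(pmf), (1:n)', bin_obs));    % predicted probability of the observed bin
DKL   = mean(-log2(p_obs));
```

Note that DKL becomes infinite if a PMF assigns zero probability to the observed bin, so predicted PMFs are usually required to be nonzero everywhere before scoring.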
The functions are documented in their own source code. Examples of how to use them are available in the sHER.m script.
The f_plot_probabilitymap.m and f_plot functions were specifically built for the dataset of the study.
Each dataset file contains:
- idx_rand_full: index of the randomly shuffled data (same for all files)
- sample_size: all calibration set sizes available for the dataset (same for all files)
- data: matrix with z values of the full generated dataset
- txt: dataset type (SR0, SR1, LR0, LR1)
- idx_cal: index of the calibration set
- idx_val: index of the validation set
- idx_test: index of the test set
- x: matrix with x coordinates of the full dataset
- x_cal: vector with x coordinates of the calibration set (x_cal=x(idx_cal))
- x_val: vector with x coordinates of the validation set (x_val=x(idx_val))
- x_test: vector with x coordinates of the test set (x_test=x(idx_test))
- y: matrix with y coordinates of the full dataset
- y_cal: vector with y coordinates of the calibration set (y_cal=y(idx_cal))
- y_val: vector with y coordinates of the validation set (y_val=y(idx_val))
- y_test: vector with y coordinates of the test set (y_test=y(idx_test))
- z: matrix with z values of the full generated dataset (z=data)
- z_cal: vector with z values of the calibration dataset (z_cal=z(idx_cal))
- z_val: vector with z values of the validation dataset (z_val=z(idx_val))
- z_test: vector with z values of the test dataset (z_test=z(idx_test))
- dim_cal: size of the calibration set (dim_cal=length(idx_cal))
- dim_val: size of the validation set (dim_val=length(idx_val))
- dim_test: size of the test set (dim_test=length(idx_test))
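The relationships between the variables listed above can be illustrated with a small synthetic stand-in. This is a sketch only: the grid size, the placeholder z field, the 60/20/20 split, and the way the subsets are drawn from idx_rand_full are all assumptions made for the example, not properties of the provided .mat files.

```matlab
% Synthetic stand-in for one dataset file (values are illustrative only).
[x, y] = meshgrid(1:10, 1:10);        % x and y coordinate matrices of a 10-by-10 grid
z    = x + y;                         % placeholder z field
data = z;                             % z = data, as in the list above

% Shuffle all linear indices once, then split (split sizes are assumed).
idx_rand_full = randperm(numel(z));
idx_cal  = idx_rand_full(1:60);
idx_val  = idx_rand_full(61:80);
idx_test = idx_rand_full(81:100);

% Subset vectors follow directly from the indices.
x_cal  = x(idx_cal);   y_cal  = y(idx_cal);   z_cal  = z(idx_cal);
x_val  = x(idx_val);   y_val  = y(idx_val);   z_val  = z(idx_val);
x_test = x(idx_test);  y_test = y(idx_test);  z_test = z(idx_test);

% Subset sizes.
dim_cal  = length(idx_cal);
dim_val  = length(idx_val);
dim_test = length(idx_test);
```

With a real file, the same consistency checks (for example isequal(z_cal, z(idx_cal))) can be run after loading it to confirm that the variables follow this layout.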
Stephanie Thiesen | [email protected]
Uwe Ehret | [email protected]