Error in coefficient of determination of variogram fit. #205

donhalmina · 2021-08-24T10:03:25Z

Hello

According to the code, you can compute the coefficient of determination (r2) to compare the fit of the theoretical covariance model with the experimental semivariogram.

para, pcov, r2 = fit_model.fit_variogram(bin_center, gamma, return_r2=True)

However, this is wrong, as r2 is meant for linear regression and the covariance functions are not linear. This makes the r2 not credible.
I suggest using a different goodness-of-fit criterion, such as the "standard error of regression", instead of r2.

Thanks.


LSchueler · 2021-08-24T11:05:39Z

Hi,
thanks for pointing that out, you are completely right.
Would you be willing to implement a better criterion and create a PR?

MuellerSeb · 2021-08-25T16:47:41Z

Hi there!
Thanks for pointing this out. Since I implemented this, let me add my two cents:

A simple definition for the pseudo-R2 score can be given by:

$$R^2_{\text{pseudo}} = 1 - \frac{D(y, \hat{y})}{D(y, \bar{y})}$$

(https://timeseriesreasoning.com/contents/r-squared-adjusted-r-squared-pseudo-r-squared/)
Here "D" stands for deviance, $\hat{y}$ for the values predicted by the fitted model and $\bar{y}$ for the mean of the data. The pseudo-R2 (used for non-linear regressions) is usually used with Maximum Likelihood Estimation, where the deviance could be defined as the log-likelihood, resulting in the McFadden R2 score.

If we just define the deviance as the sum of squared deviations, we arrive at the formula for the classical R2. In this context, the R2 score tells us how much better the fitted model is compared to a nugget model set to the mean of the estimated variogram values. I would argue that this information is quite useful, but you are right that we don't provide any justification for it, although it is obviously not a linear regression. This could be a nice little research project @LSchueler
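A minimal sketch of this reading, in plain Python and not the GSTools implementation: with the deviance taken as the sum of squared deviations, the score compares the fitted model against a constant (pure nugget) model at the mean of the estimated variogram values. The function name `pseudo_r2` and the sample numbers are made up for illustration.

```python
def pseudo_r2(gamma, gamma_fit):
    """Pseudo-R2 with the deviance defined as a sum of squared deviations.

    gamma: estimated (experimental) variogram values
    gamma_fit: values of the fitted model at the same bin centers
    """
    if len(gamma) != len(gamma_fit):
        raise ValueError("gamma and gamma_fit must have the same length")
    mean = sum(gamma) / len(gamma)
    # deviance of the fitted model
    dev_fit = sum((g - f) ** 2 for g, f in zip(gamma, gamma_fit))
    # deviance of the reference model: a constant at the mean of gamma
    dev_mean = sum((g - mean) ** 2 for g in gamma)
    return 1.0 - dev_fit / dev_mean


# hypothetical variogram estimates and model values, for illustration only
gamma = [0.2, 0.5, 0.8, 0.9, 1.0]
gamma_fit = [0.25, 0.5, 0.75, 0.9, 1.0]
mean_model = [sum(gamma) / len(gamma)] * len(gamma)

print(pseudo_r2(gamma, gamma_fit))   # close to 1: much better than the mean model
print(pseudo_r2(gamma, mean_model))  # 0: no better than the mean model
```

A score of 1 means a perfect fit, 0 means the model is no better than the constant mean model, and negative values mean it is worse. The sketch does not guard against constant `gamma`, where the reference deviance is zero.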

Nonetheless, we could provide other scores. "Standard error of regression" is a good start. The standard error of regression is also very similar to the formula of the pseudo-R2 score shown above. The only difference is that the sum of squared deviations is divided by the number of data points (and not by the deviance from the mean), and you take the square root to get the same unit as the input data. This means the results from our example on this (link) should stay the same 😉
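As a rough sketch of that score in plain Python (not an existing GSTools function): the sum of squared deviations is divided by the number of data points and the square root is taken. The name `standard_error` and the optional `n_params` argument are my own assumptions; many textbooks divide by the number of data points minus the number of fitted parameters, which `n_params` allows.

```python
import math


def standard_error(gamma, gamma_fit, n_params=0):
    """Root of the mean squared deviation between estimates and model.

    n_params is a hypothetical extra: set it to the number of fitted
    parameters to divide by the degrees of freedom instead of n.
    """
    if len(gamma) != len(gamma_fit):
        raise ValueError("gamma and gamma_fit must have the same length")
    dof = len(gamma) - n_params
    if dof <= 0:
        raise ValueError("need more data points than fitted parameters")
    # sum of squared deviations between estimates and fitted model
    ss = sum((g - f) ** 2 for g, f in zip(gamma, gamma_fit))
    return math.sqrt(ss / dof)


# hypothetical variogram estimates and model values, for illustration only
gamma = [0.2, 0.5, 0.8, 0.9, 1.0]
gamma_fit = [0.25, 0.5, 0.75, 0.9, 1.0]

print(standard_error(gamma, gamma_fit))              # divided by n
print(standard_error(gamma, gamma_fit, n_params=2))  # divided by n - 2
```

Unlike the R2 score, the result has the same unit as the variogram values, so smaller is better and there is no fixed upper bound.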

Cheers,
Sebastian

LSchueler added the bug label Aug 24, 2021

MuellerSeb self-assigned this Aug 25, 2021

MuellerSeb added the Documentation, enhancement, and question labels and removed the bug label Aug 25, 2021

MuellerSeb added this to the 2.0 milestone Oct 20, 2021
