Error in coefficient of determination of variogram fit. #205

Open
donhalmina opened this issue Aug 24, 2021 · 2 comments
@donhalmina

Hello

According to the code, you can compute the coefficient of determination (r2) to assess how well a theoretical covariance model fits the experimental semivariogram:

para, pcov, r2 = fit_model.fit_variogram(bin_center, gamma, return_r2=True)

However, this is wrong: r2 is defined for linear regression, and the covariance functions are not linear, so the r2 is not credible here.
I suggest using a different goodness-of-fit criterion, such as the "standard error of regression", instead of r2.

Thanks.

@LSchueler
Member

Hi,
thanks for pointing that out, you are completely right.
Would you be willing to implement a better criterion and create a PR?

@LSchueler LSchueler added the bug Something isn't working label Aug 24, 2021
@MuellerSeb
Member

Hi there!
Thanks for pointing this out. Since I implemented this, let me add my two cents:

A simple definition for the pseudo-R2 score can be given by:
$$R^2_{\text{pseudo}} = 1 - \frac{D(y, \hat{y})}{D(y, \bar{y})}$$
(https://timeseriesreasoning.com/contents/r-squared-adjusted-r-squared-pseudo-r-squared/)
Here "D" stands for the deviance, $y$ for the observed values, $\hat{y}$ for the fitted values, and $\bar{y}$ for the mean of the observed values. The pseudo-R2 (used for non-linear regressions) is usually applied with Maximum Likelihood Estimation, where the deviance can be defined via the log-likelihood, which results in the McFadden R2 score.

If we just define the deviance as the sum of squared deviations, we recover the formula for the classical R2. In this context, the R2 score tells us how much better the fitted model is compared to a pure nugget model set to the mean of the estimated variogram values. I would argue that this information is quite useful, but you are right that we don't provide any justification for it, although the fit is obviously not a linear regression. This could be a nice little research topic @LSchueler
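To make the comparison with the mean-valued nugget model concrete, here is a minimal sketch in plain Python. It is not the GSTools implementation; the function names and the sample values are made up for illustration, and the deviance is assumed to be the sum of squared deviations.

```python
def sum_of_squares(observed, predicted):
    """Deviance taken as the sum of squared deviations (an assumption)."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def pseudo_r2(observed, fitted):
    """Return 1 - D(observed, fitted) / D(observed, mean of observed)."""
    mean = sum(observed) / len(observed)
    # Baseline: a constant model set to the mean of the observed values,
    # i.e. the "nugget model" mentioned above.
    baseline = [mean] * len(observed)
    return 1.0 - sum_of_squares(observed, fitted) / sum_of_squares(observed, baseline)


# Hypothetical semivariogram estimates and fitted model values.
gamma = [0.2, 0.5, 0.8, 0.9, 1.0]
fitted = [0.25, 0.5, 0.75, 0.9, 1.0]
score = pseudo_r2(gamma, fitted)
```

A score near 1 means the fitted model leaves far less squared deviation than the constant baseline; a score of 0 means it does no better than the baseline.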

Nonetheless, we could provide other scores. "Standard error of regression" is a good start. Its formula is also very similar to that of the pseudo-R2 score shown above. The only differences are that the sum of squared deviations is divided by the number of data points (and not by the deviance from the mean) and that you take the square root to get the same unit as the input data. Since both scores are driven by the same sum of squared deviations, the results from our example on this (link) should stay the same 😉
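The score described above could be sketched as follows. This is an illustration only, not an existing GSTools function: `standard_error` and its `n_params` argument are hypothetical names. With the default `n_params=0` it divides by the number of data points, as described in this comment; passing the number of fitted parameters gives the degrees-of-freedom variant that is also commonly used.

```python
import math


def standard_error(observed, fitted, n_params=0):
    """Square root of the mean squared deviation between data and fit.

    n_params is a hypothetical option: 0 divides by the number of data
    points, a positive value divides by (data points - fitted parameters).
    """
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    dof = len(observed) - n_params
    if dof <= 0:
        raise ValueError("need more data points than fitted parameters")
    return math.sqrt(ss_res / dof)


# Hypothetical semivariogram estimates and two candidate model fits.
gamma = [0.2, 0.5, 0.8, 0.9, 1.0]
fit_a = [0.25, 0.5, 0.75, 0.9, 1.0]
fit_b = [0.4, 0.6, 0.7, 0.8, 0.9]

error_a = standard_error(gamma, fit_a)
error_b = standard_error(gamma, fit_b)
```

For a fixed set of semivariogram values, a smaller sum of squared deviations gives both a smaller standard error and a larger R2, so the two scores order candidate fits the same way as long as the same divisor is used for each.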

Cheers,
Sebastian

@MuellerSeb MuellerSeb self-assigned this Aug 25, 2021
@MuellerSeb MuellerSeb added Documentation enhancement New feature or request question Further information is requested and removed bug Something isn't working labels Aug 25, 2021
@MuellerSeb MuellerSeb added this to the 2.0 milestone Oct 20, 2021