Being able to grid XYC categorical data (where C is the categorical feature) would be very useful.
I can think of at least two use cases:
LABELED DATA
1.1. Gridding of categorical geological data: for example, a well performance classification that is a category (high, medium, low) rather than a production number, or a fracture intensity or other rock quality classification. Cross-validation would be important. In the case of well performance, it would be nice to have the option of block cross-validation, since wells are often drilled in clusters where the reservoir is relatively uniform within a cluster (intra-area) but not necessarily homogeneous between clusters (inter-area).
1.2. Gridding of geological facies. This is often done in the context of 3D geocellular modelling, but having a 2D implementation in Python would be great, with options both for cross-validation and for using weights (if facies probabilities are available).
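To make the block cross-validation idea in 1.1 concrete, here is a minimal sketch using only NumPy. The function name `block_kfold_indices`, the square-block layout, and the toy well coordinates are all assumptions made for illustration; this is not an existing Verde function. The point is only that every well in a spatial block lands in the same fold, so a cluster is never split between training and testing.

```python
import numpy as np


def block_kfold_indices(easting, northing, spacing, n_splits=3, seed=0):
    """Yield (train, test) index arrays that hold out whole spatial blocks.

    Hypothetical helper for illustration only.
    """
    easting = np.asarray(easting, dtype=float)
    northing = np.asarray(northing, dtype=float)
    # Label each point with the square block (side = spacing) it falls in.
    col = np.floor((easting - easting.min()) / spacing).astype(int)
    row = np.floor((northing - northing.min()) / spacing).astype(int)
    block_id = row * (col.max() + 1) + col
    # Shuffle the occupied blocks and deal them out into folds.
    blocks = np.unique(block_id)
    rng = np.random.default_rng(seed)
    rng.shuffle(blocks)
    for held_out in np.array_split(blocks, n_splits):
        test_mask = np.isin(block_id, held_out)
        yield np.flatnonzero(~test_mask), np.flatnonzero(test_mask)


# Toy data: three tight clusters of three wells each (made-up coordinates).
easting = np.array([0, 1, 2, 50, 51, 52, 100, 101, 102], dtype=float)
northing = np.array([0, 1, 0, 50, 51, 50, 0, 1, 0], dtype=float)
folds = list(block_kfold_indices(easting, northing, spacing=10, n_splits=3))
```

With a block size larger than a cluster but smaller than the gap between clusters, each fold tests on one complete cluster and trains on the other two, which measures inter-area rather than intra-area performance.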
UNLABELED DATA
I am thinking here of numerical categories, such as the output of clustering done with a Gaussian Mixture Model. Data would be in XYCP format, where P is the probability output, and it would be great to be able to grid it using the probability as a weight. In this case cross-validation would not be possible because there is no label to use as ground truth.
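One possible reading of "using the probability as a weight" is a weighted vote among the nearest samples. The sketch below assumes that reading: each of the `k` nearest XYCP samples votes for its category C, with the vote scaled by P and divided by distance. The function name `weighted_vote_grid`, the inverse-distance factor, and the toy data are assumptions for illustration, not an existing API.

```python
import numpy as np


def weighted_vote_grid(x, y, category, probability, grid_x, grid_y, k=2):
    """Assign each grid node the category with the largest weighted vote.

    Hypothetical helper: vote weight = probability / distance.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    category = np.asarray(category)
    probability = np.asarray(probability, dtype=float)
    gx, gy = np.meshgrid(grid_x, grid_y)
    nodes = np.column_stack([gx.ravel(), gy.ravel()])
    # Distance from every node to every sample: shape (n_nodes, n_samples).
    dist = np.hypot(nodes[:, [0]] - x, nodes[:, [1]] - y)
    nearest = np.argsort(dist, axis=1)[:, :k]
    rows = np.arange(nodes.shape[0])[:, None]
    # Small constant avoids division by zero when a node sits on a sample.
    weight = probability[nearest] / (dist[rows, nearest] + 1e-12)
    classes = np.unique(category)
    votes = np.zeros((nodes.shape[0], classes.size))
    for j, c in enumerate(classes):
        votes[:, j] = np.sum(weight * (category[nearest] == c), axis=1)
    return classes[np.argmax(votes, axis=1)].reshape(gx.shape)


# Toy XYCP data (made up): two clusters labelled 0 and 1.
x = [0, 0, 10, 10]
y = [0, 10, 0, 10]
c = [0, 0, 1, 1]
p = [0.9, 0.8, 0.9, 0.7]
grid = weighted_vote_grid(x, y, c, p, grid_x=[1, 9], grid_y=[1, 9], k=2)
```

A low-probability sample therefore has less influence on nearby nodes than a confident one at the same distance.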
Are you willing to help implement and maintain this feature? Yes/No
No, in the sense that I would not be available for coding, but I would definitely be interested and available as a tester.
@mycarta that's an interesting use case. This might be a bit challenging because we're then getting into spatial prediction of things that aren't well represented by a surface under a load. So it's likely that the best predictors wouldn't be the coordinates of the points. Instead, you'd likely want to use other features. This is related to #188 by @fmaussion. I understand the use case better now and might be able to form ideas on a possible implementation.
So what we would need is a way to wrap a scikit-learn estimator into a Verde gridder. This shouldn't be too hard. The assumption would be that the feature matrix is a column stack of the given "coordinates". See #268. I think that could be a general solution for this.
Having the estimator wrapped by a gridder would allow use of any of our cross-validation tools.
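A rough sketch of what such a wrapper could look like, under the stated assumption that the feature matrix is a column stack of the coordinates. The class name `EstimatorGridder` is hypothetical, and `NearestLabel` is a tiny stand-in estimator written here only so the example runs without scikit-learn installed. Any estimator with compatible `fit(X, y)` and `predict(X)` methods could be passed instead; note that not every estimator accepts `sample_weight` in `fit`. A real implementation would presumably build on Verde's own gridder base class rather than a bare class like this.

```python
import numpy as np


class NearestLabel:
    """Stand-in for a scikit-learn style classifier (illustration only)."""

    def fit(self, X, y, sample_weight=None):
        # Weights are accepted for API compatibility but ignored here.
        self.X_ = np.asarray(X, dtype=float)
        self.y_ = np.asarray(y)
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        dist = np.linalg.norm(X[:, None, :] - self.X_[None, :, :], axis=2)
        return self.y_[np.argmin(dist, axis=1)]


class EstimatorGridder:
    """Hypothetical adapter exposing gridder-style fit/predict on coordinates."""

    def __init__(self, estimator):
        self.estimator = estimator

    @staticmethod
    def _features(coordinates):
        # Assumption from the discussion: features = column-stacked coordinates.
        return np.column_stack([np.ravel(c) for c in coordinates])

    def fit(self, coordinates, data, weights=None):
        X = self._features(coordinates)
        y = np.ravel(data)
        if weights is None:
            self.estimator.fit(X, y)
        else:
            self.estimator.fit(X, y, sample_weight=np.ravel(weights))
        return self

    def predict(self, coordinates):
        shape = np.shape(coordinates[0])
        return self.estimator.predict(self._features(coordinates)).reshape(shape)


# Toy labelled wells (made up) and a 2x2 grid of prediction points.
coordinates = (np.array([0.0, 0.0, 10.0, 10.0]), np.array([0.0, 10.0, 0.0, 10.0]))
labels = np.array(["low", "low", "high", "high"])
gridder = EstimatorGridder(NearestLabel()).fit(coordinates, labels)
grid_coordinates = np.meshgrid(np.array([1.0, 9.0]), np.array([1.0, 9.0]))
predicted = gridder.predict(grid_coordinates)
```

Because the wrapper takes coordinates and data in the same form as the existing gridders, the cross-validation tools would only need to see `fit` and `predict`, plus a scoring method suited to categories (for example accuracy instead of R²).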