Hi, as far as I understand, Datascope is compatible with any scikit-learn pipeline. I'm using PyTorch and skorch (a library that wraps PyTorch) to make my classifier scikit-learn compatible.
I'm currently getting the following error when trying to compute the score:
ValueError Traceback (most recent call last)
<ipython-input-49-2e03ddd68d36> in <module>()
----> 1 importances.score(test_data, test_labels)
3 frames
/usr/local/lib/python3.7/dist-packages/datascope-0.0.3-py3.7-linux-x86_64.egg/datascope/importance/importance.py in score(self, X, y, **kwargs)
38 if isinstance(y, DataFrame):
39 y = y.values
---> 40 return self._score(X, y, **kwargs)
/usr/local/lib/python3.7/dist-packages/datascope-0.0.3-py3.7-linux-x86_64.egg/datascope/importance/shapley.py in _score(self, X, y, **kwargs)
285 units = np.delete(units, np.where(units == -1))
286 world = kwargs.get("world", np.zeros_like(units, dtype=int))
--> 287 return self._shapley(self.X, self.y, X, y, self.provenance, units, world)
288
289 def _shapley(
/usr/local/lib/python3.7/dist-packages/datascope-0.0.3-py3.7-linux-x86_64.egg/datascope/importance/shapley.py in _shapley(self, X, y, X_test, y_test, provenance, units, world)
314 )
315 elif self.method == ImportanceMethod.NEIGHBOR:
--> 316 return self._shapley_neighbor(X, y, X_test, y_test, provenance, units, world, self.nn_k, self.nn_distance)
317 else:
318 raise ValueError("Unknown method '%s'." % self.method)
/usr/local/lib/python3.7/dist-packages/datascope-0.0.3-py3.7-linux-x86_64.egg/datascope/importance/shapley.py in _shapley_neighbor(self, X, y, X_test, y_test, provenance, units, world, k, distance)
507 assert isinstance(X_test, spmatrix)
508 X_test = X_test.todense()
--> 509 distances = distance(X, X_test)
510
511 # Compute the utilitiy values between training and test labels.
sklearn/metrics/_dist_metrics.pyx in sklearn.metrics._dist_metrics.DistanceMetric.pairwise()
ValueError: Buffer has wrong number of dimensions (expected 2, got 4)
Here's a snippet of my code:
from sklearn.pipeline import Pipeline
from sklearn.metrics import accuracy_score
net = reset_model(seed = 0) # gives scikit-learn compatible skorch model
pipeline = Pipeline([("model", net)])
pipeline.fit(train_dataset, train_labels)
y_pred = pipeline.predict(test_dataset)
plot_loss(net)
accuracy_dirty = accuracy_score(test_labels, y_pred) # signature is accuracy_score(y_true, y_pred)
print("Pipeline accuracy in the beginning:", accuracy_dirty)
The above works fine, and I'm able to compute the accuracy of my baseline model.
However, when trying to run importances.score(test_data, test_labels) I'm getting the error mentioned above.
from datascope.importance.common import SklearnModelAccuracy
from datascope.importance.shapley import ShapleyImportance
net = reset_model(seed = 0)
pipeline = Pipeline([("model", net)])
utility = SklearnModelAccuracy(pipeline)
importance = ShapleyImportance(method="neighbor", utility=utility)
importances = importance.fit(train_data, train_labels)
importances.score(test_data, test_labels)
I would be happy if someone could point me in the right direction! I'm not sure whether this error is skorch-related or whether images are not supported yet. Thanks :)
I have run into a similar problem before, and it was caused by the shape of my data. Could you try reshaping your train_data and test_data to (N, D), where N is the number of samples (2067 for your train_data) and D is the feature dimension? To get D you will probably need a preprocessing step, e.g. flattening your 2D images (I assume) into 1D vectors.
If that does not work out, please let me know, and I will take a closer look as soon as possible.
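The reshape suggested above might look roughly like the sketch below. It assumes the images are stored as a NumPy array of shape (N, C, H, W); the concrete sizes and the helper names flatten_images and unflatten_images are made up for illustration and are not part of Datascope or skorch.

```python
import numpy as np

# Hypothetical batch of images: N samples, each C x H x W.
# These sizes are assumptions; substitute the real shape of your data.
N, C, H, W = 8, 1, 28, 28
rng = np.random.default_rng(0)
train_images = rng.random((N, C, H, W), dtype=np.float32)


def flatten_images(images):
    """Reshape (N, C, H, W) into (N, D), where D = C * H * W."""
    return images.reshape(images.shape[0], -1)


def unflatten_images(flat, shape=(C, H, W)):
    """Inverse of flatten_images: reshape (N, D) back into (N, C, H, W)."""
    return flat.reshape(flat.shape[0], *shape)


train_flat = flatten_images(train_images)
print(train_flat.shape)  # (8, 784)

# The round trip recovers the original array, so no pixel data is lost.
restored = unflatten_images(train_flat)
print(restored.shape)  # (8, 1, 28, 28)
```

The flattened array is 2D, which is what the distance computation in the traceback expects. If the network itself needs image-shaped input, a function like unflatten_images could presumably be wrapped in scikit-learn's FunctionTransformer and placed as the first pipeline step before the model, although I have not verified how that interacts with Datascope.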