Hello,
I am trying to import and run presidio-analyzer and presidio-anonymizer in an Azure Databricks environment. However, there seem to be dependency mismatches with the Python packages that ship with the Databricks runtime. For example, I am getting the error below, which points to a numpy mismatch. Can anyone offer advice on getting up and running on Databricks, given the mismatch in Python package versions between the Presidio libraries and the Databricks runtime? The basic setup in the docs doesn't seem to work, and I haven't been able to find any additional information on overcoming these issues.
One other thing I wanted to mention is that tooling for diagnosing package conflicts in Python seems limited. For example, the PyPI page for presidio-analyzer doesn't list any dependency requirements other than the supported Python versions.
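Since the PyPI page doesn't show the requirements, one way to see what an installed package declares is the standard-library `importlib.metadata` module. This is only a rough sketch: the helper name `declared_requirements` is made up for illustration, and it assumes the Presidio packages are already installed in the notebook's environment.

```python
from importlib import metadata


def declared_requirements(dist_name):
    """Return the requirement strings a distribution declares.

    Returns None if the distribution is not installed, and an empty
    list if it is installed but declares no requirements.
    """
    try:
        reqs = metadata.requires(dist_name)
    except metadata.PackageNotFoundError:
        return None
    return list(reqs or [])


if __name__ == "__main__":
    # Package names taken from the question above.
    for name in ("presidio-analyzer", "presidio-anonymizer"):
        print(name, declared_requirements(name))
```

Comparing the printed requirement strings against the versions pinned by the runtime should at least show where the two disagree.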
Thanks!
Command:
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine
Error: ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
numpy version in databricks runtime:
numpy==1.20.1
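This kind of error commonly means that a compiled extension was built against a different numpy version than the one actually loaded. To check which versions are present, something like the sketch below may help. The helper `installed_versions` is hypothetical, and the package list is an assumption about which libraries are likely involved (presidio-analyzer depends on spaCy), not a confirmed diagnosis.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed list of packages worth checking; adjust as needed.
    packages = ["numpy", "spacy", "thinc", "presidio-analyzer", "presidio-anonymizer"]
    for name, version in installed_versions(packages).items():
        print(f"{name}=={version}")
```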
EDIT: I am considering upgrading the Databricks runtime version. I am currently using 10.4 LTS, which seems somewhat outdated. However, I would still love to hear any feedback or suggestions from others on other options.