BUG: geojson is not read correctly with geopandas>=1.0.0 #445
Comments
Thanks for the report! I can confirm that with the new default IO engine, pyogrio, this indeed returns a string. A workaround is to use the old engine that was the default before 1.0:
uhslc_gpd = gpd.read_file("https://uhslc.soest.hawaii.edu/data/meta.geojson", engine="fiona")
@brendan-ward will know more about whether this is expected or something we need to process differently in pyogrio.
@martinfleis thanks a lot for this useful suggestion, this conveniently solves the issue I had at least on my side. However, the engine string seems to be case sensitive, so it should be |
I'll keep this open and move it to pyogrio, as we may want to look into that there.
It looks like there is a field type
On the Pyogrio side, we need to detect this subtype and carry that information through when deserializing and serializing fields. Serializing is likely to be harder because the numpy array dtype does not give us this information, so there may be a real performance penalty there (or we leave this as the responsibility of the user).
For now, you could also manually parse the applicable fields:
import json
import geopandas as gpd

uhslc_gpd = gpd.read_file("https://uhslc.soest.hawaii.edu/data/meta.geojson")
# Decode the JSON string in each row back into a Python object
uhslc_gpd["rq_span"] = uhslc_gpd.rq_span.apply(json.loads)
Thanks for the suggestion. That would indeed also work, but "rq_span" is not the only field that requires conversion, so for my application I prefer the fiona approach for now.
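When several fields need conversion, the manual json.loads approach could be generalized to every string column. The following is only a sketch, not something provided by geopandas or pyogrio: the helper names parse_json_value and parse_json_columns are hypothetical, and it assumes the affected columns have object dtype and hold strings that start with "{" or "[".

```python
import json


def parse_json_value(value):
    """Decode `value` if it is a JSON object/array string, else return it unchanged."""
    if not isinstance(value, str):
        return value
    stripped = value.strip()
    # Only try strings that look like JSON objects or arrays, so ordinary
    # text fields (names, codes) are left alone.
    if not stripped.startswith(("{", "[")):
        return value
    try:
        return json.loads(stripped)
    except json.JSONDecodeError:
        return value


def parse_json_columns(df, exclude=("geometry",)):
    """Return a copy of a (Geo)DataFrame with JSON strings in object columns decoded.

    Hypothetical helper: assumes string columns have object dtype.
    """
    out = df.copy()
    for col in out.columns:
        if col in exclude or out[col].dtype != object:
            continue
        out[col] = out[col].apply(parse_json_value)
    return out
```

With the frame from the comment above, usage would be uhslc_gpd = parse_json_columns(uhslc_gpd). Values that fail to decode are kept as strings rather than raising, which trades strictness for robustness on mixed columns.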
FYI, |
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of geopandas.
(optional) I have confirmed this bug exists on the main branch of geopandas.
Note: Please read this guide detailing how to provide the necessary information for us to reproduce your bug.
Code Sample, a copy-pastable example
Problem description
The above code raises
"TypeError: string indices must be integers, not 'str'"
in geopandas>=1.0.0. With older versions the code runs successfully. The issue is that the column now contains strings holding serialized dicts instead of plain dicts. It seems that something goes wrong when parsing the GeoJSON.
Expected Output
A subset of the original column.
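The failure mode described in the problem description can be illustrated without the dataset. This is a minimal sketch with hypothetical keys and values, not the actual content of the file: indexing a dict by key works, while indexing its JSON string form by key raises a TypeError.

```python
import json

# Hypothetical stand-in for one cell of the affected column
as_dict = {"start": 1905, "end": 2018}
# The same cell as a JSON string, which is what the report describes for geopandas>=1.0.0
as_string = json.dumps(as_dict)

print(as_dict["start"])  # key lookup on a dict works

caught = None
try:
    as_string["start"]  # key lookup on a str is not allowed
except TypeError as err:
    caught = err
    print(f"TypeError: {err}")

# Decoding the string restores the original dict
restored = json.loads(as_string)
print(restored["start"])
```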
Output of
geopandas.show_versions()
SYSTEM INFO
python : 3.11.6 | packaged by conda-forge | (main, Oct 3 2023, 10:29:11) [MSC v.1935 64 bit (AMD64)]
executable : C:\Users\veenstra\Anaconda3\envs\dfm_tools_env\python.exe
machine : Windows-10-10.0.19045-SP0
GEOS, GDAL, PROJ INFO
GEOS : 3.11.2
GEOS lib : None
GDAL : 3.8.5
GDAL data dir: C:\Users\veenstra\Anaconda3\envs\dfm_tools_env\Lib\site-packages\pyogrio\gdal_data
PROJ : 9.3.0
PROJ data dir: C:\Users\veenstra\Anaconda3\envs\dfm_tools_env\Lib\site-packages\pyproj\proj_dir\share\proj
PYTHON DEPENDENCIES
geopandas : 1.0.0
numpy : 1.26.4
pandas : 2.2.2
pyproj : 3.6.1
shapely : 2.0.2
pyogrio : 0.9.0
geoalchemy2: None
geopy : 2.4.1
matplotlib : 3.8.4
mapclassify: None
fiona : 1.9.5
psycopg : None
psycopg2 : None
pyarrow : None