xarray can't identify time units in HSDS dataset #45

Open
rsignell-usgs opened this issue Dec 22, 2017 · 8 comments

@rsignell-usgs
Contributor

rsignell-usgs commented Dec 22, 2017

In this notebook
https://gist.github.com/rsignell-usgs/07143a5ab54afb8ad6eb1af255d025c9
we use xarray to open a local netcdf4 file and then the same dataset that was 'hsload'ed to hsds.

For the local netCDF4 file, xarray automatically recognizes the CF-compliant time units and converts the time coordinate to datetime, so the plot in cell [6] is correctly labeled.

But the time units are not recognized for the HSDS dataset plot in cell [5].

Any idea what the problem is?

@rsignell-usgs rsignell-usgs changed the title Time units not getting parsed by xarray hsds dataset time units not getting parsed by xarray Dec 22, 2017
@rsignell-usgs rsignell-usgs changed the title hsds dataset time units not getting parsed by xarray xarray can't identify time units in HSDS dataset Dec 22, 2017
@ghost

ghost commented Dec 22, 2017

Would it be possible to print out what xarray thinks of that variable from the two sources? Have two cells with ds2['TMP_2maboveground'] and ds['TMP_2maboveground'].

@rsignell-usgs
Contributor Author

@ajelenak-thg, yes, it looks like HSDS is dropping the variable attributes:
https://gist.github.com/rsignell-usgs/dbe88df42e1181827363a8348016f28b

BTW, you should be able to run this notebook (at least the HSDS and DAP access cells) -- you just need a username and password for this XSEDE endpoint from @jreadey in your ~/.hscfg, right?

If you sign up for XSEDE I can add you to my project, in case that becomes useful later on.

@ghost

ghost commented Dec 23, 2017

Seems like some attributes of the time coordinate got lost "in translation" to HSDS. According to the DAS response from the THREDDS server:

    time {
        String units "seconds since 1970-01-01 00:00:00.0 0:00";
        String long_name "verification time generated by wgrib2 function verftime()";
        Float64 reference_time 1.4832288E9;
        Int32 reference_time_type 0;
        String reference_date "2017.01.01 00:00:00 UTC";
        String reference_time_description "kind of product unclear, reference date is variable, min found reference date is given";
        String time_step_setting "auto";
        Float64 time_step 3600.0;
        Int32 _ChunkSizes 512;
    }

_ChunkSizes does not really exist as an attribute, I think, because netCDF tools typically display HDF5 dataset creation properties as system attributes (prefixed with _).

The HSDS response about the attributes of the time coordinate shows only these (HDF5 dimension scale-related attributes not included): _Netcdf4Dimid, reference_time, reference_time_type, time_step. No units attribute so no conversion to datetime.
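To illustrate why the missing units attribute matters, here is a minimal sketch of CF-style time decoding using only the Python standard library. The helper `decode_cf_time` and its table of unit names are hypothetical and are not xarray's implementation. The sketch also assumes UTC and ignores any trailing offset in the units string.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical lookup table: seconds per supported CF time unit.
_UNIT_SECONDS = {"seconds": 1, "minutes": 60, "hours": 3600, "days": 86400}

# Matches "<unit> since YYYY-MM-DD" with an optional "HH:MM:SS" part.
_UNITS_RE = re.compile(
    r"(\w+) since (\d{4})-(\d{2})-(\d{2})(?:[ T](\d{2}):(\d{2}):(\d{2}))?"
)


def decode_cf_time(values, attrs):
    """Return datetimes if attrs carries a usable 'units' string, else raw values."""
    units = attrs.get("units")
    if units is None:
        # No units attribute: there is no epoch to decode against.
        return list(values)
    match = _UNITS_RE.match(units)
    if match is None or match.group(1) not in _UNIT_SECONDS:
        return list(values)
    unit, year, month, day, hh, mm, ss = match.groups()
    # Assumption: the reference date is in UTC; any offset suffix is ignored.
    epoch = datetime(int(year), int(month), int(day),
                     int(hh or 0), int(mm or 0), int(ss or 0),
                     tzinfo=timezone.utc)
    scale = _UNIT_SECONDS[unit]
    return [epoch + timedelta(seconds=v * scale) for v in values]


# With the units string from the DAS response, the values become datetimes.
with_units = decode_cf_time(
    [1.4832288e9], {"units": "seconds since 1970-01-01 00:00:00.0 0:00"}
)

# With only the attributes HSDS returned, the values stay as plain numbers.
without_units = decode_cf_time(
    [1.4832288e9], {"reference_time": 1.4832288e9, "time_step": 3600.0}
)
```

The second call mirrors the HSDS case: the numbers are intact, but nothing tells the reader how to interpret them as dates.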

@rsignell-usgs
Contributor Author

@ajelenak-thg and @jreadey, yes, HSDS is losing nearly all variable attributes!

The variable in the original NC file has attributes:

       float TMP_2maboveground(time, latitude, longitude) ;
                TMP_2maboveground:_FillValue = 9.999e+20f ;
                TMP_2maboveground:least_significant_digit = 2 ;
                TMP_2maboveground:short_name = "TMP_2maboveground" ;
                TMP_2maboveground:long_name = "Temperature" ;
                TMP_2maboveground:level = "2 m above ground" ;
                TMP_2maboveground:units = "K" ;

while in HSDS, the only remaining attribute is:

Attributes:
    least_significant_digit:  [2]

Does this mean perhaps that HSDS is only handling attributes with integer values or something?
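One way to check a hypothesis like this is to list the missing attributes together with their value types. The sketch below uses plain dicts copied from the two listings above; `missing_attrs` is a hypothetical helper, and with xarray the same dicts could be taken from each variable's `.attrs`.

```python
def missing_attrs(source_attrs, remote_attrs):
    """Map each attribute name absent from remote_attrs to its value's type name."""
    return {
        name: type(value).__name__
        for name, value in source_attrs.items()
        if name not in remote_attrs
    }


# Attributes of TMP_2maboveground as shown in the original NC file listing.
local = {
    "_FillValue": 9.999e20,
    "least_significant_digit": 2,
    "short_name": "TMP_2maboveground",
    "long_name": "Temperature",
    "level": "2 m above ground",
    "units": "K",
}

# The only attribute shown for the same variable when read from HSDS.
hsds = {"least_significant_digit": [2]}

lost = missing_attrs(local, hsds)
for name, type_name in lost.items():
    print(f"{name}: {type_name}")
```

Grouping by type makes it easy to see at a glance whether the missing attributes share a data type.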

@jreadey
Member

jreadey commented Dec 27, 2017

@rsignell-usgs - do you see any errors during the import (with hsload)?

I've seen this issue: h5py/h5py#719 come up when loading NetCDF files.

@rsignell-usgs
Contributor Author

rsignell-usgs commented Dec 27, 2017

Oh yes, I got tons of errors on hsload.

Looks like the real problem is here: h5py/h5py#719 (comment) :

The issue here is that recent versions of netCDF-C save the NC_CHAR dtype as fixed length UTF8 strings, which h5py cannot read.

So maybe hsload could translate NC_CHAR dtypes into something that h5pyd can read?

@ghost

ghost commented Dec 27, 2017

I don't know if it is possible to get the bytes for such attributes somehow and avoid h5py until that issue is resolved.
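If the raw bytes of such an attribute could be obtained, turning them into text would be the easy part. The sketch below is a hypothetical helper, `decode_fixed_utf8`, that assumes the value arrives as a NUL-terminated or NUL-padded fixed-length buffer. It does not use any h5py API, and how to get at the bytes remains the open question.

```python
def decode_fixed_utf8(raw: bytes) -> str:
    """Decode a fixed-length UTF-8 buffer, dropping NUL termination or padding."""
    # Assumption: everything from the first NUL byte onward is padding.
    end = raw.find(b"\x00")
    if end != -1:
        raw = raw[:end]
    return raw.decode("utf-8")


# A 4-byte fixed-length buffer holding the units string "K".
units = decode_fixed_utf8(b"K\x00\x00\x00")

# Multi-byte UTF-8 characters survive because only NUL bytes are stripped.
name = decode_fixed_utf8("Température".encode("utf-8") + b"\x00\x00")
```

Space-padded strings are not handled here; they would need a trailing `rstrip` instead of the NUL cut.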

@ghost

ghost commented Jan 24, 2018

I have just created a PR with a fix for this problem: h5py/h5py#988. It works for the netCDF file used here. Let's see what happens.
