Problem of building own dataset #90

Open
chufall opened this issue Aug 8, 2024 · 2 comments

Comments

@chufall

chufall commented Aug 8, 2024

Hi, I'm trying to build an xarray dataset from my own data, using the following code:

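```python
import numpy as np
import pandas as pd
import xarray

# `data`, `start_year` and `end_year` are defined earlier (not shown here).
numpy_array = data

lons = np.linspace(-180, 180, 73)
lats = np.linspace(87.5, -87.5, 71)

# dims = ('batch', 'time', 'channel', 'lat', 'lon')
# Hourly offsets from the first step, one per entry along the time axis.
times = pd.to_timedelta(np.arange(data.shape[1]), unit='h')
# 'h' is the hourly alias in current pandas (the upper-case 'H' is deprecated).
datetimes = pd.date_range(start=f'{start_year}-01-01', end=f'{end_year}-12-31 23:00:00', freq='h')

coords = {
    'batch': range(data.shape[0]),
    'time': ('time', times),
    'channel': [0],
    'lon': ('lon', lons),
    'lat': ('lat', lats),
    # Wrapping in a list adds the leading 'batch' axis; this assumes batch size 1.
    'datetime': (('batch', 'time'), [datetimes]),
}

data_vars = {'Temp': (('batch', 'time', 'channel', 'lat', 'lon'), numpy_array[:, :, 0:1, :, :])}
xr_dataset = xarray.Dataset(data_vars=data_vars, coords=coords)
```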

But in the resulting dataset, the "datetime" coordinate (an xarray.DataArray) has "batch" and "time" both as dimensions and as coordinates of its own:

<xarray.DataArray 'datetime' (batch: 1, time: 17544)> Size: 140kB
array([['2020-01-01T00:00:00.000000000', '2020-01-01T01:00:00.000000000',
        '2020-01-01T02:00:00.000000000', ...,
        '2021-12-31T21:00:00.000000000', '2021-12-31T22:00:00.000000000',
        '2021-12-31T23:00:00.000000000']], dtype='datetime64[ns]')
Coordinates:
  * batch     (batch) int64 8B 0
  * time      (time) timedelta64[ns] 140kB 00:00:00 ... 730 days 23:00:00
    datetime  (batch, time) datetime64[ns] 140kB 2020-01-01 ... 2021-12-31T23...

However, when I load the example dataset "source-era5_date-2022-01-01_res-0.25_levels-37_steps-01.nc", its "datetime" coordinate has the "batch" and "time" dimensions but only the "time" coordinate:

<xarray.DataArray 'datetime' (batch: 1, time: 3)> Size: 24B
array([['2022-01-01T00:00:00.000000000', '2022-01-01T06:00:00.000000000',
        '2022-01-01T12:00:00.000000000']], dtype='datetime64[ns]')
Coordinates:
  * time      (time) timedelta64[ns] 24B 00:00:00 06:00:00 12:00:00
    datetime  (batch, time) datetime64[ns] 24B 2022-01-01 ... 2022-01-01T12:0...
Dimensions without coordinates: batch

So my question is: how do I remove the "batch" coordinate from "datetime" in my dataset, while keeping "batch" as a dimension?

Thanks a lot!

Sincerely, Qc
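
One possible way to end up with a "batch" dimension that has no coordinate, sketched on a small synthetic array: either drop the coordinate variable afterwards with `Dataset.drop_vars`, or leave `batch` out of `coords` when building the dataset. The shapes, the start date, and the names `ds`, `ds_dropped`, `ds_built` below are illustrative assumptions, not the real data from this issue.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the real data: (batch, time, channel, lat, lon).
data = np.zeros((1, 4, 1, 3, 5), dtype=np.float32)
times = pd.to_timedelta(np.arange(data.shape[1]), unit="h")
datetimes = (pd.Timestamp("2020-01-01") + times).values  # datetime64 array

dims = ("batch", "time", "channel", "lat", "lon")
coords = {
    "batch": np.arange(data.shape[0]),
    "time": ("time", times),
    "channel": [0],
    "lat": ("lat", np.linspace(87.5, -87.5, 3)),
    "lon": ("lon", np.linspace(-180, 180, 5)),
    "datetime": (("batch", "time"), datetimes[np.newaxis, :]),
}
ds = xr.Dataset({"Temp": (dims, data)}, coords=coords)

# Option 1: drop the 'batch' coordinate variable; the dimension itself remains.
ds_dropped = ds.drop_vars("batch")

# Option 2: never create the coordinate in the first place.
coords_no_batch = {k: v for k, v in coords.items() if k != "batch"}
ds_built = xr.Dataset({"Temp": (dims, data)}, coords=coords_no_batch)

# 'batch' should now be listed under "Dimensions without coordinates".
print(ds_dropped["datetime"])
```

In both cases "datetime" keeps its `(batch, time)` dimensions, so the layout should resemble the example file, assuming nothing else in the pipeline relies on the integer batch labels.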

@agbruno-git

Try using .squeeze() on your dataset; it should remove the dependence.

@chufall
Author

chufall commented Aug 9, 2024

> Try using .squeeze() on your dataset; it should remove the dependence.

Thank you for your reply!

I tried running:
xr_dataset.coords["datetime"].squeeze()
The result is unchanged: the "batch" coordinate is still present in "datetime".

If I run `xr_dataset.squeeze()` instead, it removes every dimension of length 1, which is not what I want.

Qc
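
For reference, a minimal sketch of how `squeeze` behaves on a toy dataset, which may explain the observations above. The dataset and the variable names here are made up for illustration. `squeeze` returns a new object rather than modifying the original, it accepts a dimension name to limit what gets squeezed, and without `drop=True` the squeezed dimension survives as a scalar coordinate.

```python
import numpy as np
import xarray as xr

# Toy dataset with two length-1 dimensions ('batch' and 'channel').
ds = xr.Dataset(
    {"Temp": (("batch", "time", "channel"), np.zeros((1, 3, 1)))},
    coords={"batch": [0], "time": [0, 1, 2], "channel": [0]},
)

# squeeze() returns a new object; ds itself is left untouched.
squeezed_all = ds.squeeze()           # removes every length-1 dimension
squeezed_batch = ds.squeeze("batch")  # removes only 'batch', keeps 'channel'

# Without drop=True, 'batch' is kept as a scalar (0-dimensional) coordinate.
squeezed_drop = ds.squeeze("batch", drop=True)

print(squeezed_batch.coords)
print(squeezed_drop.coords)
```

Every variant here removes "batch" as a dimension, so if the goal is to keep the dimension and lose only its coordinate, dropping the coordinate variable (for example with `drop_vars`) is probably closer to what is wanted than squeezing.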
