Modis l2 available datasets #913

Open · wants to merge 15 commits into base: main
8 changes: 6 additions & 2 deletions satpy/etc/readers/modis_l2.yaml
@@ -10,6 +10,10 @@ file_types:
     file_patterns:
       - 'M{platform_indicator:1s}D35_L2.A{acquisition_time:%Y%j.%H%M}.{collection:03d}.{production_time:%Y%j%H%M%S}.hdf'
     file_reader: !!python/name:satpy.readers.modis_l2.ModisL2HDFFileHandler
+  modis_l2_product:
+    file_patterns:
+      - 'M{platform_indicator:1s}D{product:2s}_L2.A{acquisition_time:%Y%j.%H%M}.{collection:03d}.{production_time:%Y%j%H%M%S}.hdf'
Member

One problem with this is that it overlaps with the mod35_hdf pattern, so a file could be matched to either file type depending on which one is checked first. Either we have a file pattern for each possible level 2 filename or one single generic one. There are two problems with the single generic one:

  • We'd have to handle cloud_mask specially in available_datasets so that it is only shown as available for the mod35 file. Not a huge deal, but it could be confusing to some.
  • The bigger issue is that if someone provides multiple level 2 files, there is no way for Satpy to know whether the files are additional granules of the same file type (all MOD35 files) or different level 2 product files. The reader ends up asking every file handler for a product even though only some of them have the dataset. I ran into this issue with the ABI L2 data files even though those aren't granules.

In the future we could make the base reader check the file handler after it is created to see if it changed its file type. For example, the ModisL2HDFFileHandler could say "I see that 'product' in the filename is XX, so my file type is actually 'modis_l2_product_XX'". The base reader then knows how to organize the files. @mraspaud did you have to do anything like this for the VIIRS SDR reader when you updated it for the various file naming schemes?
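The idea described here could look roughly like the following sketch. The class and method names (`ModisL2ProductHandlerSketch`, `resolve_file_type`) are hypothetical and not part of Satpy's actual API; it only illustrates a handler deriving a more specific file type from the `product` field parsed out of the filename:

```python
# Hypothetical sketch (not Satpy's real API): after creating a file handler,
# the base reader could ask it for a more specific file type derived from
# the 'product' field parsed out of the filename.
class ModisL2ProductHandlerSketch:
    def __init__(self, filename_info, filetype_info):
        self.filename_info = filename_info
        self.filetype_info = dict(filetype_info)

    def resolve_file_type(self):
        """Return a product-specific file type, e.g. 'modis_l2_product_35'."""
        product = self.filename_info.get("product")
        if product is not None:
            return "modis_l2_product_" + product
        # fall back to the generic file type from the YAML configuration
        return self.filetype_info["file_type"]


handler = ModisL2ProductHandlerSketch(
    {"platform_indicator": "O", "product": "35"},
    {"file_type": "modis_l2_product"},
)
print(handler.resolve_file_type())  # modis_l2_product_35
```

The base reader could then group file handlers by the resolved file type instead of the generic one, so that granules of the same product stay together.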

Collaborator Author

Yes, I guessed that this overlap might pose a problem. After experimenting with the order of the patterns, I thought that the reader took the first match, which would work in the current situation but might become difficult in the future if more datasets are added to the YAML file. I think it would be nice if the reader could also identify whether any dataset in the file needs special treatment like bit decoding (then there would be no need to specify additional datasets in the YAML file), but this might not be easy, if doable at all.

The problem with different level 2 product files being given to the reader at the same time also occurred to me. I thought that most users might be smart enough to keep product files in different directories, but I guess that is wishful thinking and sooner or later somebody will discover this "bug". I don't understand enough of the inner workings of how the readers are initialized (though I am interested in getting a better understanding of what is done when, and the idea behind it) to judge what the best way would be.
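For reference, the bit decoding mentioned here boils down to masking and shifting a packed integer. A minimal sketch with the same semantics as the `bits_strip` helper already in `modis_l2.py` (operating on a plain int rather than an array; the example values are made up):

```python
def bits_strip(bit_start, bit_count, value):
    """Extract bit_count bits starting at bit_start from a packed integer."""
    bit_mask = (1 << (bit_start + bit_count)) - 1
    return (value & bit_mask) >> bit_start


# Bits 1-2 of the first MODIS cloud mask byte hold the cloudiness
# confidence category (0 = cloudy ... 3 = confident clear).
packed = 0b00000110
print(bits_strip(1, 2, packed))  # 3
```

Automatically detecting which variables need this treatment would require per-product metadata, which is why it currently lives in the YAML dataset definitions.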

Member

Understood. I'm 99% sure that there is no guarantee about the order in which file patterns/file types are checked, so it might come down to luck. Let's continue the file type discussion on Slack to nail down what I'm thinking. I could make a separate PR for the functionality I have in mind and then we can incorporate it into this PR after it is merged.

+    file_reader: !!python/name:satpy.readers.modis_l2.ModisL2HDFFileHandler
   hdf_eos_geo:
     file_patterns:
       - 'M{platform_indicator:1s}D03_A{start_time:%y%j_%H%M%S}_{processing_time:%Y%j%H%M%S}.hdf'
@@ -49,7 +53,7 @@ datasets:
       5000:
         file_type: mod35_hdf
       1000:
-        file_type: [hdf_eos_geo, mod35_hdf]
+        file_type: [hdf_eos_geo, mod35_hdf, modis_l2_product]
       500:
         file_type: hdf_eos_geo
       250:
@@ -64,7 +68,7 @@ datasets:
         # For EUM reduced (thinned) files
         file_type: mod35_hdf
       1000:
-        file_type: [hdf_eos_geo, mod35_hdf]
+        file_type: [hdf_eos_geo, mod35_hdf, modis_l2_product]
       500:
         file_type: hdf_eos_geo
       250:
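The overlap discussed in the review thread can be demonstrated directly: a MOD35 filename satisfies both the specific pattern and the new generic one. The regexes below are hand-written approximations of the trollsift patterns in the YAML, and the filename is a made-up but structurally valid example:

```python
# Both the specific MOD35 pattern and the generic level 2 pattern match
# the same MOD35 file name, so the chosen file type depends on check order.
import re

# Approximations of the trollsift patterns (%Y%j -> 7 digits, %H%M -> 4,
# collection -> 3, production time -> 13).
mod35 = re.compile(r"M[OY]D35_L2\.A\d{7}\.\d{4}\.\d{3}\.\d{13}\.hdf")
generic = re.compile(r"M[OY]D\w{2}_L2\.A\d{7}\.\d{4}\.\d{3}\.\d{13}\.hdf")

fname = "MOD35_L2.A2019051.1805.061.2019051092607.hdf"
print(bool(mod35.match(fname)), bool(generic.match(fname)))  # True True
```

A different product, say a hypothetical MOD06 granule, matches only the generic pattern, which is exactly the ambiguity the reviewer points out.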
3 changes: 2 additions & 1 deletion satpy/readers/hdfeos_base.py
@@ -164,8 +164,9 @@ def load_dataset(self, dataset_name):
         good_mask = data != fill_value
 
         scale_factor = data.attrs.get('scale_factor')
+        add_offset = data.attrs.get('add_offset', 0)
         if scale_factor is not None:
-            data = data * scale_factor
+            data = data * scale_factor + add_offset
 
         data = data.where(good_mask, new_fill)
         return data
28 changes: 28 additions & 0 deletions satpy/readers/modis_l2.py
@@ -168,6 +168,34 @@ def get_dataset(self, dataset_id, dataset_info):

         return dataset
 
+    def available_datasets(self, configured_datasets=None):
+        """Add datasets from arbitrary MODIS level 2 product files.
+
+        Adds dataset information that is not explicitly listed in the
+        reader YAML file to the available datasets.
+
+        Notes:
+            Currently only adds 2D datasets and does not decode
+            bit-encoded information.
+        """
+        # pass along existing datasets
+        for is_avail, ds_info in (configured_datasets or []):
+            yield is_avail, ds_info
+
+        # map known (rows, columns) shapes to a resolution in meters
+        res_dict = {(8120, 5416): 250, (4060, 2708): 500, (2030, 1354): 1000,
+                    (406, 270): 5000, (203, 135): 10000}
+
+        # offer every 2D variable in the file whose shape maps to a known resolution
+        for var_name, val in self.sd.datasets().items():
+            if len(val[0]) == 2:
+                resolution = res_dict.get(val[1])
+                if resolution is not None:
+                    ds_info = {
+                        'file_type': self.filetype_info['file_type'],
+                        'resolution': resolution,
+                        'name': var_name,
+                        'file_key': var_name,
+                        'coordinates': ["longitude", "latitude"],
+                    }
+                    yield True, ds_info


 def bits_strip(bit_start, bit_count, value):
     """Extract specified bit from bit representation of integer value.
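The discovery loop in the new `available_datasets` can be exercised standalone against a fake pyhdf-style `datasets()` mapping (name -> (dim names, shape, type code, index)). The variable names, type codes, and index values below are invented for illustration:

```python
# Standalone run of the shape-based discovery used by available_datasets:
# only 2D variables whose shape maps to a known resolution are offered.
fake_datasets = {
    "Cloud_Top_Height": (("Cell_Along_Swath", "Cell_Across_Swath"),
                         (2030, 1354), 22, 0),
    "Band_Names": (("Bands",), (16,), 4, 1),  # 1D: skipped
}
res_dict = {(8120, 5416): 250, (4060, 2708): 500, (2030, 1354): 1000,
            (406, 270): 5000, (203, 135): 10000}

available = []
for var_name, val in fake_datasets.items():
    if len(val[0]) == 2:           # 2D datasets only
        resolution = res_dict.get(val[1])
        if resolution is not None:  # shape must map to a known resolution
            available.append((var_name, resolution))

print(available)  # [('Cloud_Top_Height', 1000)]
```

This also makes the limitation in the docstring concrete: a 2D variable with an unrecognized shape, or any bit-packed field, is silently skipped.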