Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ROMS datasets no longer supported #3626

Open
StasJ opened this issue Jun 21, 2024 · 11 comments
Open

ROMS datasets no longer supported #3626

StasJ opened this issue Jun 21, 2024 · 11 comments
Assignees

Comments

@StasJ
Copy link
Collaborator

StasJ commented Jun 21, 2024

Attempting to import a ROMS dataset (like glade/data/Source/ROMS/Kauffman/b.n20.B1850CN.f09_g16.09.roms.ha.0001-03.nc) will result in an empty visualizer and no data variables.

@StasJ StasJ added the Bug label Jun 21, 2024
@NihanthCW NihanthCW added the High label Jun 25, 2024
@shaomeng shaomeng changed the title CAM datasets no longer supported ROMS datasets no longer supported Jul 2, 2024
@shaomeng
Copy link
Contributor

shaomeng commented Jul 2, 2024

  • Reason no data variable:

    • This particular ROMS file isn't really CF compliant. As a work around, VAPOR implements some logic so the variables can still be read. Unfortunately, this logic is broken, and the variables cannot be read anymore.
  • Nature of the work around:

    • code:
      //
      // If we still don't have lat and lon coordinate (or auxiliary)
      // variables for 'var' then we look for coordinate variables whose
      // dim names match the dim names of 'var'. Don't think this is
      // part of the CF 1.6 spec, but it seems necessary for ROMS data sets
      //
      //
      // If "coordinate variables" are specified for each dimension we're don
      //
      if (tmpcvars.size() != dimnames.size()) {
      if (!hasLatCoord) {
      vector<string> latcvs = NetCDFCFCollection::GetLatCoordVars();
      for (int i = 0; i < latcvs.size(); i++) {
      NetCDFSimple::Variable varinfo;
      NetCDFCollection::GetVariableInfo(latcvs[i], varinfo);
      vector<string> dns = varinfo.GetDimNames();
      if (dimnames.size() >= dns.size()) {
      vector<string> tmp(dimnames.end() - dns.size(), dimnames.end());
      if (tmp == dns) {
      tmpcvars.push_back(latcvs[i]);
      break;
      }
      }
      }
      }
      if (!hasLonCoord) {
      vector<string> loncvs = NetCDFCFCollection::GetLonCoordVars();
      for (int i = 0; i < loncvs.size(); i++) {
      NetCDFSimple::Variable varinfo;
      NetCDFCollection::GetVariableInfo(loncvs[i], varinfo);
      vector<string> dns = varinfo.GetDimNames();
      if (dimnames.size() >= dns.size()) {
      vector<string> tmp(dimnames.end() - dns.size(), dimnames.end());
      if (tmp == dns) {
      tmpcvars.push_back(loncvs[i]);
      break;
      }
      }
      }
      }
      }
      if (tmpcvars.size() != dimnames.size()) {
      SetErrMsg("Non-conforming CF variable : %s", var.c_str());
      return (-1);
      }
    • It appears that the work around is based on some logic to match certain variable names, but it's not clear how this logic really works.
    • The code author clearly states that this file isn't CF compliant, and the code is just a work around.
  • My question + thinking

    • Is it common for modern ROMS files to be CF compliant or not? Note that this file in question was produced in 2012, and all other files in the ROMS folder are from 2012 and 2013.
    • I normally avoid using guess work to just make one thing work, especially the thing itself is not standard compliant.

@shaomeng
Copy link
Contributor

shaomeng commented Jul 5, 2024

Hi @StasJ , you mentioned that release 3.9.2 was having this dataset imported correctly and variables identified correctly. Could you verify if it's really the case? In my test this file appeared broken since at least 2022.

@StasJ
Copy link
Collaborator Author

StasJ commented Jul 5, 2024

I tested /glade/campaign/cisl/vast/vapor/data/Source/ROMS/jsmall using the latest 3.9.2 release on Casper and it works.

@shaomeng
Copy link
Contributor

shaomeng commented Jul 6, 2024 via email

@shaomeng
Copy link
Contributor

shaomeng commented Jul 7, 2024

Hey @StasJ , I tested it again, and it seems if you include the grid file in the same Kauffman folder (NEP4_grid.nc) then the dataset loads correctly, using the current main branch. Do you mind giving it a try?

@StasJ
Copy link
Collaborator Author

StasJ commented Jul 8, 2024

Hey @shaomeng, you are correct, I apologize for the wild goose chase. I knew about the auxiliary grid file but I must've forgotten to import it when I reported the bug but remembered when I retested the issue. Maybe it would be worthwhile to have an error message for this scenario although I know that is easier said than done due to the heuristic nature of importing these datasets...

@shaomeng
Copy link
Contributor

shaomeng commented Jul 9, 2024

Hey @shaomeng, you are correct, I apologize for the wild goose chase. I knew about the auxiliary grid file but I must've forgotten to import it when I reported the bug but remembered when I retested the issue. Maybe it would be worthwhile to have an error message for this scenario although I know that is easier said than done due to the heuristic nature of importing these datasets...

Sounds good, I agree that a more meaningful message can be helpful here. I'll move forward to have it raised.

@shaomeng
Copy link
Contributor

Hi @StasJ , I think I'll need some input from you to figure out the best way to raise a message. In my attempt in #3634 , I declare that the dataset import is unsuccessful if there are no 2D nor 3D variables. This solution doesn't work well, because for 24May datasets, 3D variables indeed CAN be found later on, but not at the time of initialization. So, apparently, the test of whether there are 2D or 3D variables needs to be performed later in the pipeline. Do you have suggestions on where to do that test? Thanks!

@StasJ
Copy link
Collaborator Author

StasJ commented Jul 19, 2024

@shaomeng The issue you are running into is due to the fact that you are using GetDataVariableNames rather than GetVariableNames. It appears the functionality of GetDataVariableNames is not fully complete in NetCDFCFCollection. You could use GetVariableNames here but I would instead add this check at the end of initialization in DCCF since that includes the full logic for distinguishing data variables.

@shaomeng
Copy link
Contributor

Hi @StasJ , thanks for the suggestion; I wasn't aware of the functionality of DCCF but it makes total sense after you point it out. I'll see if the updated code passes the smoke test. Thanks!

@NihanthCW
Copy link
Collaborator

Does ROMS always save a separate gridfile and data files? If yes, we could have a separate ROMS loader with the option to select a grid file. Provide an explicit option.

@NihanthCW NihanthCW removed the Bug label Sep 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants