Add support for generic structured data files (json, yaml, toml) #830

Open
mferrera opened this issue Oct 1, 2024 · 1 comment
Labels
Data definitions Issues related to data definitions

Comments

Collaborator

mferrera commented Oct 1, 2024

Among the many files and formats consumed and produced by different FMU components are yaml files. Within FMU, yaml is the go-to medium for configuration. fmu-dataio is currently unable to upload these files as data.

One reason that yaml may have become the go-to configuration format for FMU is its readability. These configuration files double as reference material when users are QCing inputs and results. Configuration files may be produced with values determined at runtime that differ between ensembles or realizations, and are then used as input to other steps within the experiment. Hence they are valid and important data.

Data type

Currently we do not have a clear class to categorize yaml files.

class FMUClass(str, Enum):
    """The class of a data object by FMU convention or standards."""

    case = "case"
    realization = "realization"
    iteration = "iteration"
    surface = "surface"
    table = "table"
    cpgrid = "cpgrid"
    cpgrid_property = "cpgrid_property"
    polygons = "polygons"
    cube = "cube"
    well = "well"
    points = "points"
    dictionary = "dictionary"

Dictionary at first glance seems a viable candidate, but valid yaml is not always a mapping. A top-level sequence, for example, loads as a Python list and will fail against it:

>>> import yaml
>>> yaml.safe_load("""
... - Hesperiidae
... - Papilionidae
... - Apatelodidae
... - Epiplemidae
... """)
['Hesperiidae', 'Papilionidae', 'Apatelodidae', 'Epiplemidae']

The dictionary data class is made specifically to handle input that comes from a Python dict. Hence supporting yaml files will require another data class.
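
The same mismatch applies to the other formats named in the title. Below is a minimal sketch using the standard library json module (chosen only so it runs without extra dependencies). The helper name top_level_kind is hypothetical and not part of fmu-dataio.

```python
import json


def top_level_kind(text: str) -> str:
    """Return the Python type name of the top-level value in a JSON document.

    Hypothetical helper for illustration only; not part of fmu-dataio.
    """
    return type(json.loads(text)).__name__


# Three valid JSON documents with different top-level shapes.
documents = {
    "mapping": '{"family": "Hesperiidae"}',
    "sequence": '["Hesperiidae", "Papilionidae"]',
    "scalar": '"Hesperiidae"',
}

kinds = {name: top_level_kind(text) for name, text in documents.items()}

# Documents whose top-level value is not a dict would not fit a class
# that only accepts Python dicts.
non_dicts = [name for name, kind in kinds.items() if kind != "dict"]
```

Only the "mapping" document loads as a dict; the sequence and the scalar are equally valid documents but load as a list and a str.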

TODO

@mferrera mferrera added the Data definitions Issues related to data definitions label Oct 1, 2024
@mferrera mferrera changed the title Add support for yaml files Add support for structured data files Oct 2, 2024
@mferrera mferrera changed the title Add support for structured data files Add support for generic structured data files (json, yaml, toml) Oct 2, 2024
Collaborator Author

mferrera commented Oct 3, 2024

A generic solution could be "serializable", that is, any Python object that can be serialized to json. This includes dictionaries.
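
One way to read "serializable" is "json.dumps accepts it". A rough sketch of such a check follows. The function name is_json_serializable is hypothetical, and whether fmu-dataio would define the class this way is an assumption.

```python
import json
from typing import Any


def is_json_serializable(obj: Any) -> bool:
    """Return True if obj can be serialized to strict JSON.

    Hypothetical helper for illustration only; not part of fmu-dataio.
    """
    try:
        # allow_nan=False rejects NaN and Infinity, which are not valid JSON.
        json.dumps(obj, allow_nan=False)
    except (TypeError, ValueError, RecursionError):
        # TypeError: unsupported type (e.g. set, arbitrary object).
        # ValueError: circular reference or out-of-range float.
        # RecursionError: nesting too deep to serialize.
        return False
    return True


accepted = [
    {"a": [1, 2.5, None, True]},       # dict
    ["Hesperiidae", "Papilionidae"],   # list, as in the yaml example
    "a plain string",                  # scalar
]
rejected = [
    {1, 2},                # sets are not JSON-serializable
    {"x": object()},       # arbitrary objects are not either
    float("nan"),          # rejected because allow_nan=False
]
```

Under this definition dictionaries, lists and scalars all qualify, so the existing dictionary case would be a subset of the new one.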
