Remove automated sorting of data #812

danielhuppmann · 2024-02-14T10:29:11Z

The pyam package currently automatically sorts the _data series and meta dataframe by their index. This helps with consistency, simplifies assert-frame-equal comparisons, and supports some operations like interpolation. But it can have unintended consequences in cases where ordering is forgotten, e.g. #811

Also, the repeated ordering is probably not very resource-efficient for large IamDataFrame instances.

For pyam 3.0, I suggest dropping the automated ordering on initialization and in rename/aggregation/etc. methods, and instead providing a sort() method that can be called explicitly. We could also add a kwarg to all relevant methods to control whether to sort, but that may not be worth it on the effort-vs.-benefit trade-off.
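
To make the proposal concrete, here is a minimal sketch in plain pandas, not pyam code. The series below is only a hypothetical stand-in for the internal _data series, and the index values are made up for illustration. It shows data kept in insertion order until a sort is requested explicitly.

```python
import pandas as pd

# Hypothetical stand-in for the internal `_data` series (illustrative values only)
index = pd.MultiIndex.from_tuples(
    [
        ("model_a", "scen_b", 2010),
        ("model_a", "scen_a", 2005),
        ("model_a", "scen_a", 2010),
    ],
    names=["model", "scenario", "year"],
)
data = pd.Series([3.0, 1.0, 2.0], index=index)

# Without automated sorting, the insertion order is preserved
first_row_unsorted = data.index[0]

# Sorting becomes an explicit, opt-in step
sorted_data = data.sort_index()
first_row_sorted = sorted_data.index[0]
```

The explicit call is where the cost of sorting is paid, so users who never need ordered data would not pay it at all.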

@phackstock @gidden @znicholls, any thoughts?

phackstock · 2024-02-14T15:23:11Z

I like the idea of making sorting optional. I cannot really think of a use case off the top of my head where I care about or depend on the order of data.
For assert-frame-equal we would then also introduce a keyword argument that switches whether or not order is considered when checking for equality.
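
A rough sketch of what such a switch could look like, using pandas only. The helper name `assert_equal` and its `check_order` argument are hypothetical and not part of pyam; the sketch relies on the existing `check_like` option of `pandas.testing.assert_frame_equal`, which ignores the order of index and columns.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def assert_equal(left, right, check_order=True):
    """Hypothetical helper: compare two frames, optionally ignoring row/column order."""
    # check_like=True makes pandas ignore the order of index and columns
    assert_frame_equal(left, right, check_like=not check_order)


a = pd.DataFrame({"value": [1.0, 2.0]}, index=["scen_a", "scen_b"])
b = a.loc[["scen_b", "scen_a"]]  # same data, different row order

# Order-insensitive comparison passes
assert_equal(a, b, check_order=False)

# Order-sensitive comparison raises an AssertionError
try:
    assert_equal(a, b)
    order_sensitive_passed = True
except AssertionError:
    order_sensitive_passed = False
```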

danielhuppmann · 2024-02-21T14:28:33Z

Reminder: not sorting the time column may cause confusion when working with the wide timeseries format (e.g., when writing to xlsx)
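
As an illustration of the concern, with plain pandas and made-up values (how pyam builds its wide table is not shown here): a wide table whose year columns are out of order, and an explicit column sort applied before any export.

```python
import pandas as pd

# Hypothetical wide-format table with year columns in insertion order
wide = pd.DataFrame(
    {2010: [2.0, 4.0], 2005: [1.0, 3.0]},
    index=pd.Index(["scen_a", "scen_b"], name="scenario"),
)
columns_before = list(wide.columns)

# Sorting the time columns explicitly restores chronological order
wide_sorted = wide.sort_index(axis=1)
columns_after = list(wide_sorted.columns)
```

If the automated sorting is dropped, a step like this may still be worth keeping for the time axis when producing wide output.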

danielhuppmann added this to the Release 3.0 milestone Aug 7, 2024
