Efficient representation of sparse data #17

Closed
iimog opened this issue Oct 6, 2016 · 1 comment

Comments

@iimog
Member

iimog commented Oct 6, 2016

As the referees of our F1000Research article note:

The in memory representation of the data following parse by BioJS are either in a dense matrix, or in a dict of keys style sparse representation. As the authors note, specialized methods will need to be created to handle large data efficiently, however the authors may wish to consider placing emphasis instead on specialized data structures such as compressed sparse row or column.

McDonald D and Bolyen E. Referee Report For: biojs-io-biom, a BioJS component for handling data in Biological Observation Matrix (BIOM) format [version 1; referees: 1 approved with reservations]. F1000Research 2016, 5:2348 (doi: 10.5256/f1000research.10362.r16546)

This is a very good point. Right now we only use the original sparse or dense representation as defined for BIOM version 1.0 JSON. Depending on the input data, however, a lot of memory could be saved by storing the biom object internally in a specialized data structure on parse. It could then be transformed back to the JSON representation when write is called.
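A minimal sketch of the idea, assuming the BIOM 1.0 sparse layout where `data` is a list of `[row, column, value]` triplets and `shape` is `[rows, columns]`. The helper names `tripletsToCsr` and `csrToTriplets` are hypothetical and not part of the current biojs-io-biom API. The sketch converts the triplets to compressed sparse row (CSR) arrays on parse and back to triplets for writing:

```javascript
// Hypothetical helpers for illustration only; not part of biojs-io-biom.

// Convert BIOM 1.0 sparse triplets ([row, col, value]) into CSR typed arrays.
function tripletsToCsr(data, shape) {
  const nRows = shape[0];
  const rowPtr = new Uint32Array(nRows + 1);
  // Count the entries of each row, stored one slot to the right.
  for (const [r] of data) rowPtr[r + 1]++;
  // Prefix sum: rowPtr[r] becomes the offset of the first entry of row r.
  for (let r = 0; r < nRows; r++) rowPtr[r + 1] += rowPtr[r];

  const colIdx = new Uint32Array(data.length);
  const values = new Float64Array(data.length);
  const next = rowPtr.slice(0, nRows); // next free slot for each row
  for (const [r, c, v] of data) {
    const pos = next[r]++;
    colIdx[pos] = c;
    values[pos] = v;
  }
  return { shape, rowPtr, colIdx, values };
}

// Convert CSR arrays back to BIOM 1.0 sparse triplets (row-major order).
function csrToTriplets({ shape, rowPtr, colIdx, values }) {
  const data = [];
  for (let r = 0; r < shape[0]; r++) {
    for (let k = rowPtr[r]; k < rowPtr[r + 1]; k++) {
      data.push([r, colIdx[k], values[k]]);
    }
  }
  return data;
}

// Example: a 3 x 3 table with four non-zero counts.
const exampleCsr = tripletsToCsr([[0, 2, 5], [1, 0, 3], [1, 2, 1], [2, 1, 7]], [3, 3]);
console.log(Array.from(exampleCsr.rowPtr)); // [0, 1, 3, 4]
console.log(csrToTriplets(exampleCsr));
```

Within a row, entries keep their input order, so column indices are only sorted if the input triplets were. Whether typed arrays actually save memory over the nested JSON arrays depends on the data and the JavaScript engine, so this would need to be measured.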

iimog added this to the v1.1.0 milestone Oct 6, 2016
@iimog
Member Author

iimog commented Dec 13, 2016

This is now tracked in the dedicated issue #35 and is planned for the future.
