Hi there! I'm looking into utilizing FFCV for genomics applications. In the process, I tried using the BytesField with a simple dataset to familiarize myself with its behavior. Am I using the API incorrectly?
For more context, I'm hoping to rapidly process DNA sequences with FFCV. To dramatically reduce the on-disk footprint, I want to store variable-length genotypes with FFCV; these are sufficient to reconstruct the much larger DNA sequences on the fly. In this setting, each instance from the dataset passed to FFCV would have two fields whose final (length) dimension varies across instances:
"genotypes": shape = (2, length) dtype = int8
"positions": shape = (length,) dtype = uintp
I'm hoping I can do this by implementing a dataset that views the data as uint8 and ravels it, and then adding a transform that decodes the data back to the intended shape and dtype. The same transform could also reconstruct the DNA sequences, which have uniform length across instances. Is this possible with FFCV? I would appreciate any recommendations, thank you!
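To make the idea concrete, here is a minimal sketch of the encode/decode round trip using NumPy only; it does not touch FFCV itself. The helper names `encode_instance` and `decode_instance` are hypothetical, and the sketch assumes the decode step receives exactly the bytes that were written, with no padding.

```python
import numpy as np


def encode_instance(genotypes, positions):
    """Flatten both fields into 1-D uint8 arrays (hypothetical helper)."""
    # Force the intended dtypes and a C-contiguous layout before reinterpreting.
    geno = np.ascontiguousarray(genotypes, dtype=np.int8)
    pos = np.ascontiguousarray(positions, dtype=np.uintp)
    # view() reinterprets the same memory as bytes; ravel() flattens to 1-D.
    return geno.view(np.uint8).ravel(), pos.view(np.uint8).ravel()


def decode_instance(geno_bytes, pos_bytes):
    """Recover the (2, length) int8 and (length,) uintp arrays (hypothetical helper)."""
    # Assumes the byte arrays contain no padding beyond the original data.
    genotypes = geno_bytes.view(np.int8).reshape(2, -1)
    positions = pos_bytes.view(np.uintp)
    return genotypes, positions


# Example instance with length = 3.
genotypes = np.array([[0, 1, -1], [1, 0, 1]], dtype=np.int8)
positions = np.array([10, 250, 4096], dtype=np.uintp)

geno_bytes, pos_bytes = encode_instance(genotypes, positions)
decoded_genotypes, decoded_positions = decode_instance(geno_bytes, pos_bytes)
```

Note that `uintp` has a platform-dependent width, so a fixed-width dtype such as `uint64` may be safer if the file is written on one machine and read on another.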
pip list | grep ffcv
ffcv 1.0.2
MRE
Expected
Actual