Replies: 4 comments
-
It is unclear how you'd do that. If you were to implement it, I would recommend writing a custom sampler for that purpose.
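As a rough illustration of what such a custom sampler could look like for a sequential stream, here is a minimal sketch. It is not the trainer's actual sampler: the names `aspect_bucket` and `bucketed_batches`, the `(key, width, height)` sample layout, and the rounding precision are all assumptions made for the example. The idea is to keep one pending list per aspect ratio and emit a batch as soon as any list fills, so the stream never needs random access.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical sample layout: (key, width, height).
Sample = Tuple[str, int, int]


def aspect_bucket(width: int, height: int, precision: int = 1) -> float:
    """Round the aspect ratio so that similar shapes share a bucket."""
    return round(width / height, precision)


def bucketed_batches(
    stream: Iterable[Sample], batch_size: int
) -> Iterator[List[Sample]]:
    """Yield batches whose samples all fall in the same aspect bucket.

    The stream is consumed strictly in order; samples wait in a
    per-bucket list until that bucket holds `batch_size` items.
    """
    pending: Dict[float, List[Sample]] = defaultdict(list)
    for sample in stream:
        _, width, height = sample
        bucket = pending[aspect_bucket(width, height)]
        bucket.append(sample)
        if len(bucket) == batch_size:
            yield list(bucket)
            bucket.clear()
    # Partial batches left in `pending` are dropped in this sketch.


stream = [
    ("a", 512, 512),
    ("b", 768, 512),
    ("c", 1024, 1024),
    ("d", 1536, 1024),
    ("e", 640, 480),
]
batches = list(bucketed_batches(stream, batch_size=2))
```

With a very large dataset, the pending lists for rare aspect ratios can grow slowly and hold samples for a long time, so a real implementation would probably want a cap or a flush policy.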
-
In refactor/multiaspect-dataset-loader I am working on a new data-loading framework that allows more modularity. You could in theory create a module that extends these classes, and add command-line switches to enable it at runtime.
-
In addition to being able to plug different samplers into the trainer, there are now data backends, such as Local or S3, which provide an abstraction layer and allow the trainer to treat different storage media as the same thing. If you are interested, that could be used to write a webdataset backend that uses a CSV file as its 'filesystem' and retrieves files using
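To make the backend idea concrete, here is a small sketch of what a shared interface with a CSV-indexed backend might look like. None of these names come from the trainer: `DataBackend`, `LocalBackend`, `CsvIndexBackend`, the `name`/`location` column names, and the `fetch` callable are all hypothetical. The retrieval mechanism is deliberately left as an injected callable, since how the files are actually fetched is not specified here.

```python
import csv
import io
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Callable, Dict, List


class DataBackend(ABC):
    """Hypothetical interface: the trainer only sees names and bytes."""

    @abstractmethod
    def list_files(self) -> List[str]:
        ...

    @abstractmethod
    def read(self, name: str) -> bytes:
        ...


class LocalBackend(DataBackend):
    """Serves files from a directory on local disk."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)

    def list_files(self) -> List[str]:
        return sorted(
            p.relative_to(self.root).as_posix()
            for p in self.root.rglob("*")
            if p.is_file()
        )

    def read(self, name: str) -> bytes:
        return (self.root / name).read_bytes()


class CsvIndexBackend(DataBackend):
    """Uses a CSV index as its 'filesystem'.

    Each row maps a file name to a location; `fetch` (an assumption)
    turns a location into the file's bytes.
    """

    def __init__(self, csv_text: str, fetch: Callable[[str], bytes]) -> None:
        reader = csv.DictReader(io.StringIO(csv_text))
        self._index: Dict[str, str] = {
            row["name"]: row["location"] for row in reader
        }
        self._fetch = fetch

    def list_files(self) -> List[str]:
        return sorted(self._index)

    def read(self, name: str) -> bytes:
        return self._fetch(self._index[name])


# Usage with a stand-in fetcher that just echoes the location.
index = "name,location\na.png,loc-1\nb.png,loc-2\n"
backend = CsvIndexBackend(index, fetch=lambda location: location.encode())
names = backend.list_files()
payload = backend.read("b.png")
```

Because both classes expose the same two methods, code that consumes a `DataBackend` does not need to know which storage medium sits behind it.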
-
This is getting closer. It occurred to me that moving all of the bucket read/write methods out of the aspect sampler and into the BucketManager would help with this, because you'd then be able to plug a webdataset bucket manager into the sampler.
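A minimal sketch of that separation, under stated assumptions: the `read_buckets`/`write_buckets` method names, the in-memory manager, and the `AspectSampler` shown here are invented for illustration and do not reflect the real BucketManager or sampler. The point is only that the sampler talks to bucket storage through a narrow interface, so a different manager could be swapped in without touching the sampling logic.

```python
import random
from typing import Dict, List, Optional, Protocol

Buckets = Dict[str, List[str]]  # aspect ratio label -> sample keys


class BucketManager(Protocol):
    """Hypothetical interface: all bucket reads and writes live here."""

    def read_buckets(self) -> Buckets:
        ...

    def write_buckets(self, buckets: Buckets) -> None:
        ...


class InMemoryBucketManager:
    """Stand-in manager; another storage format would implement the
    same two methods."""

    def __init__(self, buckets: Buckets) -> None:
        self._buckets = {name: list(keys) for name, keys in buckets.items()}

    def read_buckets(self) -> Buckets:
        return {name: list(keys) for name, keys in self._buckets.items()}

    def write_buckets(self, buckets: Buckets) -> None:
        self._buckets = {name: list(keys) for name, keys in buckets.items()}


class AspectSampler:
    """Draws single-bucket batches without knowing how buckets are stored."""

    def __init__(self, manager: BucketManager, batch_size: int, seed: int = 0):
        self.manager = manager
        self.batch_size = batch_size
        self.rng = random.Random(seed)

    def next_batch(self) -> Optional[List[str]]:
        buckets = self.manager.read_buckets()
        ready = sorted(
            name for name, keys in buckets.items()
            if len(keys) >= self.batch_size
        )
        if not ready:
            return None  # no bucket can fill a whole batch
        name = self.rng.choice(ready)
        batch = buckets[name][: self.batch_size]
        buckets[name] = buckets[name][self.batch_size:]
        self.manager.write_buckets(buckets)
        return batch


manager = InMemoryBucketManager({"1.0": ["a", "b", "c"], "1.5": ["d"]})
sampler = AspectSampler(manager, batch_size=2)
first = sampler.next_batch()
second = sampler.next_batch()
```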
-
Do you have a solution for the case where I have a huge dataset stored sequentially in webdataset format and still want to train with aspect ratio bucketing?