Clarifications in docs for data loading #58

Open
cgoliver opened this issue Aug 26, 2022 · 0 comments

The Quickstart and the data handling page both include the following example, which does not work as-is.

>>> import atom3d.datasets.datasets as da
>>> da.download_dataset('lba', PATH_TO_DATASET) # Download LBA dataset
>>> import atom3d.datasets as da
>>> dataset = da.load_dataset(PATH_TO_DATASET, 'lmdb') # Load LMDB format dataset
>>> print(len(dataset))  # Print length
>>> print(dataset[0].keys()) # Print keys stored in first structure

Some notes:

  1. The variable PATH_TO_DATASET cannot have the same value in the download_dataset and load_dataset calls, since the latter requires a path to a subfolder of PATH_TO_DATASET.
  2. The docs use SPLIT_NAME without giving a sample value, so the user has to guess the valid options.
  3. I would either set the variables to concrete values so that the user can copy and paste the example directly, or state what values the user should provide.
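
Until the docs name the subfolder, one possible workaround is to search the download directory for it. The sketch below is only that: `find_lmdb_dirs` is a hypothetical helper, not part of atom3d, and it assumes the dataset is stored as a standard LMDB directory, which contains a `data.mdb` file. I have not confirmed the actual layout of the downloaded archive.

```python
from pathlib import Path


def find_lmdb_dirs(root):
    """Return every directory under root that directly contains a data.mdb file.

    Assumption: each dataset is an LMDB directory (data.mdb inside it).
    """
    root = Path(root)
    # A set removes duplicates; sorting makes the result order deterministic.
    return sorted({p.parent for p in root.rglob("data.mdb")})


# Hypothetical usage (requires atom3d; './foo' is the value used in this report):
# import atom3d.datasets as da
# for lmdb_dir in find_lmdb_dirs('./foo'):
#     print(lmdb_dir)  # candidate paths to try with da.load_dataset(path, 'lmdb')
```

If this prints nothing, either the download did not complete or the assumption about the on-disk layout is wrong.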

Setting PATH_TO_DATASET='./foo' and running the example as-is results in the following error:

>>> da.download_dataset(
>>> data = da.load_dataset(PATH_TO_DATASET, 'lmdb')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<...>/.venv/lib/python3.9/site-packages/atom3d/datasets/datasets.py", line 426, in load_dataset
    dataset = LMDBDataset(file_list, transform=transform)
  File "<...>/.venv/lib/python3.9/site-packages/atom3d/datasets/datasets.py", line 58, in __init__
    env = lmdb.open(str(self.data_file), max_readers=1, readonly=True,
lmdb.Error: <...>/foo: No such file or directory

The second example that does not work is load_example_dataset, also on the Using datasets page.

This is the snippet from the docs:

>>> from atom3d.data.example import load_example_dataset
>>> dataset = load_example_dataset()

Running this produces the following error:

>>> dataset = load_example_dataset()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<...>/venv/lib/python3.9/site-packages/atom3d/data/example.py", line 23, in load_example_dataset
    dataset = da.load_dataset(str(Path(__file__).parent.absolute()) + '/test_lmdb', 'lmdb')
  File "<...>/venv/lib/python3.9/site-packages/atom3d/datasets/datasets.py", line 426, in load_dataset
    dataset = LMDBDataset(file_list, transform=transform)
  File "<...>/.venv/lib/python3.9/site-packages/atom3d/datasets/datasets.py", line 56, in __init__
    raise FileNotFoundError(self.data_file)
FileNotFoundError: <...>/.venv/lib/python3.9/site-packages/atom3d/data/test_lmdb
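
The traceback suggests that the test_lmdb directory is not present next to example.py in the installed package. A quick way to confirm this is sketched below. `missing_entries` is a hypothetical helper, not an atom3d function, and the expected name `test_lmdb` is taken from the traceback above.

```python
from pathlib import Path


def missing_entries(package_dir, expected=("test_lmdb",)):
    """Return the names in expected that do not exist inside package_dir."""
    package_dir = Path(package_dir)
    return [name for name in expected if not (package_dir / name).exists()]


# Hypothetical usage (requires atom3d to be installed):
# import atom3d.data.example as example
# print(missing_entries(Path(example.__file__).parent))
```

A non-empty result would mean the example data was not shipped with the installed version, rather than there being a problem with the call itself.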

Version

python: 3.9.13
atom3d: 'v0.2.6'
os: macOS 10.15.7
