Clarifications in docs for data loading #58

@cgoliver

The Quickstart and the data handling page both contain the following example, which does not work as-is:

>>> import atom3d.datasets.datasets as da
>>> da.download_dataset('lba', PATH_TO_DATASET) # Download LBA dataset
>>> import atom3d.datasets as da
>>> dataset = da.load_dataset(PATH_TO_DATASET, 'lmdb') # Load LMDB format dataset
>>> print(len(dataset))  # Print length
>>> print(dataset[0].keys()) # Print keys stored in first structure

Some notes:

  1. The variable PATH_TO_DATASET cannot have the same value in the download_dataset and load_dataset calls, because load_dataset requires a path to a subfolder of PATH_TO_DATASET.
  2. The doc uses SPLIT_NAME without providing a sample value, so the user has to guess the valid options.
  3. I would either set concrete values for these variables so that the user can copy and paste the example directly, or state which values the user should supply.

Setting PATH_TO_DATASET='./foo' and running the example as-is results in the following error:

>>> da.download_dataset(
>>> data = da.load_dataset(PATH_TO_DATASET, 'lmdb')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<...>/.venv/lib/python3.9/site-packages/atom3d/datasets/datasets.py", line 426, in load_dataset
    dataset = LMDBDataset(file_list, transform=transform)
  File "<...>/.venv/lib/python3.9/site-packages/atom3d/datasets/datasets.py", line 58, in __init__
    env = lmdb.open(str(self.data_file), max_readers=1, readonly=True,
lmdb.Error: <...>/foo: No such file or directory
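
As a possible workaround until the docs are clarified, the subfolder that load_dataset expects can be searched for instead of guessed. The sketch below relies on the fact that an LMDB environment directory contains a file named data.mdb; the helper name find_lmdb_dirs is hypothetical (it is not part of atom3d), and the layout of the downloaded folder is an assumption, since I have not confirmed it for every dataset.

```python
from pathlib import Path
from typing import List


def find_lmdb_dirs(root) -> List[Path]:
    """Return every directory under `root` that contains an LMDB data file.

    Hypothetical helper, not part of atom3d. It only assumes that an LMDB
    environment is a directory holding a file called `data.mdb`.
    """
    root = Path(root)
    if not root.is_dir():
        return []
    # Each match is the data file itself; its parent is the LMDB directory.
    return sorted(p.parent for p in root.rglob("data.mdb"))


# Intended usage (requires atom3d and a downloaded dataset, so left commented):
# import atom3d.datasets as da
# candidates = find_lmdb_dirs(PATH_TO_DATASET)
# print(candidates)  # inspect which subfolders are available
# dataset = da.load_dataset(str(candidates[0]), 'lmdb')
```

If several directories are returned, they presumably correspond to different splits, which is where a documented list of SPLIT_NAME values would help.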

The second example that does not work is load_example_dataset, also on the Using datasets page.

This is the snippet from the docs:

>>> from atom3d.data.example import load_example_dataset
>>> dataset = load_example_dataset()

Running this produces the following error:

>>> dataset = load_example_dataset()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<...>/venv/lib/python3.9/site-packages/atom3d/data/example.py", line 23, in load_example_dataset
    dataset = da.load_dataset(str(Path(__file__).parent.absolute()) + '/test_lmdb', 'lmdb')
  File "<...>/venv/lib/python3.9/site-packages/atom3d/datasets/datasets.py", line 426, in load_dataset
    dataset = LMDBDataset(file_list, transform=transform)
  File "<...>/.venv/lib/python3.9/site-packages/atom3d/datasets/datasets.py", line 56, in __init__
    raise FileNotFoundError(self.data_file)
FileNotFoundError: <...>/.venv/lib/python3.9/site-packages/atom3d/data/test_lmdb
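
To check whether an installed package actually ships the example data, something like the following can be used. It is a minimal sketch: the function name example_data_dir is hypothetical, and the relative location data/test_lmdb is taken only from the traceback above, not from any documented atom3d layout.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def example_data_dir(package_name: str = "atom3d") -> Optional[Path]:
    """Return <package>/data/test_lmdb if it exists on disk, else None.

    Hypothetical helper; the 'data/test_lmdb' location is an assumption
    based on the path shown in the FileNotFoundError.
    """
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.origin is None:
        # Package is not installed (or has no single origin file).
        return None
    candidate = Path(spec.origin).parent / "data" / "test_lmdb"
    return candidate if candidate.exists() else None


print(example_data_dir())  # None means the example data directory was not found
```

A result of None on a fresh install would suggest that the test data is not included in the distributed package, which would match the error above.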

Version

python: 3.9.13
atom3d: 'v0.2.6'
os: MacOS 10.15.7
