diff --git a/.gitignore b/.gitignore index 0b149e5..62fb123 100644 --- a/.gitignore +++ b/.gitignore @@ -187,7 +187,7 @@ dev_scripts/ /notes/ /my_tests/ /data - +*data/ _version.py @@ -196,4 +196,6 @@ _version.py /papers/ -sphinx/_build/ \ No newline at end of file +sphinx/_build/ + +pyrightconfig.json \ No newline at end of file diff --git a/README.md b/README.md index 02641af..1d96d2d 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,8 @@ # MatGraphDB -**MatGraphDB** is a Python package designed to simplify graph-based data management and analysis in materials and molecular science. It enables researchers to efficiently transform complex theoretical data into structured graph representations, leveraging: +[Documentation][docs] | [PyPI][pypi] | [GitHub][github] + +**MatGraphDB** is a Python package designed to simplify graph-based data management and analysis in materials and molecular science. It is built on top of `ParquetGraphDB` (from [ParquetDB][parquetdb]), a graph database that uses Apache Parquet for storage. It enables researchers to efficiently transform complex theoretical data into structured graph representations, leveraging: - **High-performance storage:** Utilizes Apache Parquet for scalable and rapid data access. - **Automated workflows:** Converts theoretical and computational data into graph structures. @@ -9,16 +11,11 @@ ## Table of Contents - [MatGraphDB](#matgraphdb) - [Table of Contents](#table-of-contents) - - [Documentation](#documentation) - [Installing](#installing) - [Usage](#usage) - [Contributing](#contributing) - [License](#license) - - -## Documentation - -Check out the [docs](https://matgraphdb.readthedocs.io/en/latest/) + - [Authors](#authors) ## Installing @@ -175,11 +172,11 @@ materials = mgdb.delete_materials(ids=[0]) ## Contributing -Contributions are welcome! Please open an issue or submit a pull request on GitHub. 
More information can be found in the [CONTRIBUTING.md](https://github.com/lllangWV/ParquetDB/blob/main/CONTRIBUTING.md) file. +Contributions are welcome! Please open an issue or submit a pull request on GitHub. More information can be found in the [CONTRIBUTING][contributing] file. ## License -This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details. +This project is licensed under the MIT License. See the [LICENSE][license] file for details. ## Authors @@ -194,3 +191,9 @@ Eduardo Hernandez, +[docs]: https://matgraphdb.readthedocs.io/en/latest/ +[pypi]: https://pypi.org/project/matgraphdb/ +[github]: https://github.com/romerogroup/MatGraphDB +[contributing]: https://github.com/romerogroup/MatGraphDB/blob/main/CONTRIBUTING.md +[license]: https://github.com/romerogroup/MatGraphDB/blob/main/LICENSE +[parquetdb]: https://github.com/lllangWV/ParquetDB \ No newline at end of file diff --git a/docs/source/CONTRIBUTING.rst b/docs/source/CONTRIBUTING.rst deleted file mode 100644 index c978d57..0000000 --- a/docs/source/CONTRIBUTING.rst +++ /dev/null @@ -1,89 +0,0 @@ -Contributing -================================== - -We welcome contributions and we hope that this guide will -facilitate an understanding of the ParquetDB code repository. It is -important to note that the ParquetDB software package is maintained on a -volunteer basis and thus we need to foster a community that can support -user questions and develop new features to make this software a useful -tool for all users. - -This page is dedicated to outline where you should start with your -question, concern, feature request, or desire to contribute. - -Being Respectful ------------------------------------ - -Please demonstrate empathy and kindness toward other people, other software, -and the communities who have worked diligently to build (un-)related tools. 
- -Please do not talk down in Pull Requests, Issues, or otherwise in a way that -portrays other people or their works in a negative light. - -Cloning the Source Repository ------------------------------------ - -You can clone the source repository from -``_ and install the latest version by -running: - -.. code:: bash - - git clone https://github.com/romerogroup/MatGraphDB.git - cd MatGraphDB - -Next, create a virtual envrionment and activate it. - -.. code:: bash - - conda create -n matgraphdb python==3.10 - conda activate matgraphdb - pip install -e .[docs] - -Change to the dev branch to add changes - -.. code:: bash - - git checkout dev - -Updating documentation ------------------------------------ - -The documentation for MatGraphDB are generated by using sphinx and the sphinx-gallery packages. -If you add code to the package, make sure to add the proper doc strings to be automatically generated. - -To generate the documentation you will need to run the following code from the top-level directory: - -.. code:: bash - - cd sphinx - make clean & make html - -This will clean the sphinx/_build directory and it will remove all aut-generated docs. -Once make html is called it will start generating the html files and store them in sphinx/_build. -After you have check the documentation and make sure there are no warnings or errors, -you will need to copy the contents of sphinx/_build/html/ to docs and save over -everything in that directory. This can be achieved by running the below code: - -.. code:: bash - - make deploy - - -Finally, you can push the changes to github. - -Running tests ------------------------------------ - -In the current version of MatGraphDB, we have added to external tests to test the functionality of the package. -To do this you will need to download the development data by running the following code. - -.. 
code:: python - - python tests/test_matgraphdb.py - python tests/test_types.py - -This will download the development data to MatGraphDB/data/examples. - -Now to run the tests, from the top directory run pytest - diff --git a/docs/source/_static/notebook.css b/docs/source/_static/notebook.css new file mode 100644 index 0000000..b90a126 --- /dev/null +++ b/docs/source/_static/notebook.css @@ -0,0 +1,6 @@ +/* cap all output at ~15 lines, then scroll */ +/* targets nbsphinx output areas too */ +.output_area pre { + max-height: calc(1.2em * 15) !important; + overflow-y: auto !important; + } \ No newline at end of file diff --git a/docs/source/conf.py b/docs/source/conf.py index 52388c6..95c9ea7 100644 --- a/docs/source/conf.py +++ b/docs/source/conf.py @@ -14,6 +14,7 @@ import shutil import sys from distutils.sysconfig import get_python_lib +from pathlib import Path from matgraphdb._version import version @@ -25,17 +26,25 @@ sys.path.insert(0, os.path.abspath(".")) -repo_root = os.path.abspath(os.path.join(os.path.dirname(__file__), "../..")) -src_examples_path = os.path.abspath(os.path.join(os.path.dirname(__file__), "examples")) -if os.path.exists(src_examples_path): - shutil.rmtree(src_examples_path) +SRC_DIR = Path(__file__).parent +REPO_ROOT = SRC_DIR.parent.parent +SRC_EXAMPLES_PATH = SRC_DIR / "examples" +REPO_EXAMPLES_PATH = REPO_ROOT / "examples" +CONTRIBUTING_PATH = REPO_ROOT / "CONTRIBUTING.md" -shutil.copytree(os.path.join(repo_root, "examples"), src_examples_path) -print(repo_root) -print(src_examples_path) -# examples_path = os.path.join(repo_root, "examples") -# sys.path.insert(0, examples_path) +print(f"REPO_ROOT: {REPO_ROOT}") +print(f"SRC_DIR: {SRC_DIR}") +print(f"SRC_EXAMPLES_PATH: {SRC_EXAMPLES_PATH}") + + +# Copy Repo Examples to docs source directory +if SRC_EXAMPLES_PATH.exists(): + shutil.rmtree(SRC_EXAMPLES_PATH) +shutil.copytree(REPO_EXAMPLES_PATH, SRC_EXAMPLES_PATH) + +shutil.copy(CONTRIBUTING_PATH, SRC_DIR / "CONTRIBUTING.md") + if 
os.environ.get("READTHEDOCS") == "True": @@ -88,6 +97,7 @@ "numpydoc", "sphinx.ext.autodoc", "sphinx.ext.autosummary", + "myst_parser", # "sphinx-nbexamples", # "sphinx_gallery.gen_gallery", # 'sphinx.youtube', @@ -96,6 +106,11 @@ nbsphinx_allow_errors = True pygments_style = "sphinx" +source_suffix = { + '.rst': 'restructuredtext', + '.txt': 'markdown', + '.md': 'markdown', +} # sphinx_gallery_conf = { # # convert rst to md for ipynb @@ -175,7 +190,8 @@ "logo_link": "index.html", # Specify the link for the logo if needed } -html_css_files = ["css/custom.css"] +html_css_files = ["css/custom.css", "notebook.css"] + html_js_files = ["js/custom.js"] diff --git a/docs/source/index.rst b/docs/source/index.rst index ef8d1de..4d74a1e 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -89,7 +89,7 @@ Index :glob: - 01_tutorials/index + examples/01_tutorials/index 02_internal/index examples/index.rst 03_api/index diff --git a/docs/source/01_tutorials/01 - Getting Started.ipynb b/examples/01_tutorials/01 - Getting Started.ipynb similarity index 100% rename from docs/source/01_tutorials/01 - Getting Started.ipynb rename to examples/01_tutorials/01 - Getting Started.ipynb diff --git a/docs/source/01_tutorials/02 - Managing Graphs in MatGraphDB.ipynb b/examples/01_tutorials/02 - Managing Graphs in MatGraphDB.ipynb similarity index 100% rename from docs/source/01_tutorials/02 - Managing Graphs in MatGraphDB.ipynb rename to examples/01_tutorials/02 - Managing Graphs in MatGraphDB.ipynb diff --git a/docs/source/01_tutorials/03 - Graph Generators in MatgraphDB.ipynb b/examples/01_tutorials/03 - Graph Generators in MatgraphDB.ipynb similarity index 100% rename from docs/source/01_tutorials/03 - Graph Generators in MatgraphDB.ipynb rename to examples/01_tutorials/03 - Graph Generators in MatgraphDB.ipynb diff --git a/docs/source/01_tutorials/index.rst b/examples/01_tutorials/index.rst similarity index 100% rename from docs/source/01_tutorials/index.rst rename to 
examples/01_tutorials/index.rst diff --git a/examples/notebooks/01 - Getting Started.ipynb b/examples/02_applications/01 - MPNearHull Dataset.ipynb similarity index 77% rename from examples/notebooks/01 - Getting Started.ipynb rename to examples/02_applications/01 - MPNearHull Dataset.ipynb index 695bd4c..8197d5f 100644 --- a/examples/notebooks/01 - Getting Started.ipynb +++ b/examples/02_applications/01 - MPNearHull Dataset.ipynb @@ -29,43 +29,15 @@ }, { "cell_type": "code", - "execution_count": 3, + "execution_count": 1, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Library imports and paths are set.\n" - ] - } - ], + "outputs": [], "source": [ "import os\n", - "import shutil\n", - "import zipfile\n", - "import gdown\n", - "\n", - "# Get the data directory from the config. You can change this to your own data directory.\n", - "DATA_DIR = os.path.join(\"..\",\"..\",\"data\",\"examples\",\"01\")\n", - "# Define the path to store the raw materials data.\n", - "MATERIALS_PATH = os.path.join(DATA_DIR, \"material\")\n", - "\n", - "MATGRAPHDB_PATH = os.path.join(DATA_DIR, \"MatGraphDB\")\n", - "\n", - "# Define the dataset URLs.\n", - "DATASET_URL = \"https://drive.google.com/uc?id=1zSmEQbV8pNvjWdhFuCwOeoOzvfoS5XKP\"\n", - "\n", - "# Define the URL for the raw materials data.\n", - "RAW_DATASET_URL = \"https://drive.google.com/uc?id=14guJqEK242XgRGEZA-zIrWyg4b-gX5zk\" # (Not used below but available)\n", + "from pathlib import Path\n", "\n", - "# # Define the path to store the raw materials data.\n", - "# RAW_DATASET_ZIP = os.path.join(config.data_dir, \"raw\", \"MPNearHull_v0.0.1_raw.zip\")\n", - "\n", - "# # Define the path to store the dataset.\n", - "# DATASET_ZIP = os.path.join(config.data_dir, \"datasets\", \"MPNearHull_v0.0.1.zip\")\n", - "\n", - "print(\"Library imports and paths are set.\")\n" + "FILE_DIR = Path(\".\")\n", + "DATA_DIR = FILE_DIR / \"data\"" ] }, { @@ -77,71 +49,94 @@ }, { "cell_type": "code", - 
"execution_count": null, + "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "Downloading raw materials data...\n" + "[INFO] 2025-05-11 10:21:41 - parquetdb.utils.config[37][load_config] - Config file: C:\\Users\\lllang\\AppData\\Local\\parquetdb\\parquetdb\\config.yml\n", + "[INFO] 2025-05-11 10:21:41 - parquetdb.utils.config[41][load_config] - Setting data_dir to C:\\Users\\lllang\\Desktop\\Current_Projects\\MatGraphDB\\data\n" ] }, { - "name": "stderr", - "output_type": "stream", - "text": [ - "Downloading...\n", - "From (original): https://drive.google.com/uc?id=1zSmEQbV8pNvjWdhFuCwOeoOzvfoS5XKP\n", - "From (redirected): https://drive.google.com/uc?id=1zSmEQbV8pNvjWdhFuCwOeoOzvfoS5XKP&confirm=t&uuid=5bcba796-ff8e-4bb3-bc09-39d3f1136dc1\n", - "To: c:\\Users\\lllang\\Desktop\\Current_Projects\\MatGraphDB\\examples\\notebooks\\materials\\MPNearHull_v0.0.1_raw.zip\n", - "100%|██████████| 632M/632M [00:11<00:00, 53.6MB/s] \n" - ] + "data": { + "application/vnd.jupyter.widget-view+json": { + "model_id": "85ff371ca9b0486cb3f75eeea12ca534", + "version_major": 2, + "version_minor": 0 + }, + "text/plain": [ + "Fetching 4 files: 0%| | 0/4 [00:00 4\u001b[0m mpdb \u001b[38;5;241m=\u001b[39m \u001b[43mMPNearHull\u001b[49m\u001b[43m(\u001b[49m\u001b[43mstorage_path\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mDB_PATH\u001b[49m\u001b[43m,\u001b[49m\u001b[43minitialize_from_scratch\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mFalse\u001b[39;49;00m\u001b[43m)\u001b[49m\n", + "File \u001b[1;32m~\\Desktop\\Current_Projects\\MatGraphDB\\matgraphdb\\datasets\\mp_near_hull.py:45\u001b[0m, in \u001b[0;36mMPNearHull.__init__\u001b[1;34m(self, storage_path, download, from_scratch, initialize_from_scratch)\u001b[0m\n\u001b[0;32m 38\u001b[0m logger\u001b[38;5;241m.\u001b[39minfo(\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mDownloading dataset from 
\u001b[39m\u001b[38;5;132;01m{\u001b[39;00m\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mrepo_id\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m)\n\u001b[0;32m 39\u001b[0m snapshot_download(\n\u001b[0;32m 40\u001b[0m repo_id\u001b[38;5;241m=\u001b[39m\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mrepo_id,\n\u001b[0;32m 41\u001b[0m repo_type\u001b[38;5;241m=\u001b[39m\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mrepo_type,\n\u001b[0;32m 42\u001b[0m local_dir\u001b[38;5;241m=\u001b[39mstorage_path,\n\u001b[0;32m 43\u001b[0m )\n\u001b[1;32m---> 45\u001b[0m \u001b[38;5;28;43msuper\u001b[39;49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[38;5;21;43m__init__\u001b[39;49m\u001b[43m(\u001b[49m\u001b[43mstorage_path\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mstorage_path\u001b[49m\u001b[43m)\u001b[49m\n\u001b[0;32m 47\u001b[0m n_edge_generators \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mlen\u001b[39m(\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39medge_generator_store\u001b[38;5;241m.\u001b[39mgenerator_names)\n\u001b[0;32m 48\u001b[0m n_node_generators \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mlen\u001b[39m(\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mnode_generator_store\u001b[38;5;241m.\u001b[39mgenerator_names)\n", + "File \u001b[1;32m~\\Desktop\\Current_Projects\\MatGraphDB\\matgraphdb\\core\\matgraphdb.py:39\u001b[0m, in \u001b[0;36mMatGraphDB.__init__\u001b[1;34m(self, storage_path, materials_store, load_custom_stores, **kwargs)\u001b[0m\n\u001b[0;32m 28\u001b[0m \u001b[38;5;250m\u001b[39m\u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[0;32m 29\u001b[0m \u001b[38;5;124;03mParameters\u001b[39;00m\n\u001b[0;32m 30\u001b[0m \u001b[38;5;124;03m----------\u001b[39;00m\n\u001b[1;32m (...)\u001b[0m\n\u001b[0;32m 36\u001b[0m \u001b[38;5;124;03m Whether to load custom stores.\u001b[39;00m\n\u001b[0;32m 37\u001b[0m \u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[0;32m 
38\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mstorage_path \u001b[38;5;241m=\u001b[39m os\u001b[38;5;241m.\u001b[39mpath\u001b[38;5;241m.\u001b[39mabspath(storage_path)\n\u001b[1;32m---> 39\u001b[0m \u001b[38;5;28msuper\u001b[39m()\u001b[38;5;241m.\u001b[39m\u001b[38;5;21m__init__\u001b[39m(\n\u001b[0;32m 40\u001b[0m storage_path\u001b[38;5;241m=\u001b[39m\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mstorage_path,\n\u001b[0;32m 41\u001b[0m load_custom_stores\u001b[38;5;241m=\u001b[39mload_custom_stores,\n\u001b[0;32m 42\u001b[0m \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs,\n\u001b[0;32m 43\u001b[0m )\n\u001b[0;32m 44\u001b[0m logger\u001b[38;5;241m.\u001b[39minfo(\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mInitializing MatGraphDB at: \u001b[39m\u001b[38;5;132;01m{\u001b[39;00m\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mstorage_path\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m)\n\u001b[0;32m 46\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mmaterials_path \u001b[38;5;241m=\u001b[39m os\u001b[38;5;241m.\u001b[39mpath\u001b[38;5;241m.\u001b[39mjoin(\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mnodes_path, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmaterial\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n", + "File \u001b[1;32mc:\\Users\\lllang\\miniconda3\\envs\\matgraphdb\\lib\\site-packages\\parquetdb\\graph\\parquet_graphdb.py:71\u001b[0m, in \u001b[0;36mParquetGraphDB.__init__\u001b[1;34m(self, storage_path, load_custom_stores, verbose)\u001b[0m\n\u001b[0;32m 68\u001b[0m logger\u001b[38;5;241m.\u001b[39mdebug(\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mGraph directory: \u001b[39m\u001b[38;5;132;01m{\u001b[39;00m\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mgraph_path\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m)\n\u001b[0;32m 70\u001b[0m \u001b[38;5;66;03m# Initialize empty dictionaries 
for stores, load existing stores\u001b[39;00m\n\u001b[1;32m---> 71\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mnode_stores \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_load_existing_node_stores\u001b[49m\u001b[43m(\u001b[49m\u001b[43mload_custom_stores\u001b[49m\u001b[43m)\u001b[49m\n\u001b[0;32m 72\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39medge_stores \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_load_existing_edge_stores(load_custom_stores)\n\u001b[0;32m 74\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39medge_generator_store \u001b[38;5;241m=\u001b[39m GeneratorStore(\n\u001b[0;32m 75\u001b[0m storage_path\u001b[38;5;241m=\u001b[39m\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39medge_generators_path, verbose\u001b[38;5;241m=\u001b[39m\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mverbose\n\u001b[0;32m 76\u001b[0m )\n", + "File \u001b[1;32mc:\\Users\\lllang\\miniconda3\\envs\\matgraphdb\\lib\\site-packages\\parquetdb\\graph\\parquet_graphdb.py:182\u001b[0m, in \u001b[0;36mParquetGraphDB._load_existing_node_stores\u001b[1;34m(self, load_custom_stores)\u001b[0m\n\u001b[0;32m 180\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m\u001b[38;5;250m \u001b[39m\u001b[38;5;21m_load_existing_node_stores\u001b[39m(\u001b[38;5;28mself\u001b[39m, load_custom_stores: \u001b[38;5;28mbool\u001b[39m \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mTrue\u001b[39;00m):\n\u001b[0;32m 181\u001b[0m logger\u001b[38;5;241m.\u001b[39minfo(\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mLoading existing node stores\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m--> 182\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_load_existing_stores\u001b[49m\u001b[43m(\u001b[49m\n\u001b[0;32m 183\u001b[0m \u001b[43m 
\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mnodes_path\u001b[49m\u001b[43m,\u001b[49m\n\u001b[0;32m 184\u001b[0m \u001b[43m \u001b[49m\u001b[43mdefault_store_class\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mNodeStore\u001b[49m\u001b[43m,\u001b[49m\n\u001b[0;32m 185\u001b[0m \u001b[43m \u001b[49m\u001b[43mload_custom_stores\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mload_custom_stores\u001b[49m\u001b[43m,\u001b[49m\n\u001b[0;32m 186\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n", + "File \u001b[1;32mc:\\Users\\lllang\\miniconda3\\envs\\matgraphdb\\lib\\site-packages\\parquetdb\\graph\\parquet_graphdb.py:216\u001b[0m, in \u001b[0;36mParquetGraphDB._load_existing_stores\u001b[1;34m(self, stores_path, default_store_class, load_custom_stores)\u001b[0m\n\u001b[0;32m 214\u001b[0m store_path \u001b[38;5;241m=\u001b[39m os\u001b[38;5;241m.\u001b[39mpath\u001b[38;5;241m.\u001b[39mjoin(stores_path, store_type)\n\u001b[0;32m 215\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m os\u001b[38;5;241m.\u001b[39mpath\u001b[38;5;241m.\u001b[39misdir(store_path):\n\u001b[1;32m--> 216\u001b[0m store_dict[store_type] \u001b[38;5;241m=\u001b[39m \u001b[43mload_store\u001b[49m\u001b[43m(\u001b[49m\n\u001b[0;32m 217\u001b[0m \u001b[43m \u001b[49m\u001b[43mstore_path\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mdefault_store_class\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mverbose\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mverbose\u001b[49m\n\u001b[0;32m 218\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[0;32m 219\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[0;32m 220\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mValueError\u001b[39;00m(\n\u001b[0;32m 221\u001b[0m \u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mStore path 
\u001b[39m\u001b[38;5;132;01m{\u001b[39;00mstore_path\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m is not a directory. Likely does not exist.\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[0;32m 222\u001b[0m )\n", + "File \u001b[1;32mc:\\Users\\lllang\\miniconda3\\envs\\matgraphdb\\lib\\site-packages\\parquetdb\\graph\\parquet_graphdb.py:950\u001b[0m, in \u001b[0;36mload_store\u001b[1;34m(store_path, default_store_class, verbose)\u001b[0m\n\u001b[0;32m 948\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m class_module \u001b[38;5;129;01mand\u001b[39;00m class_name \u001b[38;5;129;01mand\u001b[39;00m default_store_class \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[0;32m 949\u001b[0m logger\u001b[38;5;241m.\u001b[39mdebug(\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mImporting class from module: \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mclass_module\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m--> 950\u001b[0m module \u001b[38;5;241m=\u001b[39m \u001b[43mimportlib\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mimport_module\u001b[49m\u001b[43m(\u001b[49m\u001b[43mclass_module\u001b[49m\u001b[43m)\u001b[49m\n\u001b[0;32m 951\u001b[0m class_obj \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mgetattr\u001b[39m(module, class_name)\n\u001b[0;32m 952\u001b[0m store \u001b[38;5;241m=\u001b[39m class_obj(storage_path\u001b[38;5;241m=\u001b[39mstore_path)\n", + "File \u001b[1;32mc:\\Users\\lllang\\miniconda3\\envs\\matgraphdb\\lib\\importlib\\__init__.py:126\u001b[0m, in \u001b[0;36mimport_module\u001b[1;34m(name, package)\u001b[0m\n\u001b[0;32m 124\u001b[0m \u001b[38;5;28;01mbreak\u001b[39;00m\n\u001b[0;32m 125\u001b[0m level \u001b[38;5;241m+\u001b[39m\u001b[38;5;241m=\u001b[39m \u001b[38;5;241m1\u001b[39m\n\u001b[1;32m--> 126\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m 
\u001b[43m_bootstrap\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_gcd_import\u001b[49m\u001b[43m(\u001b[49m\u001b[43mname\u001b[49m\u001b[43m[\u001b[49m\u001b[43mlevel\u001b[49m\u001b[43m:\u001b[49m\u001b[43m]\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mpackage\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mlevel\u001b[49m\u001b[43m)\u001b[49m\n", + "File \u001b[1;32m:1050\u001b[0m, in \u001b[0;36m_gcd_import\u001b[1;34m(name, package, level)\u001b[0m\n", + "File \u001b[1;32m:1027\u001b[0m, in \u001b[0;36m_find_and_load\u001b[1;34m(name, import_)\u001b[0m\n", + "File \u001b[1;32m:992\u001b[0m, in \u001b[0;36m_find_and_load_unlocked\u001b[1;34m(name, import_)\u001b[0m\n", + "File \u001b[1;32m:241\u001b[0m, in \u001b[0;36m_call_with_frames_removed\u001b[1;34m(f, *args, **kwds)\u001b[0m\n", + "File \u001b[1;32m:1050\u001b[0m, in \u001b[0;36m_gcd_import\u001b[1;34m(name, package, level)\u001b[0m\n", + "File \u001b[1;32m:1027\u001b[0m, in \u001b[0;36m_find_and_load\u001b[1;34m(name, import_)\u001b[0m\n", + "File \u001b[1;32m:992\u001b[0m, in \u001b[0;36m_find_and_load_unlocked\u001b[1;34m(name, import_)\u001b[0m\n", + "File \u001b[1;32m:241\u001b[0m, in \u001b[0;36m_call_with_frames_removed\u001b[1;34m(f, *args, **kwds)\u001b[0m\n", + "File \u001b[1;32m:1050\u001b[0m, in \u001b[0;36m_gcd_import\u001b[1;34m(name, package, level)\u001b[0m\n", + "File \u001b[1;32m:1027\u001b[0m, in \u001b[0;36m_find_and_load\u001b[1;34m(name, import_)\u001b[0m\n", + "File \u001b[1;32m:1004\u001b[0m, in \u001b[0;36m_find_and_load_unlocked\u001b[1;34m(name, import_)\u001b[0m\n", + "\u001b[1;31mModuleNotFoundError\u001b[0m: No module named 'matgraphdb.materials'" ] } ], "source": [ - "def download_raw_materials(mp_materials_path):\n", - " \"\"\"\n", - " Download and extract the raw materials data if it is not already present.\n", - " \"\"\"\n", - " if not os.path.exists(mp_materials_path):\n", - " \n", - " os.makedirs(mp_materials_path, 
exist_ok=True)\n", - " print(\"Downloading raw materials data...\")\n", - " \n", - " raw_dataset_zip = os.path.join(mp_materials_path, \"MPNearHull_v0.0.1_raw.zip\")\n", - " # Note: Here we use DATASET_URL as in the original code.\n", - " gdown.download(DATASET_URL, output=raw_dataset_zip, quiet=False)\n", - " \n", - " print(\"Extracting raw materials data...\")\n", - " with zipfile.ZipFile(raw_dataset_zip, \"r\") as zip_ref:\n", - " zip_ref.extractall(mp_materials_path)\n", - " \n", - " \n", - " files=os.listdir(mp_materials_path)\n", - " os.remove(raw_dataset_zip)\n", - " mp_nearhull_path = os.path.join(mp_materials_path, \"MPNearHull\")\n", - " tmp_materials_path = os.path.join(mp_nearhull_path, \"nodes\", \"material\")\n", - " materials_files = os.listdir(tmp_materials_path)\n", - " for file in materials_files:\n", - " shutil.move(os.path.join(tmp_materials_path, file), os.path.join(mp_materials_path, file))\n", - " \n", - " shutil.rmtree(mp_nearhull_path)\n", - " print(\"Raw materials data ready!\")\n", - " \n", - "# Optionally, download the raw materials data if you plan to initialize from raw files.\n", - "if not os.path.exists(MATERIALS_PATH):\n", - " download_raw_materials(MATERIALS_PATH)\n", - "else:\n", - " print(\"Raw materials data already exists.\")" + "from matgraphdb.datasets import MPNearHull\n", + "\n", + "DB_PATH = DATA_DIR / \"MPNearHull\"\n", + "mpdb = MPNearHull(storage_path=DB_PATH,initialize_from_scratch=False)" ] }, { @@ -1177,7 +1172,7 @@ ], "metadata": { "kernelspec": { - "display_name": "matgraphdb_dev", + "display_name": "matgraphdb", "language": "python", "name": "python3" }, @@ -1191,7 +1186,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.9.21" + "version": "3.10.0" }, "nbsphinx": { "execute": "never" diff --git a/examples/02_applications/index.rst b/examples/02_applications/index.rst new file mode 100644 index 0000000..cb8c92b --- /dev/null +++ b/examples/02_applications/index.rst 
@@ -0,0 +1,10 @@ +Applications +============ + +Welcome to the **Applications** section for MatGraphDB! Here, you'll find practical examples to help you master the core concepts quickly. + +.. toctree:: + :maxdepth: 3 + :caption: Applications + + 01 - MPNearHull Dataset diff --git a/examples/index.rst b/examples/index.rst index eeb2dcc..ddbbd16 100644 --- a/examples/index.rst +++ b/examples/index.rst @@ -6,14 +6,15 @@ Welcome to the MatGraphDB examples! This collection of notebooks demonstrates us These examples are automatically generated from the `examples directory`_ of the package and showcase how to effectively use MatGraphDB's features for data storage, querying, and management. Feel free to download and run these notebooks to explore the functionality firsthand. -.. _examples directory: https://github.com/romerogroup/MatGraphDB/tree/main/examples/notebooks + +.. _examples directory: https://github.com/romerogroup/MatGraphDB/tree/main/examples/02_applications .. nblinkgallery:: :caption: Example Gallery :name: rst-link-gallery - notebooks/01 - Getting Started + 02_applications/01 - MPNearHull Dataset Contents -------- @@ -22,4 +23,4 @@ Contents :maxdepth: 3 :caption: Example Gallery - notebooks/01 - Getting Started + 02_applications/01 - MPNearHull Dataset diff --git a/pyproject.toml b/pyproject.toml index 0644aa4..2379c5a 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -75,7 +75,8 @@ materials=[ docs= [ "ipython", "imageio-ffmpeg", - "sphinx", + "sphinx", + "myst_parser", "sphinx_rtd_theme", "sphinx-copybutton", "nbsphinx", @@ -90,31 +91,7 @@ docs= [ dev = [ - "pymongo", - "pytest", - "mlflow", - "ipywidgets", - "jupyterlab", - "nglview", - "pylint", - "autopep8", - "openai", - "python-dotenv", - "PyGithub", - "pytest-cov", - "ipython", - "imageio-ffmpeg", - "sphinx", - "sphinx_rtd_theme", - "sphinx-copybutton", - "nbsphinx", - "sphinx_design", - "sphinx-new-tab-link", - "sphinxcontrib-youtube", - "sphinxcontrib-video", - "pandoc", - "furo", - 
"numpydoc" + "matgraphdb[docs,tests]" ] diff --git a/pyrightconfig.json b/pyrightconfig.json deleted file mode 100644 index e4a72eb..0000000 --- a/pyrightconfig.json +++ /dev/null @@ -1,6 +0,0 @@ -{ - "exclude": [ - "data", - "logs" - ] -} \ No newline at end of file