This open-source software is a Python package that lets external data scientists without in-depth knowledge of the MELLODDY stack easily run predictions on new data, using the models produced during the yearly runs. It is built on top of Melloddy-Tuner and Sparsechem, which handle the data pre-processing and model inference steps respectively. It is flexible enough to handle multiple models and data sizes, and to predict on a subset of tasks.
⚠️ The model should be compatible with `sparsechem` 0.9.6+. If it is not, you can convert it with the `convert.py` script from `sparsechem`.
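As a quick sanity check before running predictions, the sketch below compares the installed `sparsechem` version with the 0.9.6 minimum mentioned above. It is only an illustration: the helper names (`parse_version`, `is_supported`, `installed_sparsechem_version`) are hypothetical, the distribution name `sparsechem` is an assumption, and the check looks at the installed library only, not at the model file itself.

```python
import re
from importlib import metadata  # part of the standard library since Python 3.8

# Minimum version taken from the note above.
MINIMUM = (0, 9, 6)


def parse_version(text):
    """Return up to the first three numeric components of a version string."""
    # Drop any local version suffix such as "+cu117" before extracting numbers.
    numbers = re.findall(r"\d+", text.split("+")[0])
    return tuple(int(n) for n in numbers[:3])


def is_supported(version_text, minimum=MINIMUM):
    """True when the version is at least the minimum (tuple comparison)."""
    return parse_version(version_text) >= minimum


def installed_sparsechem_version(dist_name="sparsechem"):
    """Return the installed version string, or None if the package is absent.

    The distribution name is an assumption; adjust it if your install differs.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None
```

For example, `is_supported(installed_sparsechem_version() or "0")` evaluates to `False` both when the package is missing and when it is older than 0.9.6.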
To install the package, run the following in a new Python 3.8+ environment:

- Clone the repository:

      git clone git@github.com:melloddy/MELLODDY-Predictor.git
      cd MELLODDY-Predictor

- Install the package and its requirements. You can remove `github` if you already installed `melloddy_tuner` and `sparsechem`. You can remove `doc` if you don't want to build the documentation.

      pip install -e ".[github,doc]"

- (Optional) To be able to run the examples and the tests, download the dummy files. You can download them using `make inputs`. Otherwise download the archive, extract it, and place the `inputs` folder at the root of the project.
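The installation steps above can be double-checked with a small script. This is a minimal sketch, not part of the package: the function name `check_setup` is hypothetical, and it only verifies the two conditions stated above (Python 3.8+ and an `inputs` folder at the project root).

```python
import sys
from pathlib import Path


def check_setup(project_root, min_python=(3, 8)):
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []

    # The package is meant to be installed in a Python 3.8+ environment.
    if sys.version_info[:2] < tuple(min_python):
        required = ".".join(str(part) for part in min_python)
        problems.append(f"Python {required}+ is required")

    # The examples and tests expect the dummy files in <project_root>/inputs.
    if not (Path(project_root) / "inputs").is_dir():
        problems.append("the 'inputs' folder is missing from the project root")

    return problems
```

Running `check_setup(".")` from the repository root returns an empty list when both conditions hold, and a list of short messages otherwise.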
To build the doc, run in a new terminal with your Python environment:

    make doc

Then in your browser, go to http://localhost:8080/melloddy_predictor/
You can see an example in `examples/example.py` and run it with:

    python examples/example.py

Install all the development requirements:

    pip install -r requirements-dev.txt

We use pytest for testing. To run the full test suite, run:

    make test

Note that input data will be downloaded from Zenodo when running the tests for the first time.

To set up the pre-commit hooks, run:

    pre-commit install

To lint all files, run:

    make lint

If you want to remove the following warning:
[W ParallelNative.cpp:206] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
run:
export OMP_NUM_THREADS=1
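If exporting the variable in the shell is inconvenient (for example in a notebook), the same setting can be applied from Python. This sketch assumes the variable is read when the numerical libraries are first loaded, so it has to run before `torch` or `sparsechem` is imported; whether it silences the warning in every setup is not guaranteed.

```python
import os

# Set the variable before importing torch or sparsechem: the threading
# runtime is assumed to read it once, when the library is first loaded.
os.environ["OMP_NUM_THREADS"] = "1"

# Heavy imports go below this line, for example:
# import torch
```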