BioSyn AI: Repurposing Life

"Repurposing Life through Geometric Deep Learning."

BioSyn AI is an end-to-end generative pipeline that proposes novel drug candidates (ligands) intended to bind a specific protein target. It combines an E(n)-equivariant graph neural network (GNN), which encodes the protein structure, with a 3D denoising diffusion probabilistic model (DDPM), which generates the molecule.
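To make the diffusion half of that description more concrete, here is a minimal sketch of the standard DDPM forward (noising) step applied to 3D atom coordinates. It is not taken from this repository: the linear beta schedule, the 1000-step horizon, the sample coordinates, and the names `linear_betas`, `alpha_bars`, and `q_sample` are all illustrative assumptions, and only the Python standard library is used.

```python
import math
import random

def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule (an assumption; the project's may differ)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def alpha_bars(betas):
    """Cumulative products of (1 - beta_t), usually written alpha-bar_t."""
    out, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        out.append(running)
    return out

def q_sample(coords, t, abar, rng):
    """Draw x_t ~ N(sqrt(abar_t) * x_0, (1 - abar_t) * I) per coordinate."""
    signal = math.sqrt(abar[t])
    noise_scale = math.sqrt(1.0 - abar[t])
    return [
        tuple(signal * c + noise_scale * rng.gauss(0.0, 1.0) for c in atom)
        for atom in coords
    ]

# Three hypothetical atom positions in angstroms, for illustration only.
x0 = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.0)]
abar = alpha_bars(linear_betas(1000))
rng = random.Random(0)
x_early = q_sample(x0, 10, abar, rng)   # little noise: still close to x0
x_late = q_sample(x0, 999, abar, rng)   # almost all signal replaced by noise
```

A trained denoiser would run this process in reverse, starting from Gaussian noise and conditioning each step on the protein embedding. That learned part is not shown here.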

🧬 Architecture

The pipeline follows a closed-loop generative process:

graph LR
    A[Protein PDB] -->|Ingestion Engine| B(Geometric Graph)
    B -->|GNN Encoder| C[Context Embedding]
    D[Gaussian Noise] -->|Diffusion Model| E{Reverse Process}
    C --> E
    E -->|Denoising| F[3D Atom Cloud]
    F -->|KNN Builder| G[SMILES Candidate]
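As a rough illustration of the first arrow in the diagram (Protein PDB to Geometric Graph), the sketch below reads ATOM records using the fixed-column PDB layout and connects atoms that lie within a distance cutoff. It is an assumption-laden example, not the project's ingestion code: the function names `parse_atoms` and `radius_graph`, the 4.0 angstrom cutoff, and the sample records are hypothetical.

```python
import math

def parse_atoms(pdb_text):
    """Read ATOM records using the fixed-column PDB layout."""
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue  # skip HETATM, REMARK, TER and other record types
        atoms.append({
            "name": line[12:16].strip(),      # columns 13-16
            "element": line[76:78].strip(),   # columns 77-78
            "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        })
    return atoms

def radius_graph(atoms, cutoff=4.0):
    """Undirected edges (i, j, distance) for atom pairs within `cutoff`."""
    edges = []
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            d = math.dist(atoms[i]["xyz"], atoms[j]["xyz"])
            if d <= cutoff:
                edges.append((i, j, d))
    return edges

# Hypothetical records, laid out in the standard PDB columns.
SAMPLE = "\n".join([
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N",
    "ATOM      2  CA  MET A   1      26.266  25.413   2.842  1.00  9.67           C",
    "ATOM      3  C   MET A   1      26.913  26.639   3.531  1.00  9.67           C",
    "ATOM      4  O   GLY A   9      35.000  30.000   8.000  1.00  9.67           O",
    "HETATM    5  O   HOH A 101      40.000  40.000  40.000  1.00  9.67           O",
])

atoms = parse_atoms(SAMPLE)
edges = radius_graph(atoms)
```

Here the three backbone atoms of the first residue end up fully connected, while the distant oxygen has no edges and the HETATM water is skipped. The edge distances are the kind of rotation- and translation-invariant quantity a geometric encoder can consume.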

Ingestion: A TypeScript engine fetches raw PDB/SDF files from biological databases.
Encoder: A GNN extracts geometric features from the protein pocket that are invariant to rotation and translation.
Decoder: A diffusion model iteratively refines random noise into stable 3D molecular structures, conditioned on the protein embedding.
Inference: MoleculeBuilder reconstructs valid chemical graphs from the 3D point clouds using K-Nearest Neighbors (KNN) logic and outputs them as SMILES candidates.
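The last stage, turning a 3D atom cloud into a chemical graph, can be sketched as follows. This is a minimal illustration of KNN-style bond inference and not the repository's MoleculeBuilder: the function name `knn_bonds`, the covalent-radius table (approximate values), the neighbour count `k`, and the 0.4 angstrom tolerance are assumptions chosen for the example.

```python
import math

# Approximate single-bond covalent radii in angstroms (assumed lookup table).
COVALENT_RADIUS = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}

def knn_bonds(elements, coords, k=4, tolerance=0.4):
    """Infer bonds: each atom inspects its k nearest neighbours and keeps
    those closer than the sum of covalent radii plus a tolerance."""
    bonds = set()
    for i, xi in enumerate(coords):
        ranked = sorted(
            (math.dist(xi, xj), j) for j, xj in enumerate(coords) if j != i
        )
        for dist, j in ranked[:k]:
            limit = (COVALENT_RADIUS[elements[i]]
                     + COVALENT_RADIUS[elements[j]] + tolerance)
            if dist <= limit:
                bonds.add((min(i, j), max(i, j)))  # store each bond once
    return sorted(bonds)

# Heavy atoms of an ethanol-like fragment (hypothetical coordinates).
elements = ["C", "C", "O"]
coords = [(0.0, 0.0, 0.0), (1.52, 0.0, 0.0), (2.0, 1.35, 0.0)]
bonds = knn_bonds(elements, coords)
print(bonds)  # [(0, 1), (1, 2)]
```

The sketch only recovers connectivity. Assigning bond orders, checking valence, and writing SMILES would still need a cheminformatics toolkit such as RDKit.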

⚡ Quick Start

Prerequisites

Python 3.10+
Node.js (v16+)
CUDA-enabled GPU (Recommended)

1. Installation

Clone the repository and set up the hybrid environment.

# Clone the repo
git clone https://github.com/zumermalik/BioSyn-AI-Repurposing-Life.git
cd BioSyn-AI-Repurposing-Life

# Set up Python Environment (Conda recommended for RDKit compatibility)
conda create -n biosyn python=3.10 -y
conda activate biosyn

# Install Core Dependencies
pip install -r requirements.txt

# Install Ingestion Engine (TypeScript)
npm install

2. Run the Pipeline (Zero to Hero)

You can run the entire inference stack with a single command. This will load the pre-trained checkpoint and generate candidates for the target protein 5R82.

# Run Inference
python src/pipeline/inference_pipeline.py

Expected Output:

🧪 Starting BioSyn Inference on cuda...
   >> Target Protein: 5R82.pdb
   >> Loading checkpoint: checkpoints/biosyn_epoch_5.pt
   >> Generating 5 drug candidates...
      🔹 Candidate 1: CC(=O)Nc1ccc(O)cc1
      🔹 Candidate 2: CN1C=NC2=C1C(=O)N(C(=O)N2C)C
✅ Generation Complete. 5 candidates saved to results/

📂 Project Structure

BioSyn-AI-Repurposing-Life/
├── configs/              # Hyperparameter Configuration
├── data/                 # Data Storage
│   ├── external/         # External Databases (PDBBind/CrossDocked)
│   ├── processed/        # PyTorch Geometric Tensors
│   └── raw/              # Original PDB/SDF Files
├── notebooks/            # Jupyter Prototyping Environments
├── src/                  # Source Code
│   ├── chemistry/        # RDKit Logic & Molecule Builders
│   ├── ingestion/        # TypeScript/Python Data Fetchers
│   ├── models/           # GNN Encoder & Diffusion Decoder
│   ├── pipeline/         # Training & Inference Orchestration
│   ├── utils/            # Utility Functions
│   ├── __init__.py       # Package Initialization
│   └── main.py           # Main Application Entry Point
├── tests/                # Unit Tests
├── environment.yml       # Conda Environment Definition
├── LICENSE               # Apache 2.0 License
├── package.json          # Node.js Dependencies
├── pyproject.toml        # Python Packaging Configuration
├── README.md             # Project Documentation
├── requirements.txt      # Python Dependencies
├── ROADMAP.md            # Future Development Plans
└── tsconfig.json         # TypeScript Configuration

🛠️ Development & Testing

We use pytest for unit testing the geometric logic and chemical validity.

# Run the full test suite
pytest tests/
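As an example of what a geometric test could look like, the sketch below checks that pairwise distances are unchanged by a rotation followed by a translation. It is hypothetical and not part of the existing suite: the helper names, the sample coordinates, and the idea of placing it in a file such as tests/test_invariance.py are assumptions.

```python
import math

def pairwise_distances(coords):
    """All interatomic distances, in a fixed (i < j) order."""
    return [
        math.dist(coords[i], coords[j])
        for i in range(len(coords))
        for j in range(i + 1, len(coords))
    ]

def rotate_z_then_translate(coords, angle, shift):
    """Apply a rigid motion: rotation about the z axis, then a translation."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
        for x, y, z in coords
    ]

def test_distances_are_invariant_to_rigid_motion():
    coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.3)]
    moved = rotate_z_then_translate(coords, angle=0.7, shift=(3.0, -2.0, 5.0))
    before = pairwise_distances(coords)
    after = pairwise_distances(moved)
    # Coordinates change, but every distance should match up to rounding.
    assert moved != coords
    assert all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(before, after))
```

pytest collects any function whose name starts with `test_`, so a file like this would run as part of `pytest tests/` without further configuration.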

🤝 Contributing

Contributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

1. Fork the Project
2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
3. Commit your Changes (git commit -m 'Add some AmazingFeature')
4. Push to the Branch (git push origin feature/AmazingFeature)
5. Open a Pull Request

Contribution Standards

Code Style: Please use black for Python formatting.
Testing: Ensure all new modules have accompanying tests in tests/.
Data: Do not commit large datasets (PDB/SDF files) to Git. Keep them locally in the data/ folder instead.

📜 Citation & License

This project is licensed under the Apache 2.0 License. If you use this architecture in your research, please link back to this repository.

Maintained by the Builders.
