A single-file educational implementation for understanding vLLM's core concepts and running LLM inference.
Learn AI Infrastructure Fundamentals: This project provides a clean, educational implementation of vLLM's core concepts in a single Python file, making it easy to understand how modern LLM inference engines work under the hood.
Perfect for Learning: Whether you're a student, researcher, or engineer wanting to understand vLLM internals, this simplified implementation helps you grasp the fundamental concepts without getting lost in production complexity.
```bash
# 1. Create and activate conda environment
conda create -n cleanvllm python=3.10 -y && conda activate cleanvllm

# 2. Install dependencies
pip install -r requirements.txt

# 3. Run vLLM inference
python qwen3_0_6B.py
```

That's it! You're now running vLLM inference!
To run your own checkpoint instead:

- Update the model path in `qwen3_0_6B.py`:

  ```python
  path = os.path.expanduser("~/path/to/your/qwen3model")
  ```

- Run the script:

  ```bash
  python qwen3_0_6B.py
  ```

- `qwen3_30B_A3B.py`: Support for the larger Qwen3-30B-A3B model
- Multi-GPU Support: Enhanced tensor parallelism for distributed inference
- More Model Variants: Support for additional Qwen model sizes and configurations
- Performance Optimizations: Further kernel optimizations and memory efficiency improvements
- `qwen3_0_6B.py`: Complete implementation for the Qwen3-0.6B model
- Basic vLLM Features: PagedAttention, KV caching, continuous batching (illustrated in the first sketch below)
- Flash Attention: Auto-detection and fallback support (illustrated in the second sketch below)
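To make the PagedAttention and KV-caching ideas concrete, here is a minimal, hypothetical sketch of the core bookkeeping (the class and variable names are illustrative, not the actual identifiers in `qwen3_0_6B.py`): the KV cache is carved into fixed-size physical blocks, and each sequence keeps a block table mapping its logical token positions to those blocks, so memory grows with the actual sequence length instead of being reserved up front for the maximum length.

```python
# Hypothetical sketch of PagedAttention-style KV-cache bookkeeping.
# BlockAllocator, Sequence, block_size, etc. are illustrative names,
# not the identifiers used in qwen3_0_6B.py.

class BlockAllocator:
    """Hands out fixed-size physical KV-cache blocks from a free pool."""

    def __init__(self, num_blocks: int, block_size: int):
        self.block_size = block_size
        self.free_blocks = list(range(num_blocks))

    def allocate(self) -> int:
        if not self.free_blocks:
            raise RuntimeError("KV cache exhausted; a sequence must be preempted")
        return self.free_blocks.pop()

    def free(self, block_id: int) -> None:
        self.free_blocks.append(block_id)


class Sequence:
    """Tracks one request's logical-to-physical block mapping."""

    def __init__(self, allocator: BlockAllocator):
        self.allocator = allocator
        self.block_table: list[int] = []  # logical block i -> physical block id
        self.num_tokens = 0

    def append_token(self) -> None:
        # Allocate a new physical block only when the current one is full,
        # so no memory is reserved for tokens that were never generated.
        if self.num_tokens % self.allocator.block_size == 0:
            self.block_table.append(self.allocator.allocate())
        self.num_tokens += 1


allocator = BlockAllocator(num_blocks=8, block_size=4)
seq = Sequence(allocator)
for _ in range(6):                     # 6 tokens -> ceil(6 / 4) = 2 blocks
    seq.append_token()
print(seq.block_table)                 # two physical block ids, e.g. [7, 6]
print(len(allocator.free_blocks))      # 6 blocks left for other sequences
```

Continuous batching builds on the same bookkeeping: because a finished sequence's blocks return to the free pool immediately, new requests can be admitted into the running batch at every decoding step instead of waiting for the whole batch to drain.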
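The Flash Attention auto-detection with fallback can be pictured as a try-import plus a runtime dispatch. This is a sketch under assumptions, not necessarily how `qwen3_0_6B.py` structures it: it assumes the optional `flash-attn` package for the fast path and PyTorch's built-in `scaled_dot_product_attention` as the fallback.

```python
import torch

# Hypothetical sketch of Flash Attention auto-detection with a fallback;
# the real script may detect and dispatch differently.
try:
    from flash_attn import flash_attn_func  # optional flash-attn package
    HAS_FLASH_ATTN = True
except ImportError:
    HAS_FLASH_ATTN = False


def attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """Causal attention over tensors shaped (batch, seqlen, num_heads, head_dim)."""
    if HAS_FLASH_ATTN and q.is_cuda:
        # flash_attn_func takes (batch, seqlen, num_heads, head_dim) directly.
        return flash_attn_func(q, k, v, causal=True)
    # Fallback: PyTorch SDPA expects (batch, num_heads, seqlen, head_dim).
    out = torch.nn.functional.scaled_dot_product_attention(
        q.transpose(1, 2), k.transpose(1, 2), v.transpose(1, 2),
        is_causal=True,
    )
    return out.transpose(1, 2)
```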
This project is inspired by and based on the concepts from vLLM, a high-throughput and memory-efficient inference and serving engine for LLMs. We are grateful to the vLLM team and community for their pioneering work in LLM inference optimization.
It is also based on the excellent nano-vLLM project. Thanks to the original authors for their outstanding work!