This guide walks you through setting up Ollama on Windows to run the DeepSeek-R1:7B model with a local FastAPI interface. Before you begin, make sure you have:
- Windows with Docker Desktop installed
- NVIDIA GPU (for GPU acceleration)
- WSL 2 enabled (if running Docker on Windows)
- Python 3.12 (for the FastAPI interface)
Run the following command to verify that Docker is active:
```bash
docker ps
```
Start Ollama in a Dockerized environment:
```bash
docker-compose up -d
```
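The `docker-compose up -d` command expects a `docker-compose.yml` in your project folder, which the guide doesn't show. A minimal version might look like the sketch below; the service name, volume, and GPU block are assumptions based on the commands used throughout this guide (the GPU section requires the NVIDIA Container Toolkit and can be removed for CPU-only use):
```yaml
# docker-compose.yml — a minimal sketch, not the guide's exact file; adjust as needed.
services:
  ollama:
    image: ollama/ollama
    container_name: ollama        # matches the "docker exec -it ollama ..." commands below
    ports:
      - "11434:11434"             # Ollama's API port, used by curl and the FastAPI app
    volumes:
      - ollama:/root/.ollama      # persist downloaded models across container restarts
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia      # requires the NVIDIA Container Toolkit for GPU acceleration
              count: all
              capabilities: [gpu]

volumes:
  ollama:
```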
Pull the DeepSeek-R1:7B model into Ollama:
```bash
docker exec -it ollama ollama pull deepseek-r1:7b
```
Run the model inside the Ollama container:
```bash
docker exec -it ollama ollama run deepseek-r1:7b
```
You can manually test DeepSeek via Ollama's API:
```bash
curl -X POST http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:7b",
  "prompt": "What is DeepSeek?",
  "stream": false
}'
```
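You can make the same request from Python with the `requests` package (installed in the next step), which is also how the FastAPI app will talk to Ollama. This is a small illustrative snippet, not part of the guide's code:
```python
# test_ollama.py — a quick sanity check against Ollama's /api/generate endpoint.
import requests

response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "deepseek-r1:7b",
        "prompt": "What is DeepSeek?",
        "stream": False,   # return one JSON object instead of a stream of chunks
    },
    timeout=300,           # a 7B model can take a while, especially on CPU
)
response.raise_for_status()
print(response.json()["response"])
```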
If you don't have Conda, install Miniconda first. Then create a new environment for FastAPI:
```bash
conda create -n ds-ollama python=3.12
```
Activate the environment:
```bash
conda activate ds-ollama
```
Inside the activated environment, install the required Python packages:
```bash
pip install fastapi uvicorn requests jinja2
```
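The launch step below runs `uvicorn main:app`, so the guide assumes a `main.py` (and a `templates/` folder with the Jinja2 chat page) already exists in your project directory. A minimal sketch of what that file could look like is shown here; the `/api/chat` route and the `chat.html` template name are illustrative assumptions, not the guide's exact code:
```python
# main.py — an illustrative sketch, assuming Ollama is reachable on localhost:11434.
import requests
from fastapi import FastAPI, Request
from fastapi.responses import HTMLResponse
from fastapi.templating import Jinja2Templates

app = FastAPI()
templates = Jinja2Templates(directory="templates")  # assumes templates/chat.html exists

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama API exposed by the Docker container


@app.get("/", response_class=HTMLResponse)
async def chat_page(request: Request):
    # Render the chat UI (chat.html is a hypothetical template name).
    return templates.TemplateResponse("chat.html", {"request": request})


@app.post("/api/chat")
async def chat(payload: dict):
    # Forward the user's prompt to Ollama and return the model's reply.
    result = requests.post(
        OLLAMA_URL,
        json={
            "model": "deepseek-r1:7b",
            "prompt": payload.get("prompt", ""),
            "stream": False,
        },
        timeout=300,
    )
    result.raise_for_status()
    return {"response": result.json().get("response", "")}
```
Keeping `stream` set to `false` keeps the sketch simple: the whole answer arrives in one JSON object, at the cost of waiting for generation to finish before anything is displayed.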
To launch the web interface for DeepSeek:
- Ensure Ollama is running:
  ```bash
  docker-compose up -d
  ```
- Run FastAPI locally:
  ```bash
  uvicorn main:app --host 0.0.0.0 --port 8000 --reload
  ```
- Open the chat interface in your browser at http://localhost:8000.
To stop Ollama, run:
```bash
docker-compose down
```
If you need to restart the FastAPI server:
```bash
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
```
Now you have Ollama running DeepSeek-R1:7B inside Docker and a FastAPI-powered ChatGPT-style interface on Windows! 🚀
Need Help? 🤔
If you run into any issues, check:
- Docker logs:
  ```bash
  docker logs ollama
  ```
- FastAPI logs: check the terminal where FastAPI is running.
Made with ❤️.