llm_api

FastAPI server for a RAG project. Exposes an API for calling LLM inference running on my PC with GPUs.
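
A minimal sketch of what such an inference endpoint could look like (the route name, request schema, and `run_inference` helper are assumptions for illustration, not the actual code in this repo):

```python
# Sketch of a FastAPI inference endpoint; names here are hypothetical, not this repo's actual code.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class PromptRequest(BaseModel):
    prompt: str
    max_new_tokens: int = 256

def run_inference(prompt: str, max_new_tokens: int) -> str:
    # Placeholder: the real server would run the locally loaded model on the GPU here.
    return "..."

@app.post("/generate")  # hypothetical route
def generate(req: PromptRequest) -> dict:
    completion = run_inference(req.prompt, req.max_new_tokens)
    return {"completion": completion}
```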

You need a HUGGINGFACE_TOKEN in the Dockerfile to get access to Llama 3.2, passed as a build argument:

```bash
docker build --build-arg HUGGINGFACE_TOKEN="Token Here" -t llm-api .
```
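
Inside the container, the token would typically be read from the environment and passed to the Hugging Face Hub when downloading the gated Llama 3.2 weights. A sketch, assuming the build arg is exposed as a HUGGINGFACE_TOKEN environment variable and an instruct-tuned Llama 3.2 checkpoint is the target model:

```python
# Sketch only: assumes the Docker build arg is exposed as the HUGGINGFACE_TOKEN env var
# and that an instruct-tuned Llama 3.2 checkpoint is the target model.
import os
from transformers import AutoModelForCausalLM, AutoTokenizer

hf_token = os.environ["HUGGINGFACE_TOKEN"]
model_id = "meta-llama/Llama-3.2-1B-Instruct"  # assumed variant

# Llama 3.2 is gated on the Hugging Face Hub, so the token is required to download it.
tokenizer = AutoTokenizer.from_pretrained(model_id, token=hf_token)
model = AutoModelForCausalLM.from_pretrained(model_id, token=hf_token, device_map="auto")
```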

```bash
docker run --name llm-api -p 5002:5002 llm-api
```
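
Once the container is running, the API can be called on port 5002. For example (the /generate route and JSON payload are assumptions; check the server code for the actual endpoint):

```python
# Example client call; the route name and payload shape are assumptions about this API.
import requests

resp = requests.post(
    "http://localhost:5002/generate",  # port 5002 as published by the docker run command above
    json={"prompt": "What is retrieval-augmented generation?"},
    timeout=60,
)
print(resp.json())
```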
