Local LLM

To use this repo, you need a system with an NVIDIA GPU, with podman and the NVIDIA Container Toolkit packages installed. Review the podman documentation links below.

Testing your system

Before starting, make sure podman containers have GPU access. Follow the instructions in the links below.

https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html

https://podman-desktop.io/docs/podman/gpu

You also need to use podman-compose, not docker-compose; docker-compose with podman does not pass the GPU through to the container.

podman-compose version >= 1.2.0

https://github.com/containers/podman-compose#installation
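You can check which version is installed with:

podman-compose version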


Windows Setup

Use WSL2 to run podman. I recommend using Ubuntu 24.04.

To install WSL2, run the following command in a PowerShell or cmd terminal as an administrator. This will also install an Ubuntu VM.

wsl --install

To install Ubuntu 24.04 as a named WSL distribution, run the following command in a PowerShell or cmd terminal as an administrator.

wsl --install --name local-llm -d Ubuntu-24.04

To start Ubuntu 24.04, run the following command.

wsl -d local-llm

Warning

If you want to remove / uninstall the Ubuntu 24.04 distribution, run the following command. This will also remove the volumes and any data / models.

wsl --unregister local-llm
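To list the installed distributions and confirm the name before unregistering:

wsl --list --verbose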

Accessing the volumes from Windows is a bit tricky. You can use File Explorer to find the volumes: look under Linux on the left side of the window and browse the WSL filesystem for the local-llm distribution.


Volumes are under the following path in the WSL filesystem: /home/USERNAME/.local/share/containers/storage/volumes/VOLUME-NAME, where USERNAME is your WSL username and VOLUME-NAME is the name of the volume.
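For example, assuming a distribution named local-llm, a WSL user named llm, and a volume named ollama (names here are hypothetical), the path from the Windows side would look like:

\\wsl.localhost\local-llm\home\llm\.local\share\containers\storage\volumes\ollama\_data

The _data subdirectory is where podman keeps the volume's actual contents.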


Clone the repo

git clone https://github.com/jrespeto/Local-LLM.git

The version of podman-compose in the Ubuntu 24.04 repo is 1.0.6 and does not work with this compose file; you need to install the latest version of podman-compose from GitHub as mentioned above. You also need to install python3-dotenv and the nvidia-container-toolkit using the following commands.

Warning

When updating your system drivers, you may need to regenerate /etc/cdi/nvidia.yaml by rerunning the nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml command below.

sudo su -
# podman-compose
curl -o /usr/local/bin/podman-compose https://raw.githubusercontent.com/containers/podman-compose/main/podman_compose.py
chmod +x /usr/local/bin/podman-compose

# nvidia-container-toolkit repo
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# installing nvidia-container-toolkit and python3-dotenv
apt update && apt install -y nvidia-container-toolkit python3-dotenv podman

# generate cdi yaml file for the GPU
nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
nvidia-ctk cdi list

# logout of root
exit

Test NVIDIA GPU

Inside the WSL environment, run the following command to test the NVIDIA GPU:

podman run --rm --device nvidia.com/gpu=all docker.io/nvidia/cuda:12.8.0-runtime-ubuntu24.04 nvidia-smi

It should output data like this, showing details about your GPU:

Trying to pull docker.io/nvidia/cuda:12.8.0-base-ubuntu20.04...
Getting image source signatures
Copying blob 4b650590013c skipped: already exists
Copying blob d9802f032d67 skipped: already exists
Copying blob b4aeb03891f2 done   |
Copying blob a8163c471214 done   |
Copying blob e43bfa99a834 done   |
Copying config dd22839602 done   |
Writing manifest to image destination
Mon Mar 10 13:52:08 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.124.06             Driver Version: 570.124.06     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4070 ...    Off |   00000000:01:00.0 Off |                  N/A |
| N/A   51C    P8              2W /   35W |      16MiB /   8188MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

Running the Compose File

To start the containers, run the commands from the Running section below inside the WSL environment.


Tech Stack

This compose file defines 8 containers. To start all of them at once, use --profile all (see the example after this list).

  • openWebUI - chatbot / API frontend for ollama
  • ollama - LLM management
    • no ui ollama:11434
  • n8n - workflow automation platform with AI integrations and AI Agents
    • --profile n8n
    • postgresql - DB dependency for n8n
      • no ui postgres-n8n:5432
  • Langflow - low-code framework to build AI workflows and agents
  • searxng - local internet metasearch engine
  • comfyui - image generation framework
  • augmentoolkit - create custom data for LLM training
    • This is a toolkit for creating domain-expert datasets
    • --profile augmentoolkit
    • [localhost:5173] - ui
  • unsloth - create custom LLMs https://hub.docker.com/r/unsloth/unsloth
    • This is a 13 GB image for building LLMs.
    • data/augmentoolkits is also mounted in workspace
    • --profile unsloth
    • [localhost:8888/lab] - default password is mypassword, changeable in docker-compose-unsloth.yml line 13.
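To start everything at once:

podman-compose --profile all up -d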

Common Containers

  • valkey - redis-compatible cache, dependency for searxng (see the example after this list)
    • no ui valkey:6379
    • redis DBs
      • 0 - searxng
      • 1 - openwebui
      • 2 - n8n
      • 3 - augmentoolkit
      • 4 - langflow
  • qdrant - vector store for AI RAG
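To poke at one of the cache DBs, you can open a CLI in the running valkey container. The service name comes from this compose file; the valkey-cli invocation is a sketch, so check your image if it differs:

podman-compose exec valkey valkey-cli -n 0 dbsize

This selects DB 0 (searxng's cache) and prints how many keys it currently holds.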


Running

Start the containers by passing the profile for each app you want:

podman-compose --profile openwebui --profile langflow up -d

To start the containers with comfyui, first build the comfyui container:

podman-compose --profile comfyui build

podman-compose --profile comfyui up -d

To follow the logs

podman-compose logs --follow

To stop

podman-compose down

To work with volumes from the containers:

podman volume ls - list all the volumes

podman volume inspect volume_name - show the mount points

podman volume rm volume_name - remove the volume
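For example, to print just the mount point of a volume (the volume name here is hypothetical; use a name from podman volume ls):

podman volume inspect local-llm_ollama --format '{{ .Mountpoint }}'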

ComfyUI

You may need to update the FROM line in docker/dockerfile.comfyui to match your system's version of CUDA.
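For example, if nvidia-smi reports CUDA 12.8, the FROM line could use the same image tag as the GPU test above (adjust the tag to match your driver):

FROM docker.io/nvidia/cuda:12.8.0-runtime-ubuntu24.04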

Watch the YouTube links below on how to set up and use ComfyUI.

I do recommend joining the Pixaroma Discord.

The entrypoint.sh for comfyui installs a few custom nodes, listed below, mostly from following along with the ComfyUI Tutorial Series from pixaroma.

Note

The 300+ Art styles are added here from EP.07

Note

TODO: add links to the videos next to each node.

  • ComfyUI-Manager EP.01
  • was-node-suite-comfyui EP.07
  • ComfyUI-Easy-Use EP.07
  • ComfyUI-GGUF EP.10
  • ComfyUI-Crystools EP.10
  • rgthree-comfy EP.10
  • comfyui-ollama EP.13
  • ComfyUI_UltimateSDUpscale
  • comfyui_controlnet_aux
  • ComfyUI_Comfyroll_CustomNodes

openwebui

This will allow you to use comfyui to generate images with a default workflow.

ComfyUI is very advanced and can do far more than just image generation.

To integrate ComfyUI with openwebUI, you need to update the image settings under Admin Settings.


If you update the "ComfyUI Workflow", you also need to update the "ComfyUI Workflow Nodes".

e.g. - node 3 is the object with the seed and steps

You also need to set a default model. Watch the videos and playlist below to understand which models (depending on the amount of VRAM on your GPU) and settings to use.

Reference YouTube
