StreamDiffusionWebcam

StreamDiffusionWebcam — A solution for real-time diffusion-based image generation from a webcam feed.

Author: Erenalp Çetintürk

This repository is a fork/extension of the original StreamDiffusion example that uses the system screen as input. This project adapts the pipeline to use a webcam (camera) as the input source so you can perform real-time image transformations using diffusion models.

Original project: cumulo-autumn/StreamDiffusion (examples/screen/main.py). Many thanks to the original authors for the code and its permissive licensing.

Table of contents

  • Overview
  • Features
  • Requirements
  • Installation
  • Usage — Quick start
  • Configuration and CLI options
  • Notes on the code
  • Performance & tuning tips
  • Troubleshooting
  • Security & privacy
  • Contributing
  • Acknowledgements
  • License

Overview

StreamDiffusionWebcam captures frames from a webcam, converts them into tensors, and feeds them into a StreamDiffusion model in img2img mode to produce stylized/generated frames in (near) real time. Output frames are displayed in a viewer process.
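
To make the data flow concrete, here is a minimal self-contained sketch of that loop using OpenCV and Pillow. It is illustrative only: run_img2img is a hypothetical stand-in for the diffusion call, and the real script uses StreamDiffusionWrapper plus a separate viewer process rather than cv2.imshow.

import cv2
import numpy as np
from PIL import Image

def run_img2img(frame: Image.Image) -> Image.Image:
    return frame  # placeholder: the real pipeline denoises the frame with a diffusion model

cap = cv2.VideoCapture(0)  # default webcam; try 1, 2, ... if not detected
while True:
    ok, bgr = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)  # OpenCV delivers BGR frames
    out = run_img2img(Image.fromarray(rgb).resize((512, 512)))
    cv2.imshow("output", cv2.cvtColor(np.array(out), cv2.COLOR_RGB2BGR))
    if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
        break
cap.release()
cv2.destroyAllWindows()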

Features

  • Webcam input (replaces original screen-capture example)
  • Uses StreamDiffusion for image-to-image diffusion generation
  • Supports acceleration options (xformers / TensorRT)
  • Frame buffering for batching
  • Prompt / negative-prompt customization via CLI
  • Simple viewer process to display generated frames and FPS

Requirements

  • Linux / Windows / macOS with a working webcam
  • NVIDIA GPU recommended for real-time performance (RTX series suggested)
  • Python 3.10 (tested with 3.10)
  • CUDA and NVIDIA drivers appropriate for your chosen PyTorch build
  • Conda recommended for environment management

Installation

  1. Clone the repository:

git clone https://github.com/ErenalpCet/StreamDiffusionWebcam.git
cd StreamDiffusionWebcam

  2. Create and activate a conda environment (recommended):

conda create -n webcamdiffusion python=3.10 -y
conda activate webcamdiffusion

  3. Install the PyTorch build that matches your CUDA version, choosing the corresponding index URL:

CUDA 11.8:

pip3 install torch==2.1.0 torchvision==0.16.0 xformers --index-url https://download.pytorch.org/whl/cu118

CUDA 12.1:

pip3 install torch==2.1.0 torchvision==0.16.0 xformers --index-url https://download.pytorch.org/whl/cu121

CUDA 12.4 (if these wheel versions are unavailable from the cu124 index, the cu121 build also runs on CUDA 12.4 drivers):

pip3 install torch==2.1.0 torchvision==0.16.0 xformers --index-url https://download.pytorch.org/whl/cu124

  4. Install StreamDiffusion with the optional TensorRT extras (the quotes keep the shell from expanding the brackets):

pip install "git+https://github.com/cumulo-autumn/StreamDiffusion.git@main#egg=streamdiffusion[tensorrt]"

  5. (Optional) Install the TensorRT helpers required for TensorRT acceleration:

python -m streamdiffusion.tools.install-tensorrt

  6. Install the remaining dependencies:

pip install -r requirements.txt
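
After installing, a quick sanity check can catch PyTorch/CUDA mismatches before you run the main script:

import torch

print("torch:", torch.__version__, "built for CUDA", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
try:
    import xformers  # optional; required for acceleration="xformers"
    print("xformers:", xformers.__version__)
except ImportError:
    print("xformers not installed")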

Usage — Quick start

The main script uses python-fire for CLI arguments. The simplest way to run with default parameters:

python3 main.py

Example: run with a custom prompt and a specific model

python3 main.py --model_id_or_path="KBlueLeaf/kohaku-v2.1" \
  --prompt="a boy with black short hair, smiling, brown eyes, wearing glasses" \
  --negative_prompt="low quality, bad quality, blurry, low resolution" \
  --frame_buffer_size=1 --width=512 --height=512

Configuration and CLI options

All parameters available in main.py are exposed through the CLI via fire. Important options:

  • model_id_or_path (str): Model repo or local model path (default: "KBlueLeaf/kohaku-v2.1")
  • lora_dict (dict, optional): Dictionary of LoRA weights and scales, e.g. '{"lora_name":0.5}'
  • prompt (str): Positive prompt text
  • negative_prompt (str): Negative prompt text
  • frame_buffer_size (int): Number of frames batched per generation step; use 1 for real-time single-frame processing.
  • width, height (int): Output resolution (default 512x512)
  • acceleration (str): "none", "xformers", or "tensorrt"
  • use_denoising_batch (bool): Use denoising batches (True/False)
  • seed (int): Random seed
  • cfg_type (str): "none", "full", "self", "initialize" (as used by StreamDiffusionWrapper)
  • guidance_scale (float): Guidance scale for the model
  • delta (float): Delta parameter used by StreamDiffusionWrapper
  • do_add_noise (bool): Add noise to inputs
  • enable_similar_image_filter (bool): Skip generation if image is too similar to previous frame
  • similar_image_filter_threshold (float): Similarity threshold above which a frame is skipped
  • similar_image_filter_max_skip_frame (int): Maximum number of consecutive frames the filter may skip

To view all parameters and defaults:

python3 main.py --help
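
Under the hood, python-fire simply turns main()'s keyword arguments into these flags. A minimal sketch of the pattern, with an illustrative subset of parameters and defaults rather than the script's full signature:

import fire

def main(
    model_id_or_path: str = "KBlueLeaf/kohaku-v2.1",
    prompt: str = "a watercolor portrait",  # illustrative default
    frame_buffer_size: int = 1,
    width: int = 512,
    height: int = 512,
    acceleration: str = "xformers",
):
    print(model_id_or_path, prompt, frame_buffer_size, width, height, acceleration)

if __name__ == "__main__":
    # e.g. python3 sketch.py --acceleration="tensorrt" --width=384 --height=384
    fire.Fire(main)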

Notes on the code

  • main.py spawns a process to run the StreamDiffusion model, a process to set the monitor size, and a viewer process to receive and display generated frames.
  • The webcam capture runs in a thread in the image generation process and places tensor frames into a global inputs list consumed by the generator.
  • The script uses multiprocessing with 'spawn' context to improve cross-platform compatibility.
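
As a rough illustration of that producer/consumer design, here is a self-contained sketch. Apart from the inputs list named in the description above, all names and details here are assumptions; the real generation loop hands frames to the diffusion model rather than discarding them.

import threading
from multiprocessing import get_context

import cv2
import torch
import torchvision.transforms.functional as TF

inputs: list[torch.Tensor] = []  # shared within the generation process

def capture_loop(camera_index: int = 0) -> None:
    cap = cv2.VideoCapture(camera_index)
    while cap.isOpened():
        ok, bgr = cap.read()
        if ok:
            rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)
            inputs.append(TF.to_tensor(rgb))  # HWC uint8 -> CHW float in [0, 1]

def generation_process() -> None:
    # Capture runs in a daemon thread inside this process, as described above.
    threading.Thread(target=capture_loop, daemon=True).start()
    while True:
        if inputs:
            frame = inputs.pop(0)
            _ = frame  # the real loop feeds the frame to the diffusion model here

if __name__ == "__main__":
    ctx = get_context("spawn")  # 'spawn' avoids fork-related CUDA issues
    proc = ctx.Process(target=generation_process)
    proc.start()
    proc.join()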

Performance & tuning tips

  • Recommended GPU: NVIDIA RTX 30/40 series. CPU-only will be very slow.
  • For best throughput, use xformers if available (acceleration="xformers"). Ensure xformers matches your torch build.
  • TensorRT can give much better performance but requires correct setup and may need additional tuning.
  • Lower resolution (e.g., 384x384) and smaller frame_buffer_size can increase FPS.
  • Increase frame_buffer_size to batch multiple frames if memory and model support it.
  • Warmup steps are used in the wrapper; initial frames may be slower.
  • For consistent results across runs, set a fixed seed.

Troubleshooting

  • Webcam not detected:
    • Ensure webcam drivers are installed.
    • Check the device index in cv2.VideoCapture(0); try other indices (1, 2, ...). See the probe snippet after this list.
    • On Linux, ensure the user has permissions for /dev/video* (use v4l2-ctl --list-devices).
  • Low FPS:
    • Verify GPU is used (nvidia-smi).
    • Try lowering resolution or switching acceleration modes.
    • Ensure xformers and torch are compatible and installed correctly.
  • Torch / CUDA errors:
    • Match your torch wheel to your CUDA and OS. Reinstall the appropriate wheel if needed.
  • TensorRT errors:
    • Ensure TensorRT and its Python bindings are installed and compatible with your libcudnn and CUDA.
  • Python warnings from upstream libs:
    • The script suppresses several FutureWarning/UserWarning lines — these are usually harmless.
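
For the webcam checks above, this small helper (not part of the repo) probes the first few device indices and reports which ones deliver frames:

import cv2

for idx in range(4):
    cap = cv2.VideoCapture(idx)
    ok = cap.isOpened() and cap.read()[0]
    print(f"camera index {idx}: {'OK' if ok else 'not available'}")
    cap.release()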

Security & privacy

  • Webcam input is processed locally — no webcam frames are uploaded by the script by default.
  • If you modify the code to log or transmit frames, be mindful of privacy and legal considerations.

Contributing

  • Contributions are welcome. Please raise issues for problems or feature requests.
  • If you propose code changes, follow standard GitHub fork/branch/PR workflow.

Acknowledgements

  • This project is based on and adapted from the StreamDiffusion repository and example code (https://github.com/cumulo-autumn/StreamDiffusion). Huge thanks to the authors for their work and permissive sharing of code.
  • Many thanks to the open-source community for PyTorch, xformers, diffusers, and related tooling.

License

  • This repository inherits licenses from the upstream StreamDiffusion project and any models you download. Be sure to check license terms for any model weights you use.
  • Include your chosen license file in the repo if you want to specify a license explicitly (MIT, Apache-2.0, etc.).
