ModelTC

65 repositories

LightLLM
Public
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
nlp deep-learning llama gpt model-serving llm openai-triton
Python
•
Apache License 2.0
•291•3.8k•80•34•Updated Jan 6, 2026
LightX2V
Public
Light Image Video Generation Inference Framework
video-generation diffusion-models wan-video auto-regressive-diffusion-model
Python
•
Apache License 2.0
•125•1.7k•102•0•Updated Jan 6, 2026
LightTTS
Public
LightTTS is a lightweight TTS inference framework optimized for CosyVoice2 and CosyVoice3, enabling fast and scalable speech synthesis in Python, with support for stream and bistream modes.
text-to-speech real-time tts speech-synthesis low-latency tensorrt inference-optimization audio-generation cosyvoice cosyvoice2
Python
•
Apache License 2.0
•3•19•1•0•Updated Jan 5, 2026
mtc-incremental-bpe
Public
Incremental BPE tokenization for all prefixes
Rust
•
Apache License 2.0
•0•0•0•0•Updated Jan 5, 2026
Qwen-Image-Lightning
Public
Qwen-Image-Lightning: Speed up Qwen-Image model with distillation
Python
•
Apache License 2.0
•42•1.1k•28•0•Updated Jan 1, 2026
LightMem
Public
Python
•
Apache License 2.0
•0•3•0•0•Updated Dec 30, 2025
modeltc.github.io
Public
HTML
•0•0•0•0•Updated Dec 29, 2025
SageAttention
Public
Quantized Attention achieves speedup of 2-5x and 3-11x compared to FlashAttention and xformers, without losing end-to-end metrics across language, image, and video models.
Cuda
•
Apache License 2.0
•304•2•0•0•Updated Dec 18, 2025
general-sam-py
Public
Python bindings for general-sam and some utilities
Python
•
Apache License 2.0
•0•5•0•0•Updated Dec 15, 2025
mtc-token-healing
Public
Token healing implementation in Rust
Rust
•
Apache License 2.0
•0•4•0•0•Updated Dec 15, 2025
verl
Public
verl: Volcano Engine Reinforcement Learning for LLMs
Python
•
Apache License 2.0
•3k•1•0•0•Updated Dec 15, 2025
ComfyUI-Lightx2vWrapper
Public
ComfyUI custom node for lightx2v
comfyui comfyui-nodes
Python
•
MIT License
•7•72•4•0•Updated Dec 11, 2025
slime
Public
slime is an LLM post-training framework for RL Scaling.
Python
•
Apache License 2.0
•395•0•0•0•Updated Dec 8, 2025
general-sam
Public
A general suffix automaton implementation in Rust with Python bindings
Rust
•
Apache License 2.0
•0•8•0•0•Updated Dec 1, 2025
lightllm-blog
Public
SCSS
•
MIT License
•1•1•0•1•Updated Nov 26, 2025
LightKernel
Public
HTML
•
Apache License 2.0
•0•3•0•0•Updated Nov 26, 2025
greedy-tokenizer
Public
Greedily tokenize strings with the longest tokens iteratively.
Python
•
Apache License 2.0
•0•0•0•3•Updated Nov 24, 2025
LightCompress
Public
[EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models including LLMs, VLMs, and video generative models.
benchmark deployment tool evaluation pruning quantization wan awq large-language-models llm
Python
•
Apache License 2.0
•66•652•40•0•Updated Nov 19, 2025
lightx2v_examples
Public
0•0•0•0•Updated Nov 12, 2025
Wan2.2-Lightning
Public
Wan2.2-Lightning: Speed up wan2.2 model with distillation
Python
•
Apache License 2.0
•1.6k•253•21•0•Updated Nov 7, 2025
LTX-Video-Q8-Kernels
Public
Python
•16•0•0•0•Updated Nov 6, 2025
SageAttention-1104
Public
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.
Cuda
•
Apache License 2.0
•304•0•0•0•Updated Nov 6, 2025
FlashVSR
Public
Towards Real-Time Diffusion-Based Streaming Video Super-Resolution — An efficient one-step diffusion framework for streaming VSR with locality-constrained sparse attention and a tiny conditional decoder.
Python
•
Apache License 2.0
•99•1•0•0•Updated Nov 5, 2025
ComfyUI-LightVAE
Public
Python
•
Apache License 2.0
•8•40•13•0•Updated Nov 3, 2025
HBP
Public
[NeurIPS 2025] This is the official PyTorch implementation of "Hierarchical Balance Packing: Towards Efficient Supervised Fine-tuning for Long-Context LLM".
Python
•
Apache License 2.0
•0•4•0•0•Updated Sep 30, 2025
TFMQ-DM
Public
[CVPR 2024 Highlight & TPAMI 2025] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models".
highlight quantization cvpr ldm diffusion-models tpami post-training-quantization ddim stable-diffusion cvpr2024
Jupyter Notebook
•
Apache License 2.0
•5•108•0•0•Updated Sep 29, 2025
fa3
Public
Python
•
BSD 3-Clause "New" or "Revised" License
•1•0•0•0•Updated Aug 7, 2025
flash-attn-3-build
Public
Dockerfile
•2•0•0•0•Updated Jul 24, 2025
HarmoniCa
Public
[ICML 2025] This is the official PyTorch implementation of "🎵 HarmoniCa: Harmonizing Training and Inference for Better Feature Caching in Diffusion Transformer Acceleration".
acceleration icml dit pixart diffusion-models diffusion-transformer pixart-sigma feature-caching icml-2025
Python
•
Apache License 2.0
•1•44•0•0•Updated Jul 10, 2025
OmniBal
Public
[ICML 2025] This is the official PyTorch implementation of "OmniBal: Towards Fast Instruction-Tuning for Vision-Language Models via Omniverse Computation Balance".
vlm icml-2025
Python
•
Apache License 2.0
•3•26•3•0•Updated Jun 16, 2025