
@dependabot dependabot bot commented on behalf of github Jan 6, 2026

Bumps torchao from 0.12.0 to 0.15.0.

Release notes

Sourced from torchao's releases.

v0.15.0

Highlights

We are excited to announce the 0.15.0 release of torchao! This release adds:

  • MXFP8 MoE training demonstrates a 1.2x e2e training speedup with identical convergence versus bf16, training Llama4 Scout on a 64-node GB200 Crusoe cluster!
  • MXFP8 MoE kernels now ship with torchao builds for CUDA 12.8+ (just pip install instead of building from source!)
  • Safetensors enablement
  • Quantization with parameter level targeting

MXFP8 MoE training demonstrates a 1.2x e2e training speedup with identical convergence versus bf16, training Llama4 Scout on a 64-node GB200 Crusoe cluster

Training runs on a 64-node GB200 cluster with TorchTitan Llama4 Scout demonstrated a 1.2x e2e training speedup with convergence equivalent to the bfloat16 training baseline. In fact, after 3,000 steps the MXFP8 run finishes with slightly lower loss than bfloat16! This is consistent with our scaling experiments with MXFP8 training for dense models.

| Number of GPUs | BF16 tokens/sec | MXFP8 tokens/sec | MXFP8 speedup vs BF16 |
|----------------|-----------------|------------------|-----------------------|
| 512            | 6169            | 7401             | 1.20x                 |

See the TorchAO MXFP8 MoE training documentation for more details. You can also check out the TorchTitan MXFP8 documentation to run pretraining jobs with TorchAO MXFP8 by adding a single config.

Safetensors Enablement

You can now save and load TorchAO model checkpoints using safetensors! This feature is integrated with Hugging Face transformers starting from v5.0.0 and vLLM 0.13.0 for model inference/serving.

We currently support the following stable configs, and will continue to add support for more configs as they become stable:

  • Float8DynamicActivationFloat8WeightConfig
  • Int4WeightOnlyConfig
  • IntxWeightOnlyConfig
  • Int8DynamicActivationIntxWeightConfig
  • Int8WeightOnlyConfig
  • Int8DynamicActivationInt8WeightConfig

Example:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TorchAoConfig
from torchao.quantization import Float8WeightOnlyConfig

model_id = "facebook/opt-125m"
quant_config = Float8WeightOnlyConfig()
quantization_config = TorchAoConfig(quant_type=quant_config)
# The tail of this call was truncated in the release notes;
# typical arguments are shown below.
quantized_model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
    quantization_config=quantization_config,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

... (truncated)

Commits
  • 9338966 use python version agnostic binding for mxfp8 cuda kernels (#3471)
  • acc9103 Fix NVFP4 QAT backward typo (#3478)
  • 286c2d8 Fix NVFP4 QAT convert path (#3450)
  • 924d6c0 update version compatibility table (#3455)
  • aa21b80 skip certain mxfp8 tests for cuda < 12.8 (#3443)
  • 69ce0fd [Intel GPU] Enable optim SR test (#3055)
  • 70e903b [xpu][test] Port 2 test/quantization/pt2e/test_{quantize_pt2e, quantize_pt2e_...
  • 1272f3c [xpu][test] Port 2 test/dtypes_{floatx, bitpacking} UT files to intel XPU (#3...
  • c4273fe Int8Tensor migration cleanup (#3407)
  • 7e0d439 [CPU] Reland qconv fp8 fusion passes (#3433)
  • Additional commits viewable in compare view

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [torchao](https://github.com/pytorch/ao) from 0.12.0 to 0.15.0.
- [Release notes](https://github.com/pytorch/ao/releases)
- [Commits](pytorch/ao@v0.12.0...v0.15.0)

---
updated-dependencies:
- dependency-name: torchao
  dependency-version: 0.15.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>