Add benchmarks for `reinterpret` performance for both primitives and structs with padding. #339

NHDaly · 2025-12-18T19:52:29Z

This adds benchmarks for the performance of `reinterpret`, an important low-level primitive operation in Julia.

It includes benchmarks for `reinterpret` on primitives, which we found to be unexpectedly slow in 1.10 and which was then improved again in JuliaLang/julia@cf34aa2.
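For readers less familiar with the operation, here is a minimal sketch of the kind of primitive `reinterpret` call being measured. The array contents and variable names are illustrative assumptions, not the actual benchmark cases from this PR.

```julia
# Illustrative only: these inputs are NOT the benchmark cases in this PR.
a = UInt32[0x00000001, 0x00000002]

# Reinterpreting an array yields a lazy view over the same memory (no copy).
bytes = reinterpret(UInt8, a)
@assert length(bytes) == 8        # 2 elements * 4 bytes each
@assert sum(bytes) == 3           # same bytes, independent of endianness

# Same element size, different interpretation of the bits.
floats = reinterpret(Float32, a)
@assert eltype(floats) == Float32
@assert length(floats) == 2

# Scalar reinterpret between primitives of equal size.
@assert reinterpret(UInt32, 1.0f0) == 0x3f800000
@assert reinterpret(Float32, 0x3f800000) == 1.0f0
```

Iterating over or reducing such a view (as `sum` does here) is the sort of access pattern whose cost these benchmarks are meant to track.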

It also includes benchmarks for the new support for reinterpreting between arbitrary padded structs as long as their packedsizes are the same. We have found opportunities to improve performance for those structs, and I'll be opening a PR on julia for that. :)
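As a rough illustration of the padded case, the sketch below reinterprets between two structs whose fields occupy the same number of bytes once padding is ignored. The struct names `PaddedA` and `PaddedB` are hypothetical and are not the types used in the benchmark suite; this assumes a Julia version with padded-struct support in `reinterpret` (1.10 or later, to my understanding).

```julia
# Hypothetical types for illustration; not the benchmark's own definitions.
struct PaddedA
    a::UInt8     # 1 byte, followed by 3 bytes of padding
    b::UInt32    # 4 bytes
end

struct PaddedB
    a::UInt32    # 4 bytes
    b::UInt8     # 1 byte, followed by 3 bytes of trailing padding
end

# Both structs occupy 8 bytes in memory but carry only 5 bytes of field data,
# so their sizes ignoring padding match even though the padding sits elsewhere.
@assert sizeof(PaddedA) == 8
@assert sizeof(PaddedB) == 8

x = PaddedA(0x01, 0x02030405)
y = reinterpret(PaddedB, x)          # copies field bytes, skipping padding
@assert y isa PaddedB

# Reinterpreting back recovers the original value.
@assert reinterpret(PaddedA, y) == x

# The same works for tuples with padding.
t = (0x01, 0x0203)                   # Tuple{UInt8, UInt16}
u = reinterpret(Tuple{UInt16, UInt8}, t)
@assert reinterpret(Tuple{UInt8, UInt16}, u) == t
```

Because the padding positions differ between the two layouts, this cannot be a plain memory copy, which is presumably why the padded cases in the results are so much slower than the packed ones.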

These are the results for this new benchmark on `julia +nightly` today. Note that several of these are in ms rather than μs:

julia> versioninfo()
Julia Version 1.14.0-DEV.1386
Commit 3b21c7f60d1 (2025-12-18 16:29 UTC)
Build Info:
  Official https://julialang.org release
Platform Info:
  OS: macOS (arm64-apple-darwin24.0.0)
  CPU: 12 × Apple M2 Max
  WORD_SIZE: 64
  LLVM: libLLVM-20.1.8 (ORCJIT, apple-m2)
  GC: Built with stock GC
Threads: 1 default, 1 interactive, 1 GC (on 8 virtual cores)
Environment:
  JULIA_SSL_CA_ROOTS_PATH = 

julia> BaseBenchmarks.load!("reinterpret"); run(BaseBenchmarks.SUITE["reinterpret"])
4-element BenchmarkTools.BenchmarkGroup:
  tags: []
  "mixed_tuples" => 4-element BenchmarkTools.BenchmarkGroup:
          tags: []
          (104, 104) => Trial(16.791 μs)
          (228, 228) => Trial(31.916 μs)
          (0, 0) => Trial(4.166 μs)
          (100, 100) => Trial(17.209 μs)
  "padded_to_padded" => 6-element BenchmarkTools.BenchmarkGroup:
          tags: []
          (29, 48, 56) => Trial(3.970 ms)
          (10, 24, 24) => Trial(9.041 μs)
          (29, 56, 48) => Trial(3.756 ms)
          (117, 128, 128) => Trial(17.475 ms)
          (10, 12, 24) => Trial(131.583 μs)
          (0, 0, 0) => Trial(4.250 μs)
  "packed_types" => 5-element BenchmarkTools.BenchmarkGroup:
          tags: []
          17 => Trial(9.459 μs)
          49 => Trial(12.708 μs)
          8 => Trial(12.042 μs)
          128 => Trial(19.292 μs)
          0 => Trial(4.125 μs)
  "padded_types" => 3-element BenchmarkTools.BenchmarkGroup:
          tags: []
          (29, 56) => Trial(3.720 ms)
          (10, 24) => Trial(73.292 μs)
          (117, 128) => Trial(99.875 μs)

EDIT: This benchmark is used in the upstream Julia PR here: JuliaLang/julia#60415

NHDaly added 4 commits December 13, 2025 21:27

Beginnings of Reinterpret Benchmark

8680a59

Decent set of reinterpret benchmarks

b6600e1

Add empty benchmarks

e1074f9

Reduce redundant benchmark cases

34a16b6

NHDaly mentioned this pull request Dec 18, 2025

Improve reinterpret performance for padded types, with minimal harm to compilation time JuliaLang/julia#60415

Open

NHDaly added 2 commits December 18, 2025 14:27

32-bit machine

12117c9

add more 32-bit types

d60366f
