@nagisa
Member

@nagisa nagisa commented Jan 13, 2026

This is the conceptual opposite of the rust-cold calling convention and is particularly useful in combination with the new explicit_tail_calls feature.

For relatively tight loops implemented with tail calling (become), each function using the regular calling convention is still responsible for restoring the initial values of the preserved registers. So it is not unusual to end up in a situation where each step of the tail call loop spills and reloads registers, along the lines of:

foo:
    push r12
    ; do things
    pop r12
    jmp next_step

This adds up quickly, especially when most of the clobberable registers are already used for passing arguments or for other purposes.
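
To make the shape of such a loop concrete, here is a minimal sketch of a toy interpreter written as a chain of step functions. It is plain stable Rust so that it compiles as-is: ordinary calls in tail position stand in for become, and the comments mark where, on a nightly compiler, one would presumably switch to become and declare the steps extern "rust-preserve-none". The Op enum and the dispatch, op_inc, op_double and run names are hypothetical and only serve the illustration.

```rust
// Hypothetical opcodes for a toy accumulator machine.
#[derive(Clone, Copy)]
enum Op {
    Inc,
    Double,
    Halt,
}

// Every step shares one signature, so each can hand off to the next with
// a call in tail position. Assumption: on nightly, each step `fn` would be
// declared `extern "rust-preserve-none" fn` and each of these calls would
// be written `become f(..)`.
fn dispatch(code: &[Op], pc: usize, acc: u64) -> u64 {
    match code[pc] {
        Op::Inc => op_inc(code, pc, acc),
        Op::Double => op_double(code, pc, acc),
        Op::Halt => acc,
    }
}

fn op_inc(code: &[Op], pc: usize, acc: u64) -> u64 {
    dispatch(code, pc + 1, acc.wrapping_add(1))
}

fn op_double(code: &[Op], pc: usize, acc: u64) -> u64 {
    dispatch(code, pc + 1, acc.wrapping_mul(2))
}

// Top-level entry point with the regular calling convention: the one
// place where preserved registers would still be saved and restored.
fn run(code: &[Op]) -> u64 {
    dispatch(code, 0, 0)
}

fn main() {
    let program = [Op::Inc, Op::Inc, Op::Double, Op::Halt];
    println!("{}", run(&program)); // accumulator goes 0 -> 1 -> 2 -> 4
}
```

With the regular calling convention, each of op_inc and op_double may carry its own push/pop pair like the one in the listing above; the point of the new ABI is to leave that work to run alone.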

I was thinking of making the name of this ABI a little less LLVM-derived and more like a conceptual inverse of rust-cold, but could not come up with a great name (rust-cold is itself not a great name: cold in what context? from which perspective? is it supposed to mean that the function is rarely called?)

@rustbot
Collaborator

rustbot commented Jan 13, 2026

This PR changes rustc_public

cc @oli-obk, @celinval, @ouz-a, @makai410

rust-analyzer is developed in its own repository. If possible, consider making this change to rust-lang/rust-analyzer instead.

cc @rust-lang/rust-analyzer

@rustbot rustbot added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-rust-analyzer Relevant to the rust-analyzer team, which will review and decide on the PR/issue. labels Jan 13, 2026
@rustbot
Collaborator

rustbot commented Jan 13, 2026

r? @jackh726

rustbot has assigned @jackh726.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@nagisa nagisa force-pushed the add-preserve-none-abi branch from ecf300a to 6d27ba9 Compare January 13, 2026 15:05

@folkertdev
Contributor

This seems somewhat related: I believe that there is no way to make tail calls work with ABIs that pass values using PassMode::Indirect { on_stack: false }; see #144855 (comment).

This calling convention is probably still useful, but adding another calling convention that uses the (unstable!) rust ABI won't actually help with moving tail call support forward.

So maybe there is a possible design of an extern "tail-call" that would combine this preserve-none feature with a guarantee that PassMode::Indirect { on_stack: false } will not be used.

Also for completeness, using PassMode::Indirect { on_stack: true } is currently buggy on many platforms, notably x86 and riscv. So that will be a blocker anyhow for guaranteed tail calls, for at least another LLVM release cycle.

@nagisa
Member Author

nagisa commented Jan 13, 2026

I don't intend this PR to be a solution to the indirect argument passing problem, although I see how having a calling convention with more registers available would help with indirectly passed arguments.

My primary incentive is really just performance – spilling registers in the top level caller once is much more efficient than spilling and reloading on every jump. Normally I would probably look for a solution that is transparent to the user, but unfortunately in this case the decision is the callee's, and in the general case it cannot know whether it will be become'd or called regularly.

That reminds me, I guess I'll have to verify whether I need to change the ABI computation code here...
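
As a back-of-the-envelope illustration of the spill-once versus spill-on-every-jump difference described above, the sketch below counts save/restore instructions under a deliberately simplified model: every step of the loop touches the same k preserved registers, and each spill and each reload costs one instruction. The function names and the model itself are assumptions made for illustration, not measurements from this PR.

```rust
/// Save/restore instructions executed when every step uses the regular
/// calling convention: each step spills and reloads `k` preserved registers.
fn spills_regular(steps: u64, k: u64) -> u64 {
    steps * 2 * k
}

/// Save/restore instructions when the steps preserve nothing and only the
/// top-level caller spills and reloads the `k` registers, once.
fn spills_preserve_none(_steps: u64, k: u64) -> u64 {
    2 * k
}

fn main() {
    let (steps, k) = (1_000, 3);
    println!("regular: {}", spills_regular(steps, k));
    println!("preserve-none: {}", spills_preserve_none(steps, k));
}
```

In this model the overhead of the regular convention grows linearly with the number of steps, while the preserve-none variant pays a fixed cost regardless of loop length.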

@folkertdev
Contributor

I should update this: I was wrong, and there does appear to be a way to support PassMode::Indirect { on_stack: false, .. }. There is an experimental implementation in #151143. As mentioned, the Rust calling convention really likes passing values with that PassMode, and an advantage of that implementation is that Rust is fully in control of it, so there are no issues with flaky LLVM backend support for byval (which is used for on_stack: true).

Also, you may have seen this already, but there was some recent discussion on better calling conventions for tail calls:

So adding this calling convention as an experiment seems useful, and modulo the test formatting issue it appears to work?

@nagisa nagisa force-pushed the add-preserve-none-abi branch from fd925bf to 8631e52 Compare January 20, 2026 11:27
@nagisa nagisa force-pushed the add-preserve-none-abi branch from f08a413 to 940b4d7 Compare January 20, 2026 15:54
@nagisa nagisa force-pushed the add-preserve-none-abi branch from 940b4d7 to 664c1fc Compare January 21, 2026 12:38
@rustbot
Collaborator

rustbot commented Jan 21, 2026

Some changes occurred in compiler/rustc_codegen_cranelift

cc @bjorn3

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo, @GuillaumeGomez

@bjorn3
Member

bjorn3 commented Jan 21, 2026

Ideally, for both rust-cold and rust-preserve-none there would be a check that prevents cross-backend calls that use those ABIs. And there would be a lint to avoid using them in the standard library, as the rustup-distributed cg_clif uses a cg_llvm-compiled standard library.

@nagisa
Member Author

nagisa commented Jan 21, 2026

> Ideally, for both rust-cold and rust-preserve-none there would be a check that prevents cross-backend calls that use those ABIs. And there would be a lint to avoid using them in the standard library, as the rustup-distributed cg_clif uses a cg_llvm-compiled standard library.

I wouldn't object to such a check, but at the same time, is this really necessary for an unstable feature that's added chiefly to enable experimentation (in my case with explicit_tail_calls)? We can figure out these issues as we work towards stabilizing it.

A lint is probably not too hard to add though… I feel like it could even be deny-by-default and unconditional, and if you want to use the ABI for experiments, you allow it. But at the same time, that's exactly what the feature gate does already, no?

@nagisa nagisa force-pushed the add-preserve-none-abi branch from 664c1fc to b46772b Compare January 21, 2026 13:27
@bjorn3
Member

bjorn3 commented Jan 21, 2026

I'm fine with landing this PR without such checks. It doesn't make things that much worse, as we already have rust-cold.

@rust-bors rust-bors bot added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 23, 2026
JonathanBrouwer added a commit to JonathanBrouwer/rust that referenced this pull request Jan 23, 2026
…trochenkov

abi: add a rust-preserve-none calling convention

rust-bors bot pushed a commit that referenced this pull request Jan 23, 2026
…uwer

Rollup of 3 pull requests

Successful merges:

 - #150556 (Add Tier 3 Thumb-mode targets for Armv7-A, Armv7-R and Armv8-R)
 - #151065 (abi: add a rust-preserve-none calling convention)
 - #151505 (Various refactors to the proc_macro bridge)

r? @ghost
@JonathanBrouwer
Contributor

@bors r-
#151535 (comment)

@rust-bors rust-bors bot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Jan 23, 2026
@rust-bors
Contributor

rust-bors bot commented Jan 23, 2026

Commit e1ec059 has been unapproved.

@nagisa nagisa force-pushed the add-preserve-none-abi branch from e1ec059 to 18b8ea2 Compare January 24, 2026 00:51
@nagisa
Member Author

nagisa commented Jan 24, 2026

Hm, looks like LLVM doesn't support this ABI on anything other than x86_64 or aarch64. I opted to emit the default ABI for targets other than these for now...

For what it is worth PreserveMost (used for "rust-cold") also has this problem, although the target support for that one is a little wider…
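
The fallback described above can be sketched roughly as follows. This is not the code from the PR: the BackendConv enum, the conv_for_preserve_none function, and the use of plain strings for architecture names are all assumptions made for illustration.

```rust
/// Hypothetical stand-in for the calling convention handed to the backend.
#[derive(Debug, PartialEq, Eq)]
enum BackendConv {
    PreserveNone,
    Default,
}

/// Pick the convention for a "rust-preserve-none" function on `arch`.
/// Assumption: the architecture is identified by a plain string.
fn conv_for_preserve_none(arch: &str) -> BackendConv {
    match arch {
        // The two architectures named in the comment above.
        "x86_64" | "aarch64" => BackendConv::PreserveNone,
        // Everything else silently falls back to the default convention.
        _ => BackendConv::Default,
    }
}

fn main() {
    for arch in ["x86_64", "aarch64", "riscv64"] {
        println!("{}: {:?}", arch, conv_for_preserve_none(arch));
    }
}
```

One consequence of a silent fallback like this is that code using the ABI still compiles on unsupported targets but gets no register-saving benefit there.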

I also prototyped a PR for upstream LLVM at llvm/llvm-project#177714 while investigating the crash…

@nagisa nagisa force-pushed the add-preserve-none-abi branch from 18b8ea2 to fdf0fbd Compare January 24, 2026 14:08
@nagisa nagisa force-pushed the add-preserve-none-abi branch from fdf0fbd to 6db94db Compare January 24, 2026 17:23
@nagisa
Member Author

nagisa commented Jan 24, 2026

@bors r=petrochenkov rollup=never

@rust-bors
Contributor

rust-bors bot commented Jan 24, 2026

📌 Commit 6db94db has been approved by petrochenkov

It is now in the queue for this repository.

@rust-bors rust-bors bot added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jan 24, 2026
@Zalathar
Member

Scheduling: Encourage a mixture of rollup and non-rollup PRs.

@bors p=5

@rust-bors rust-bors bot added merged-by-bors This PR was explicitly merged by bors. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Jan 25, 2026
@rust-bors
Contributor

rust-bors bot commented Jan 25, 2026

☀️ Test successful - CI
Approved by: petrochenkov
Duration: 3h 12m 59s
Pushing 75963ce to main...

@rust-bors rust-bors bot merged commit 75963ce into rust-lang:main Jan 25, 2026
12 checks passed
@rustbot rustbot added this to the 1.95.0 milestone Jan 25, 2026
@github-actions
Contributor

What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing 5a07626 (parent) -> 75963ce (this PR)

Test differences

Show 14 test diffs

Stage 1

  • [ui] tests/ui/abi/rust-preserve-none-cc.rs: [missing] -> pass (J2)
  • [ui] tests/ui/feature-gates/feature-gate-rust-preserve-none-cc.rs: [missing] -> pass (J2)
  • [codegen] tests/codegen-llvm/preserve-none.rs#AARCH64: [missing] -> pass (J3)
  • [codegen] tests/codegen-llvm/preserve-none.rs#UNSUPPORTED: [missing] -> pass (J3)
  • [codegen] tests/codegen-llvm/preserve-none.rs#X86: [missing] -> pass (J3)

Stage 2

  • [ui] tests/ui/abi/rust-preserve-none-cc.rs: [missing] -> pass (J0)
  • [ui] tests/ui/feature-gates/feature-gate-rust-preserve-none-cc.rs: [missing] -> pass (J0)
  • [codegen] tests/codegen-llvm/preserve-none.rs#AARCH64: [missing] -> pass (J1)
  • [codegen] tests/codegen-llvm/preserve-none.rs#UNSUPPORTED: [missing] -> pass (J1)
  • [codegen] tests/codegen-llvm/preserve-none.rs#X86: [missing] -> pass (J1)

Additionally, 4 doctest diffs were found. These are ignored, as they are noisy.

Job group index

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard 75963ce795666bc1f961e5d60060809809f6bc68 --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

  1. dist-powerpc64-linux-musl: 8851.5s -> 5158.0s (-41.7%)
  2. dist-apple-various: 4609.3s -> 3465.1s (-24.8%)
  3. dist-x86_64-musl: 7414.3s -> 8458.8s (+14.1%)
  4. x86_64-gnu-aux: 8507.3s -> 7500.6s (-11.8%)
  5. x86_64-gnu-nopt: 8642.8s -> 7717.2s (-10.7%)
  6. x86_64-gnu-llvm-20: 4726.5s -> 4249.4s (-10.1%)
  7. pr-check-2: 2432.2s -> 2642.6s (+8.6%)
  8. aarch64-apple: 8535.9s -> 9215.2s (+8.0%)
  9. x86_64-msvc-ext1: 7319.7s -> 7898.9s (+7.9%)
  10. dist-x86_64-solaris: 5025.0s -> 5421.5s (+7.9%)

How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

@rust-timer
Collaborator

Finished benchmarking commit (75963ce): comparison URL.

Overall result: ❌ regressions - no action needed

@rustbot label: -perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 0.0% | [0.0%, 0.0%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Max RSS (memory usage)

This benchmark run did not return any relevant results for this metric.

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

Results (primary 0.0%, secondary 0.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.0% | [0.0%, 0.0%] | 4 |
| Regressions ❌ (secondary) | 0.0% | [0.0%, 0.0%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.0% | [0.0%, 0.0%] | 4 |

Bootstrap: 472.363s -> 471.746s (-0.13%)
Artifact size: 383.56 MiB -> 383.60 MiB (0.01%)

@nagisa nagisa deleted the add-preserve-none-abi branch January 25, 2026 20:28