Update to LLVM 22 #150722
These commits modify compiler targets.
@bors try @rust-timer queue
Finished benchmarking commit (dabe9cd): comparison URL.
Overall result: ❌✅ regressions and improvements - please read the text below
Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never
Instruction count: Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage): Results (primary 1.0%, secondary -1.3%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Cycles: Results (primary 3.2%, secondary 2.1%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary size: Results (primary -0.6%, secondary -1.1%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 473.133s -> 480.877s (1.64%)
☔ The latest upstream changes made this pull request unmergeable. Please resolve the merge conflicts.
☔ The latest upstream changes (presumably #150726) made this pull request unmergeable. Please resolve the merge conflicts.
Based on helloworld, the perf regressions seem to be related to the allocator somehow. Previously tcache_alloc_small_hard was called 3 times; now it is called 189 times. The total number of allocations is smaller, but the time spent in the allocator is larger.
Or maybe the issue is not actually the allocator behavior itself. I suspect that we might have lost LTO on jemalloc, so that tcache_alloc_small_hard, which previously got inlined into malloc_default, no longer is. Updating the host toolchain at the same time, so that the versions match, may help.
@bors try jobs=aarch64-msvc-1
Update to LLVM 22 try-job: aarch64-msvc-1
💔 Test for c6faa0e failed: CI. Failed job:
@bors try jobs=aarch64-msvc-1
Update to LLVM 22 try-job: aarch64-msvc-1
Yay, with that all issues should have a pending patch. I looked into the 4% regression on include-blob, and am somewhat confused. It looks like we're spending 170M instructions in [...] But I'm not sure how we can end up with something like this. This is ultimately just a [...]
Oh, I think this may be related to the fact that we're using an old libstdc++ 9.5. It looks like that version does not specialize to memmove if the input and output types of the iterators aren't the same: https://cpp.godbolt.org/z/x7cMb1shT In this case there is a signed / unsigned mismatch. It doesn't explain why we get the terrible byte-wise copies instead of at least xmm copies, but at least that explains why we don't get memmove. Looks like memmove is only getting used for the different-type case starting with libstdc++ 15: https://cpp.godbolt.org/z/qzc5ehMcP
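For anyone who wants to reproduce the signed / unsigned case locally, here is a minimal sketch of the pattern described above. It is not the LLVM code in question: the function names (copy_same_type, copy_mixed_sign, copy_with_memcpy) are hypothetical, and signed char / unsigned char are only stand-ins for the mismatched element types. The three functions produce the same bytes; the interesting part is how the standard library lowers the std::copy calls.

```cpp
#include <algorithm>
#include <cstring>
#include <vector>

// Hypothetical helper: source and destination element types match.
// Per the comment above, this is the case older libstdc++ versions
// can lower to memmove.
std::vector<unsigned char> copy_same_type(const std::vector<unsigned char>& src) {
    std::vector<unsigned char> dst(src.size());
    std::copy(src.begin(), src.end(), dst.begin());
    return dst;
}

// Hypothetical helper: signed source, unsigned destination.
// Each element is converted individually (value modulo 256), so an
// older libstdc++ may emit an element-wise loop here instead of memmove.
std::vector<unsigned char> copy_mixed_sign(const std::vector<signed char>& src) {
    std::vector<unsigned char> dst(src.size());
    std::copy(src.begin(), src.end(), dst.begin());
    return dst;
}

// Hypothetical workaround sketch: copy the raw bytes explicitly.
// Assumes a two's complement representation, so the bytes match the
// result of the element-wise conversion above.
std::vector<unsigned char> copy_with_memcpy(const std::vector<signed char>& src) {
    std::vector<unsigned char> dst(src.size());
    if (!src.empty()) {
        std::memcpy(dst.data(), src.data(), src.size());
    }
    return dst;
}
```

Whether copy_mixed_sign actually becomes a memmove depends on the standard library version and the optimizer, so the generated assembly is the thing to compare, not the results.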
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed; this note is just to help reviewers.
@bors try @rust-timer queue
Finished benchmarking commit (0da05b9): comparison URL.
Overall result: ❌✅ regressions and improvements - please read the text below
Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never
Instruction count: Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage): Results (primary -0.9%, secondary -2.1%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Cycles: Results (primary 3.2%, secondary -2.0%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary size: Results (primary -0.3%, secondary -1.1%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 472.653s -> 476.124s (0.73%)
Scheduled release date: Feb 24
1.94 becomes stable: Mar 5
Depends on: