Skip to content

Conversation

@jhpratt
Copy link
Member

@jhpratt jhpratt commented Jan 26, 2026

Successful merges:

r? @ghost

Create a similar rollup

Per my understanding, needed for mut access next line.
Use explicit SSE2 intrinsics to avoid LLVM's broken AVX-512
auto-vectorization which generates ~31 kshiftrd instructions.

Performance
- AVX-512: 34-48x faster
- SSE2: 1.5-2x faster

Improves on earlier pr
The SSE2 helper function is not inlined across crate boundaries,
so we cannot verify the codegen in an assembly test. The fix is
still verified by the absence of performance regression.
refactor rustc-hash integration

I found that rustc-hash is used in multiple compiler crates. Also some types use `FxBuildHasher` whereas others use `BuildHasherDefault<FxHasher>` (both do the same thing).

In order to simplify future hashing experiments, I changed every location to use `rustc_data_structures::fx::*` types instead, and also removed the `BuildHasherDefault` variant. This will simplify future experiments with hashing (for example trying out a hasher that doesn't implement `Default` for whatever reason).
…erformance, r=folkertdev

Improve is_ascii performance on x86_64 with explicit SSE2 intrinsics

# Summary

Improves `slice::is_ascii` performance for SSE2 target roughly 1.5-2x on larger inputs.
AVX-512 keeps similiar performance characteristics.

This is building on the work already merged in rust-lang#151259.
In particular this PR improves the default SSE2 performance, I don't consider this a temporary fix anymore.
Thanks to @folkertdev for pointing me to consider `as_chunk` again.

# The implementation:
- Uses 64-byte chunks with 4x 16-byte SSE2 loads OR'd together
- Extracts the MSB mask with a single `pmovmskb` instruction
- Falls back to usize-at-a-time SWAR for inputs < 64 bytes

# Performance impact (vs before rust-lang#151259):
- AVX-512: 34-48x faster
- SSE2: 1.5-2x faster

  <details>
  <summary>Benchmark Results (click to expand)</summary>

  Benchmarked on AMD Ryzen 9 9950X (AVX-512 capable). Values show relative performance (1.00 = fastest).
  Tops out at 139GB/s for large inputs.

  ### early_non_ascii

  | Input Size | new_avx512 | new_sse2 | old_avx512 | old_sse2 |
  |------------|------------|----------|------------|----------|
  | 64 | 1.01 | **1.00** | 13.45 | 1.13 |
  | 1024 | 1.01 | **1.00** | 13.53 | 1.14 |
  | 65536 | 1.01 | **1.00** | 13.99 | 1.12 |
  | 1048576 | 1.02 | **1.00** | 13.29 | 1.12 |

  ### late_non_ascii

  | Input Size | new_avx512 | new_sse2 | old_avx512 | old_sse2 |
  |------------|------------|----------|------------|----------|
  | 64 | **1.00** | 1.01 | 13.37 | 1.13 |
  | 1024 | 1.10 | **1.00** | 42.42 | 1.95 |
  | 65536 | **1.00** | 1.06 | 42.22 | 1.73 |
  | 1048576 | **1.00** | 1.03 | 34.73 | 1.46 |

  ### pure_ascii

  | Input Size | new_avx512 | new_sse2 | old_avx512 | old_sse2 |
  |------------|------------|----------|------------|----------|
  | 4 | 1.03 | **1.00** | 1.75 | 1.32 |
  | 8 | **1.00** | 1.14 | 3.89 | 2.06 |
  | 16 | **1.00** | 1.04 | 1.13 | 1.62 |
  | 32 | 1.07 | 1.19 | 5.11 | **1.00** |
  | 64 | **1.00** | 1.13 | 13.32 | 1.57 |
  | 128 | **1.00** | 1.01 | 19.97 | 1.55 |
  | 256 | **1.00** | 1.02 | 27.77 | 1.61 |
  | 1024 | **1.00** | 1.02 | 41.34 | 1.84 |
  | 4096 | 1.02 | **1.00** | 45.61 | 1.98 |
  | 16384 | 1.01 | **1.00** | 48.67 | 2.04 |
  | 65536 | **1.00** | 1.03 | 43.86 | 1.77 |
  | 262144 | **1.00** | 1.06 | 41.44 | 1.79 |
  | 1048576 | 1.02 | **1.00** | 35.36 | 1.44 |

  </details>

## Reproduction / Test Projects

Standalone validation tools: https://github.com/bonega/is-ascii-fix-validation

- `bench/` - Criterion benchmarks for SSE2 vs AVX-512 comparison
- `fuzz/` - Compares old/new implementations with libfuzzer

Relates to: llvm/llvm-project#176906
…r=joboet

Add missing mut to pin.rs docs

Per my understanding, needed for mut access next line.
Fix broken WASIp1 reference link

### Location (URL)
https://doc.rust-lang.org/rustc/platform-support/wasm32-wasip1.html

<img width="800" alt="image" src="https://github.com/user-attachments/assets/b9402b3a-db7b-405f-b4ef-d849c03ad893" />

### Summary
The WASIp1 reference link in the `wasm32-wasip1` platform documentation currently points to a path that no longer exists in the WASI repository.

The WASI project recently migrated the WASI 0.1 (preview1) documentation from the `legacy/preview1` directory to the dedicated `wasi-0.1` branch (WebAssembly/WASI#855).

This updates the link to point to the intended historical WASIp1 reference, which matches the documented intent of the `wasm32-wasip1` target.
@rust-bors rust-bors bot added the rollup A PR which is a rollup label Jan 26, 2026
@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jan 26, 2026
@jhpratt
Copy link
Member Author

jhpratt commented Jan 26, 2026

@bors r+ rollup=never p=4

@rust-bors
Copy link
Contributor

rust-bors bot commented Jan 26, 2026

📌 Commit efe56a5 has been approved by jhpratt

It is now in the queue for this repository.

@rust-bors rust-bors bot added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 26, 2026
@rust-bors

This comment has been minimized.

rust-bors bot pushed a commit that referenced this pull request Jan 26, 2026
Rollup of 4 pull requests

Successful merges:

 - #150353 (refactor rustc-hash integration)
 - #151611 (Improve is_ascii performance on x86_64 with explicit SSE2 intrinsics)
 - #150705 (Add missing mut to pin.rs docs)
 - #151639 (Fix broken WASIp1 reference link)
@rust-log-analyzer
Copy link
Collaborator

The job i686-gnu-1 failed! Check out the build log: (web) (plain enhanced) (plain)

Click to see the possible cause of the failure (guessed by this bot)
##[group]Runner Image Provisioner
Hosted Compute Agent
Version: 20260115.477
Commit: 4b342d620503cbe250a3154040964899ea7c9b00
Build Date: 2026-01-15T22:32:41Z
Worker ID: {c7ac181e-e7d5-4b20-abe7-8c4a908c2b11}
Azure Region: westcentralus
##[endgroup]
##[group]Operating System
Ubuntu
24.04.3
LTS
---
REPOSITORY                                   TAG       IMAGE ID       CREATED      SIZE
ghcr.io/dependabot/dependabot-updater-core   latest    354d02aa29ac   6 days ago   783MB
=> Removing docker images...
Deleted Images:
untagged: ghcr.io/dependabot/dependabot-updater-core:latest
untagged: ghcr.io/dependabot/dependabot-updater-core@sha256:596da3f22bcbdff2c96fd7126001278022c834c1621c5efa2ad1a7794590636c
deleted: sha256:354d02aa29acf525570c732b6e006ecf138de6d63ca525d552eb4b24880ddc6c
deleted: sha256:8b7af0e426bc2cbeeacfd96b8354d3b80016991520977197e62090e47abaede8
deleted: sha256:cadf11ef1de7fdd5eab563757942353684047f09b212dc99d6ed48e8acf34d62
deleted: sha256:569b0caf9d5285db44ccd2629a3470139eea755be423a33a54d8a24cb3926bfa
deleted: sha256:f9dc5feb048d8f9fd43137e3998f59e9acfbd76c47a4e14984d109654119e282
---
test [ui] tests/ui/extern/issue-64655-extern-rust-must-allow-unwind.rs#fat2 ... ok
test [ui] tests/ui/extern/issue-64655-extern-rust-must-allow-unwind.rs#fat3 ... ok
test [ui] tests/ui/extern/issue-80074.rs ... ok
test [ui] tests/ui/extern/issue-95829.rs ... ok
test [ui] tests/ui/extern/lgamma-linkage.rs ... ok
test [ui] tests/ui/extern/no-mangle-associated-fn.rs ... ok
test [ui] tests/ui/extern/not-in-block.rs ... ok
test [ui] tests/ui/extern/unsized-extern-derefmove.rs ... ok
test [ui] tests/ui/extern/windows-tcb-trash-13259.rs ... ok
test [ui] tests/ui/feature-gates/allow-features-empty.rs ... ok
---
test [ui] tests/ui/imports/ambiguous-2.rs ... ok
test [ui] tests/ui/imports/ambiguous-4.rs ... ok
test [ui] tests/ui/imports/ambiguous-7.rs ... ok
test [ui] tests/ui/imports/ambiguous-9.rs ... ok
test [ui] tests/ui/imports/ambiguous-panic-glob-vs-multiouter.rs ... ok
test [ui] tests/ui/imports/ambiguous-panic-globvsglob.rs ... ok
test [ui] tests/ui/imports/ambiguous-panic-no-implicit-prelude.rs ... ok
test [ui] tests/ui/imports/ambiguous-8.rs ... ok
test [ui] tests/ui/imports/ambiguous-glob-vs-expanded-extern.rs ... ok
test [ui] tests/ui/imports/ambiguous-panic-non-prelude-core-glob.rs ... ok
test [ui] tests/ui/imports/ambiguous-panic-non-prelude-std-glob.rs ... ok
---

---- [ui] tests/ui/process/process-panic-after-fork.rs stdout ----

error: test compilation failed although it shouldn't!
status: signal: 6 (SIGABRT) (core dumped)
command: env -u RUSTC_LOG_COLOR RUSTC_ICE="0" RUST_BACKTRACE="short" "/checkout/obj/build/i686-unknown-linux-gnu/stage2/bin/rustc" "/checkout/tests/ui/process/process-panic-after-fork.rs" "-Zthreads=1" "-Zsimulate-remapped-rust-src-base=/rustc/FAKE_PREFIX" "-Ztranslate-remapped-path-to-local-path=no" "-Z" "ignore-directory-in-diagnostics-source-blocks=/cargo" "-Z" "ignore-directory-in-diagnostics-source-blocks=/checkout/vendor" "--sysroot" "/checkout/obj/build/i686-unknown-linux-gnu/stage2" "--target=i686-unknown-linux-gnu" "--check-cfg" "cfg(test,FALSE)" "-O" "--error-format" "json" "--json" "future-incompat" "-Ccodegen-units=1" "-Zui-testing" "-Zdeduplicate-diagnostics=no" "-Zwrite-long-types-to-disk=no" "-Cstrip=debuginfo" "-o" "/checkout/obj/build/i686-unknown-linux-gnu/test/ui/process/process-panic-after-fork/a" "-A" "internal_features" "-A" "unused_parens" "-A" "unused_braces" "-Crpath" "-Cdebuginfo=0" "-Lnative=/checkout/obj/build/i686-unknown-linux-gnu/native/rust-test-helpers"
stdout: none
--- stderr -------------------------------
rustc: /checkout/src/llvm-project/llvm/include/llvm/ADT/DenseMap.h:696: bool llvm::DenseMapBase<DerivedT, KeyT, ValueT, KeyInfoT, BucketT>::LookupBucketFor(const LookupKeyT&, BucketT*&) [with LookupKeyT = unsigned int; DerivedT = llvm::SmallDenseMap<unsigned int, unsigned int, 2, llvm::DenseMapInfo<unsigned int>, llvm::detail::DenseMapPair<unsigned int, unsigned int> >; KeyT = unsigned int; ValueT = unsigned int; KeyInfoT = llvm::DenseMapInfo<unsigned int>; BucketT = llvm::detail::DenseMapPair<unsigned int, unsigned int>]: Assertion `!KeyInfoT::isEqual(Val, EmptyKey) && !KeyInfoT::isEqual(Val, TombstoneKey) && "Empty/Tombstone value shouldn't be inserted into map!"' failed.
------------------------------------------

---- [ui] tests/ui/process/process-panic-after-fork.rs stdout end ----

failures:

@rust-bors rust-bors bot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Jan 26, 2026
@rust-bors
Copy link
Contributor

rust-bors bot commented Jan 26, 2026

💔 Test for b614256 failed: CI. Failed job:

@jieyouxu
Copy link
Member

rustc: /checkout/src/llvm-project/llvm/include/llvm/ADT/DenseMap.h:696: bool llvm::DenseMapBase<DerivedT, KeyT, ValueT, KeyInfoT, BucketT>::LookupBucketFor(const LookupKeyT&, BucketT*&) [with LookupKeyT = unsigned int; DerivedT = llvm::SmallDenseMap<unsigned int, unsigned int, 2, llvm::DenseMapInfo, llvm::detail::DenseMapPair<unsigned int, unsigned int> >; KeyT = unsigned int; ValueT = unsigned int; KeyInfoT = llvm::DenseMapInfo; BucketT = llvm::detail::DenseMapPair<unsigned int, unsigned int>]: Assertion `!KeyInfoT::isEqual(Val, EmptyKey) && !KeyInfoT::isEqual(Val, TombstoneKey) && "Empty/Tombstone value shouldn't be inserted into map!"' failed.

Surprising, I only seen this on x86_64 macos before, I think?

@jhpratt
Copy link
Member Author

jhpratt commented Jan 26, 2026

No immediately obvious cause, and this is partially being "try"-d in #151667. Deferring to that.

@jhpratt jhpratt closed this Jan 26, 2026
@rustbot rustbot removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 26, 2026
@jhpratt jhpratt deleted the rollup-scDdbzg branch January 26, 2026 04:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

rollup A PR which is a rollup T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants