-
-
Notifications
You must be signed in to change notification settings - Fork 14.4k
New MIR Pass: SsaRangePropagation #150309
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@bors try @rust-timer queue (I want to test if try builds work and this PR seems like a good fit for a perf. run :) ) |
This comment has been minimized.
This comment has been minimized.
[EXPERIMENT] New MIR Pass: SsaRangePropagation
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
Finished benchmarking commit (d9c5f75): comparison URL. Overall result: ❌✅ regressions and improvements - please read the text belowBenchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary -0.9%, secondary 1.7%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary 1.8%, secondary -1.6%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeResults (primary -0.0%, secondary -0.1%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 482.582s -> 484.115s (0.32%) |
|
@bors try parent=99ff3fbb86658b427f5dd7daaae8db5626a63c26 @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
[EXPERIMENT] New MIR Pass: SsaRangePropagation
This comment has been minimized.
This comment has been minimized.
|
Finished benchmarking commit (edacb2e): comparison URL. Overall result: ❌✅ regressions and improvements - please read the text belowBenchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary -0.2%, secondary 1.6%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary 1.8%, secondary -1.3%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeResults (primary -0.0%, secondary -0.1%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 482.582s -> 482.879s (0.06%) |
This comment has been minimized.
This comment has been minimized.
|
The pass is fast, and it reduced MIR body definitely. All regressions are from opt, and there are probably more inlining. |
|
Some changes occurred to MIR optimizations cc @rust-lang/wg-mir-opt |
|
r? @nnethercote rustbot has assigned @nnethercote. Use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for coding this! I don't have many comments on the design, this pass is very clear. I mostly have nits/suggestions on the implementation.
| if let Err(Some(place)) = self.simplify_operand(cond, location) { | ||
| let successor = Location { block: *target, statement_index: 0 }; | ||
| if location.block != successor.block | ||
| && self.unique_predecessors.contains(successor.block) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why check location.block != successor.block? We know that location.block has at least one predecessor which is reachable from bb0, so which is not itself. So if successor.block has a unique predecessor, it must be different from location.block, mustn't it?
I'm tempted to turn the location.block != successor.block into an assertion, or do you have a test to exhibit this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, and I can change to location.dominates(successor, self.dominators) for assert.
| } | ||
| let successor = Location { block: target, statement_index: 0 }; | ||
| if location.block != successor.block | ||
| && self.unique_predecessors.contains(successor.block) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, the unique_predecessors check should be equivalent to location.dominates(successor, self.dominators), isn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider the test case on_match_2 and on_if_2. They are not equivalent for the SwitchInt terminator.
| } | ||
|
|
||
| let otherwise = Location { block: targets.otherwise(), statement_index: 0 }; | ||
| if place.ty(self.local_decls, self.tcx).ty.is_bool() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mind leaving a FIXME to extend this to other types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have left a FIXME. Those FIXMEs can be fixed when it shows beneficial.
| //! let b = a < 9; | ||
| //! if b { | ||
| //! let c = b; // c is true since b is within the range [1, 2) | ||
| //! let d = a < 8; // d is true since b within the range [0, 9) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand these comments. b is boolean, how can it be within a numeric range? And where does the [1, 2) come from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the integer representation in MIR, so the full range of boolean is [0, 2).
|
Requested reviewer is already assigned to this pull request. Please choose another assignee. |
| // FIXME: This should use the intersection of all valid ranges. | ||
| let (_, range) = | ||
| ranges.iter().find(|(range_loc, _)| range_loc.dominates(location, &self.dominators))?; | ||
| Some(*range) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't have a better range, should we fallback to the type-based range? For discriminants in particular?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I will reland #148443 as a subsequent PR.
| continue; | ||
| } | ||
| let successor = Location { block: target, statement_index: 0 }; | ||
| if self.unique_predecessors.contains(successor.block) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use location.dominates(successor, &self.dominators) here too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. SwitchInt has multiple targets. For instance, bb0 dominates bb2, but the match value can be 1 or 2 in bb2. on_if_2 show this case.
graph TD;
bb0(["bb0: match(1:bb1, 2:bb2)"])-->|1|bb1;
bb0-->|2|bb2;
bb1-->bb2;
There is another case. The test case is on_match_2. bb0 dominates bb1, but the match value can be 1 or 2.
graph TD;
bb0(["bb0: match(1:bb1, 2:bb1)"])-->|1|bb1;
bb0-->|2|bb1;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah too bad, I would have loved to get rid of unique_predecessors...
|
Thanks! |
This comment has been minimized.
This comment has been minimized.
What is this?This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.Comparing 9b37157 (parent) -> 3d087e6 (this PR) Test differencesShow 8 test diffsStage 1
Stage 2
Additionally, 6 doctest diffs were found. These are ignored, as they are noisy. Job group index
Test dashboardRun cargo run --manifest-path src/ci/citool/Cargo.toml -- \
test-dashboard 3d087e6044bddc65723bf42c76fe4cc33a0076b0 --output-dir test-dashboardAnd then open Job duration changes
How to interpret the job duration changes?Job durations can vary a lot, based on the actual runner instance |
|
Finished benchmarking commit (3d087e6): comparison URL. Overall result: ❌✅ regressions and improvements - please read the text belowOur benchmarks found a performance regression caused by this PR. Next Steps:
@rustbot label: +perf-regression Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary -1.7%, secondary 0.7%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesThis benchmark run did not return any relevant results for this metric. Binary sizeResults (primary -0.1%, secondary -0.1%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 474.437s -> 473.646s (-0.17%) |
|
perf triage: pre-merge results roughly match the final results. Based on the assesment from #150309 (comment), I assume some small regressions in @rustbot label: +perf-regression-triaged Note that I can't asses very well whether this is justified or not as I don't have that much context. The original motivation for mir-opts was improving compile time, which is what I'm triaging here. From that standpoint this is now negative and it's only acceptable if it later leads to more improvements. As far as I can tell, the motivation here is to improve runtime performance, for which we don't have much coverage in the benchmarks, so from that standpoint it's also not obvious to me whether this is worth the cost. Somebody with more knowledge than me should make that judgement. |
|
At least for now, the pass probably cannot improve runtime performance because all things should have been done in LLVM. IMO, smaller MIR may be improvements or regressions. For instance, inlining :( or some new opportunities that have more constants are found in LLVM. The regressions are smaller. I don't think it's too bad. |
As an alternative to #150192.
Introduces a new pass that propagates the known ranges of SSA locals.
We can know the ranges of SSA locals at some locations for the following code:
This PR only implements a trivial range: we know one value on switch, assert, and assume.