obligations_for_self_ty: skip irrelevant goals #146759
lcnr wants to merge 2 commits into rust-lang:main
|
Some changes occurred to the core trait solver cc @rust-lang/initiative-trait-system-refactor changes to |
|
@bors try @rust-timer queue |
obligations_for_self_ty: skip irrelevant goals
|
@bors try @rust-timer queue |
obligations_for_self_ty: skip irrelevant goals
|
Finished benchmarking commit (ba651ad): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

@bors rollup=never

Instruction count
Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

Max RSS (memory usage)
This benchmark run did not return any relevant results for this metric.

Cycles
Results (secondary -3.0%)
A less reliable metric. May be of interest, but not used to determine the overall result above.

Binary size
Results (secondary 0.0%)
A less reliable metric. May be of interest, but not used to determine the overall result above.

Bootstrap: 470.95s -> 473.304s (0.50%)
let sub_root_var = self.sub_unification_table_root_var(self_ty);
let obligations = self
    .fulfillment_cx
    .borrow()
    .pending_obligations_potentially_referencing_sub_root(sub_root_var);
|
Is this like, important, or is it "just" a small perf win? |
452fdbf to 9926fa3
|
@rustbot ready see pr descr |
I feel relatively uneasy about this. I don't like that this optimization makes obligations_for_self_ty wrong if try_evaluate_obligations hasn't been previously called in order to update all the stalled_on vars.
If this were only a theoretical issue it'd still be pretty bad since I don't think we'd be able to expect it to not happen in practice in the long term. In practice it should be possible for unsize coercion to hit this with its custom obligation evaluation loop and inability to call try_evaluate_goals.
I think the only way I could approve this is if I thought it was ok for obligations_for_self_ty to have incorrect behaviour in some edge cases, and I don't think that is the case.
I'm not sure what the right fix is here given that unsize coercion does actually hit this in theory, so asserting that we're always calling obligations_for_self_ty in "good" cases doesn't actually work ^^'
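To make the concern above concrete, here is a minimal toy model of it, not rustc code. The `SubRoots`, `Pending`, `refresh`, and `potentially_referencing` names are all hypothetical: `SubRoots` stands in for the sub-unification table, `refresh` stands in for an evaluation pass that updates the cached stalled_on roots, and the filter mirrors the idea of skipping obligations whose cached roots do not match. Under those assumptions, a query made after a unification but before a refresh silently misses a relevant obligation.

```rust
/// Toy stand-in for the sub-unification table: a tiny union-find over variable ids.
struct SubRoots {
    parent: Vec<usize>,
}

impl SubRoots {
    fn new(n: usize) -> Self {
        SubRoots { parent: (0..n).collect() }
    }

    /// Follow parent links until reaching a variable that is its own parent.
    fn find(&self, mut v: usize) -> usize {
        while self.parent[v] != v {
            v = self.parent[v];
        }
        v
    }

    /// Merge the classes of `a` and `b`; the root of `a` becomes the new root.
    fn union(&mut self, a: usize, b: usize) {
        let (ra, rb) = (self.find(a), self.find(b));
        self.parent[rb] = ra;
    }
}

/// Hypothetical pending obligation: `vars` are the variables it mentions,
/// `stalled_on` caches their roots as of the last refresh (if any).
struct Pending {
    name: &'static str,
    vars: Vec<usize>,
    stalled_on: Option<Vec<usize>>,
}

/// Stand-in for an evaluation pass that recomputes every cached root.
fn refresh(pending: &mut [Pending], table: &SubRoots) {
    for p in pending.iter_mut() {
        p.stalled_on = Some(p.vars.iter().map(|&v| table.find(v)).collect());
    }
}

/// The optimized filter: trust the cache when present, otherwise be conservative.
fn potentially_referencing(pending: &[Pending], root: usize) -> Vec<&'static str> {
    pending
        .iter()
        .filter(|p| match &p.stalled_on {
            Some(roots) => roots.contains(&root),
            None => true,
        })
        .map(|p| p.name)
        .collect()
}

/// Returns the query result with a stale cache, then with a refreshed one.
fn stale_then_fresh() -> (Vec<&'static str>, Vec<&'static str>) {
    let mut table = SubRoots::new(2);
    let mut pending = vec![Pending { name: "?1: Trait", vars: vec![1], stalled_on: None }];
    refresh(&mut pending, &table); // cache now records root 1 for variable 1
    table.union(0, 1); // variable 1's root becomes 0; the cache is not updated
    let root = table.find(0);
    let stale = potentially_referencing(&pending, root);
    refresh(&mut pending, &table);
    let fresh = potentially_referencing(&pending, root);
    (stale, fresh)
}
```

In this model the stale query returns nothing while the refreshed query returns the obligation, which is the kind of edge case the review is worried about when a caller cannot guarantee a prior refresh.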
It is totally okay for |
|
☔ The latest upstream changes (presumably #155953) made this pull request unmergeable. Please resolve the merge conflicts. |
|
I ran into the same hotspot while profiling Would it address the issue if The added cost is a small visit at registration time, which seems cheap Sharing as a possible angle to unblock — happy to defer on whether it's |
|
Went and built the eager-init variant on top of this PR to make the discussion more concrete. Sharing two ways to look at it; feel free to ignore if you'd rather keep this PR focused.

Branch:

Just the eager-init commit if you want to cherry-pick directly onto your branch:

Inline diff (the only delta vs this PR):

 type PendingObligations<'tcx> = ThinVec<(
     PredicateObligation<'tcx>,
     Option<GoalStalledOn<TyCtxt<'tcx>>>,
+    // Initial sub_roots, captured at register time. Used when stalled_on
+    // is None (i.e. before the obligation has been evaluated).
+    ThinVec<TyVid>,
 )>;
+fn collect_initial_sub_roots<'tcx>(
+    infcx: &InferCtxt<'tcx>,
+    predicate: ty::Predicate<'tcx>,
+) -> ThinVec<TyVid> {
+    // walk predicate; for each ty::TyVar, push sub_unification_table_root_var
+}
 impl<'tcx> ObligationStorage<'tcx> {
-    fn register(&mut self, obligation: ..., stalled_on: ...) {
-        self.pending.push((obligation, stalled_on));
+    fn register(&mut self, infcx: &InferCtxt<'tcx>, obligation: ..., stalled_on: ...) {
+        let initial_sub_roots = collect_initial_sub_roots(infcx, obligation.predicate);
+        self.pending.push((obligation, stalled_on, initial_sub_roots));
     }
     fn clone_pending_potentially_referencing_sub_root(&self, vid: TyVid) -> ... {
-            .filter(|(_, stalled_on)| {
-                if let Some(stalled_on) = stalled_on {
-                    stalled_on.sub_roots.iter().any(|&r| r == vid)
-                } else {
-                    true
-                }
+            .filter(|(_, stalled_on, initial_sub_roots)| {
+                let sub_roots = if let Some(stalled_on) = stalled_on {
+                    &stalled_on.sub_roots[..]
+                } else {
+                    &initial_sub_roots[..]
+                };
+                sub_roots.iter().any(|&r| r == vid)
             })
     }
 }

Local test status (stage1, aarch64-apple-darwin):
Caveats I'm aware of:
No perf numbers from me — wg-grammar would be the natural target but I don't have the setup. Happy to fold this into your PR (you keep authorship of the original two commits + add my one on top), land it as a follow-up after this merges, or just close the loop here. Whatever's least friction for you. |
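For readers who want to try the eager-init idea outside the compiler, the following is a rough standalone sketch of it with toy types. Everything here is an assumption for illustration: `Storage`, `Pending`, and `potentially_referencing` are hypothetical names, variables are plain `usize` ids, and the `root_of` closure stands in for `sub_unification_table_root_var`. It only shows the shape of the filter from the diff above: use the roots from stalled_on when an evaluation has recorded them, and fall back to the roots captured at registration otherwise.

```rust
/// Hypothetical stand-in for a pending obligation.
struct Pending {
    name: &'static str,
    /// Sub-roots recorded by an evaluation, if one has happened.
    stalled_on: Option<Vec<usize>>,
    /// Sub-roots captured when the obligation was registered.
    initial_sub_roots: Vec<usize>,
}

#[derive(Default)]
struct Storage {
    pending: Vec<Pending>,
}

impl Storage {
    /// Register an obligation mentioning `vars`, capturing their roots eagerly.
    /// `root_of` is an assumed stand-in for the real root lookup.
    fn register(&mut self, name: &'static str, vars: &[usize], root_of: impl Fn(usize) -> usize) {
        let initial_sub_roots = vars.iter().map(|&v| root_of(v)).collect();
        self.pending.push(Pending { name, stalled_on: None, initial_sub_roots });
    }

    /// Names of obligations whose known sub-roots include `root`.
    fn potentially_referencing(&self, root: usize) -> Vec<&'static str> {
        self.pending
            .iter()
            .filter(|p| {
                let sub_roots = match &p.stalled_on {
                    Some(s) => &s[..],
                    None => &p.initial_sub_roots[..],
                };
                sub_roots.contains(&root)
            })
            .map(|p| p.name)
            .collect()
    }
}

/// Three never-evaluated obligations; only those mentioning variable 0 are returned.
fn example() -> Vec<&'static str> {
    let mut storage = Storage::default();
    let identity = |v: usize| v; // every variable is its own root in this toy setup
    storage.register("?0: Clone", &[0], identity);
    storage.register("?1: Debug", &[1], identity);
    storage.register("?0: PartialEq<?2>", &[0, 2], identity);
    storage.potentially_referencing(0)
}
```

With the conservative fallback (`None => true`) all three unevaluated obligations would be returned; here only the two that mention variable 0 are. The sketch does not model roots changing after registration, so it says nothing about whether the captured roots stay valid once later unifications happen.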
|
I've been thinking of this PR again as you've done the change to if If I think the rest of this PR is just good to go, would you be up to taking this over and opening it as a new PR 🤔
|
Thank you very much! I'm happy to take over anything you need! Sorry for my bad English, but can I check my understanding? Also: should I wait for #156172 to be merged, or open a PR that depends on it?
That PR only introduces a permanently unstable compiler flag, so it's probably only relevant if you're planning to add tests with something like |
|
@ShoyuVanilla Thank you very much! It's very clear! |
|
closing in favor of #156187 |
Reduces the compile time of wg-grammar from more than 70s to about 40s. So a >30% perf improvement for that crate.

r? @BoxyUwU @compiler-errors