Skip to content

Relink don't rebuild: add a baseline, sound implementation that can be incrementally improved#155871

Open
susitsm wants to merge 15 commits intorust-lang:mainfrom
susitsm:relink-dont-rebuild-base
Open

Relink don't rebuild: add a baseline, sound implementation that can be incrementally improved#155871
susitsm wants to merge 15 commits intorust-lang:mainfrom
susitsm:relink-dont-rebuild-base

Conversation

@susitsm
Copy link
Copy Markdown

@susitsm susitsm commented Apr 27, 2026

View all comments

This PR implemenets a sound but not too useful implementation of relink don't rebuild with the unstable -Z public-api-hash flag. It currently uses the stable hash of all items in the metadata.

The goal is to give a base implementation that can be used for experimentation. It can be incrementally improved over time by removing or stable sorting items. The PR also adds new test attributes rustc_public_hash_changed and rustc_public_hash_unchanged and an example test using them.

What are non-goals for this PR: a useful, optimized implementation of RDR and public api hashing. That has many more challenges which will each require careful review.

A non exhaustive list of the challenges for the feature I ran into while trying to make the hash useful (this should probably go in a tracking issue, but I'm not aware of one)

  • Name resolution diagnostics use private items. Importing the private my_crate::func prints private function as the error
  • Stripped cfg items are included in rmeta for improved diagnostics.
  • Which types to include? A simple is this public is not enough. RPIT returned types and the private types used in the included mir for generic/inline functions are probably needed.
  • It might make sense to have multiple hashes. Rmeta generation (cargo check), could use a much simpler hash than codegen, it does not need private types from mir (or any mir at all?)

@rustbot
Copy link
Copy Markdown
Collaborator

rustbot commented Apr 27, 2026

Some changes occurred in compiler/rustc_attr_parsing

cc @jdonszelmann, @JonathanBrouwer

Some changes occurred in compiler/rustc_hir/src/attrs

cc @jdonszelmann, @JonathanBrouwer

Some changes occurred in compiler/rustc_passes/src/check_attr.rs

cc @jdonszelmann, @JonathanBrouwer

@rustbot rustbot added A-attributes Area: Attributes (`#[…]`, `#![…]`) S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Apr 27, 2026
@rustbot
Copy link
Copy Markdown
Collaborator

rustbot commented Apr 27, 2026

r? @jdonszelmann

rustbot has assigned @jdonszelmann.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

Why was this reviewer chosen?

The reviewer was selected based on:

  • Owners of files modified in this PR: compiler
  • compiler expanded to 73 candidates
  • Random selection from 20 candidates

@rust-log-analyzer

This comment has been minimized.

@susitsm susitsm force-pushed the relink-dont-rebuild-base branch from 0e63248 to 4b676bd Compare April 27, 2026 10:47
@jdonszelmann
Copy link
Copy Markdown
Contributor

This is a large change, might take me multiple review passes. I'll see if I can do the first today or tomorrow :)

Comment thread compiler/rustc_metadata/src/rmeta/mod.rs
Comment thread compiler/rustc_session/src/options.rs Outdated
Comment thread compiler/rustc_session/src/options/mitigation_coverage.rs Outdated
Comment thread compiler/rustc_session/src/options/mitigation_coverage.rs Outdated
Comment thread compiler/rustc_metadata/src/rmeta/encoder/public_api_hasher.rs Outdated
Comment thread compiler/rustc_metadata/src/rmeta/mod.rs Outdated
Comment thread compiler/rustc_middle/src/ich/hcx.rs Outdated
Comment thread compiler/rustc_metadata/src/rmeta/encoder/public_api_hasher.rs Outdated
Comment thread compiler/rustc_metadata/src/rmeta/encoder/public_api_hasher.rs Outdated
Comment thread compiler/rustc_metadata/src/rmeta/decoder/cstore_impl.rs
@jdonszelmann
Copy link
Copy Markdown
Contributor

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 28, 2026
@rustbot
Copy link
Copy Markdown
Collaborator

rustbot commented Apr 28, 2026

Reminder, once the PR becomes ready for a review, use @rustbot ready.

let mut public_api_hasher = PublicApiHasher::default();
let tcx = self.tcx;
let mut stats: Vec<(&'static str, usize)> = Vec::with_capacity(32);

Copy link
Copy Markdown
Member

@bjorn3 bjorn3 Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The encode_crate_deps below should record the public api hash rather than crate hash when doing rdr, right?

View changes since the review

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Short answer: you are right, that will improve it while keeping it sound. I will add that to this PR.

Long answer: one of the main goals of RDR is to enable early cutoff, including the public hash of all dependencies goes against that. We should only include public hashes of dependencies we reexport in some way, or better, only include the hash of the part we reexport. Where reexport here can mean pub use, inlinable/generic/const eval mir reachable through local inlinable/generic/const eval mir and some more stuff we are not aware of yet. Doing this correctly and maintainably is likely the single most technically challenging part of getting RDR right.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed it to use public_api_hash, added some comments. 044529d

@rust-bors

This comment has been minimized.

@susitsm susitsm force-pushed the relink-dont-rebuild-base branch from 4b676bd to 1682f38 Compare April 29, 2026 09:55
@susitsm susitsm force-pushed the relink-dont-rebuild-base branch from 1682f38 to ab0ab17 Compare April 29, 2026 10:12
@rustbot
Copy link
Copy Markdown
Collaborator

rustbot commented Apr 29, 2026

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

@susitsm
Copy link
Copy Markdown
Author

susitsm commented Apr 29, 2026

@rustbot review

@rustbot rustbot removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Apr 29, 2026
@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 29, 2026
@susitsm
Copy link
Copy Markdown
Author

susitsm commented Apr 29, 2026

Just realized that there is already a soundness hole: the crate_hash query itself. It won't be updated downstream if only the public hash changes. I'll have to review what exactly it is used for, maybe it is as simple as replacing it with the public hash. Except for proc macros, that might need some work

// should be added here as
// ```
// // FIXME do we need this // or a comment about why we need this
// let _ = my_new_field;
Copy link
Copy Markdown
Contributor

@jdonszelmann jdonszelmann Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this part is outdated now, about the let _ =

View changes since the review

// `hash_crate_root_public_api`" into `encode_my_new_field`
// 3. Only remove/change what is hashed in a separate PR. Removing items just from the hash
// should be done with extreme scrutiny. A better way might be to sort the query result
// in its provider, or filter which values we encode. That also helps with rmeta size.
Copy link
Copy Markdown
Contributor

@jdonszelmann jdonszelmann Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should maybe be a fixme at the end

View changes since the review

}
}

// When changed, make sure to update the hashing in `hash_crate_root_public_api`
Copy link
Copy Markdown
Contributor

@jdonszelmann jdonszelmann Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ideally, we of course made it so the hashing of that and the encoding here share some of the same source. Maybe by implementing a trait, or calling a method here that adds to the RDR hash, or returning a closure, idk. That way you don't have to think about all these comments and nonlocal subtleties

View changes since the review

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that, or even a rustc lint at some point that automatically prompts you if you've made a change here and didn't update the other file. not 100% sure yet what the logic there would be, but this, though better, still seems brittle

@jdonszelmann
Copy link
Copy Markdown
Contributor

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 29, 2026
Copy link
Copy Markdown
Contributor

@cezarbbb cezarbbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello, I have read your code and I think it is very well written, but I have some questions I would like to confirm.

View changes since this review

template!(List: &[r#"cfg = "...", crate_name = "...""#]),
|this, cx, args| {
this.items.extend(parse_rdr_fields(cx, args).map(|(cfg, crate_name)| {
(cx.attr_span, RDRFields { cfg, crate_name, changed: false })
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I misunderstanding something: rustc_public_hash_changed but change = false? Then the code that checks let green = !fields.changed; seems to be reversing this.
In other words, although a change has occurred, green is still set to true?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I likely messed this up when rebasing to main and converting to attribute parsers. There is also something wrong with running the tests. cpass revisions don't run them, they need at least bpass. Likely because I based it on the CGU reuse attributes, and inserted the runner code there.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's also exactly what I wanted to say. It is possible that none of the assertions in your test triggered a fatal error, yet the test passed silently. It's necessary to double check your test:>

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed this. I also moved the attribute checking code near #[rustc_clean] checking and added a bfail revision to the test to make sure it fires.

template!(List: &[r#"cfg = "...", crate_name = "...""#]),
|this, cx, args| {
this.items.extend(parse_rdr_fields(cx, args).map(|(cfg, crate_name)| {
(cx.attr_span, RDRFields { cfg, crate_name, changed: true })
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

return;
}

let green = !fields.changed;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reversal happened here.

@rust-bors

This comment has been minimized.

@susitsm susitsm force-pushed the relink-dont-rebuild-base branch from ab0ab17 to 981dee4 Compare April 30, 2026 18:04
@susitsm
Copy link
Copy Markdown
Author

susitsm commented Apr 30, 2026

Did an analysis on the usage of crate_hash by renaming it and everything it touches. It is mostly replaceable by public api hash, but there is a needs_crate_hash function that needs cleanup. It mixes whether HIR hashing is needed or the full crate_hash query is needed. I'll have to open a PR for that before continuing here

jhpratt added a commit to jhpratt/rust that referenced this pull request May 4, 2026
…nkov

Clean up `TyCtxt::needs_crate_hash` usage and rename it to `needs_hir_hash`.

While reviewing `crate_hash` query usage for rust-lang#155871, the `needs_crate_hash` function turned out to be the cause of unnecessary calls to the query. The `needs_crate_hash` name is easy to mistake for the functionality of "needs the crate_hash query". This PR removes the usage of `needs_crate_hash` where it was not appropriate and renames the function to `needs_hir_hash` which better reflects its functionality.
JonathanBrouwer added a commit to JonathanBrouwer/rust that referenced this pull request May 4, 2026
…nkov

Clean up `TyCtxt::needs_crate_hash` usage and rename it to `needs_hir_hash`.

While reviewing `crate_hash` query usage for rust-lang#155871, the `needs_crate_hash` function turned out to be the cause of unnecessary calls to the query. The `needs_crate_hash` name is easy to mistake for the functionality of "needs the crate_hash query". This PR removes the usage of `needs_crate_hash` where it was not appropriate and renames the function to `needs_hir_hash` which better reflects its functionality.
JonathanBrouwer added a commit to JonathanBrouwer/rust that referenced this pull request May 4, 2026
…nkov

Clean up `TyCtxt::needs_crate_hash` usage and rename it to `needs_hir_hash`.

While reviewing `crate_hash` query usage for rust-lang#155871, the `needs_crate_hash` function turned out to be the cause of unnecessary calls to the query. The `needs_crate_hash` name is easy to mistake for the functionality of "needs the crate_hash query". This PR removes the usage of `needs_crate_hash` where it was not appropriate and renames the function to `needs_hir_hash` which better reflects its functionality.
JonathanBrouwer added a commit to JonathanBrouwer/rust that referenced this pull request May 4, 2026
…nkov

Clean up `TyCtxt::needs_crate_hash` usage and rename it to `needs_hir_hash`.

While reviewing `crate_hash` query usage for rust-lang#155871, the `needs_crate_hash` function turned out to be the cause of unnecessary calls to the query. The `needs_crate_hash` name is easy to mistake for the functionality of "needs the crate_hash query". This PR removes the usage of `needs_crate_hash` where it was not appropriate and renames the function to `needs_hir_hash` which better reflects its functionality.
@susitsm
Copy link
Copy Markdown
Author

susitsm commented May 4, 2026

After diving into how the crate hash is used during dependency loading, my conclusion is, with the current resolver, it is not possible to fully skip rustc invocations for libraries where only private parts of upstream dependencies changed. The resolver uses the crate hash of the dependencies saved inside the metadata to load transitive dependencies. Leaving something out of this hash is unsound.

I recommend reading the module level documentation of locator, but here is a short example showing the problem:

my_crate -> dep1 -> transitive_dep_with_private_hash1
         -> dep2 -> transitive_dep_with_private_hash2

The direct dependencies are located by name, but the transitive dependencies are located by name + crate hash of the transitive dependencies loaded from the metadata of dep1/dep2. This allows using different versions of crates by different dependencies. Here the hash is used to uniquely identify the crate, which needs private parts.

The solution(s)

Public api hash includes the stable crate id, it should be fine (for now)

StableCrateId collisions can happen, they are actively checked for during compilation. However, if we have one, rustc terminates. They don't happen in practice, that wouldn't make rustc too reliable. From the docs of StableCrateId:

For a big crate graph with 1000 crates in it, there is
a probability of 1 in 36,890,000,000,000 of a StableCrateId collision.

so it should be fine while this is not stabilized, the amount of miscompilations should be zero. Ofc, this is only true with battle tested build systems like cargo, which use -Cmetadata properly.

Save paths + public hash instead of full hashes

We save the path of where to look for a dependency and use that with the public hash to look up dependencies. This could improve resolver speed, while sacrificing some portability of rmetas (not sure of the real world use case where this would cause a problem.) This would also allow users to replace a dependency with a different one that has the same public api hash, then just relink. Which kind of clicks with the spirit of relink-dont-rebuild.

Add some kind of fast path that only updates the crate and dependency hashes when public hash did not change

The upside to this is that we don't have to touch cargo to make the RDR feature good (would still be an improvement, but a lot less important.) All other tries for this feature included changes to cargo, to skip rustc invocations. This would need to be handled in all alternate build systems as well. If we add this fast path, there is no need to do that. Downside is that I have no idea how hard this is, but seems complicated.

I'm leaning towards save paths + public hash instead of full hashes. Any input is appreciated here.

JonathanBrouwer added a commit to JonathanBrouwer/rust that referenced this pull request May 4, 2026
…nkov

Clean up `TyCtxt::needs_crate_hash` usage and rename it to `needs_hir_hash`.

While reviewing `crate_hash` query usage for rust-lang#155871, the `needs_crate_hash` function turned out to be the cause of unnecessary calls to the query. The `needs_crate_hash` name is easy to mistake for the functionality of "needs the crate_hash query". This PR removes the usage of `needs_crate_hash` where it was not appropriate and renames the function to `needs_hir_hash` which better reflects its functionality.
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request May 5, 2026
…nkov

Clean up `TyCtxt::needs_crate_hash` usage and rename it to `needs_hir_hash`.

While reviewing `crate_hash` query usage for rust-lang#155871, the `needs_crate_hash` function turned out to be the cause of unnecessary calls to the query. The `needs_crate_hash` name is easy to mistake for the functionality of "needs the crate_hash query". This PR removes the usage of `needs_crate_hash` where it was not appropriate and renames the function to `needs_hir_hash` which better reflects its functionality.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-attributes Area: Attributes (`#[…]`, `#![…]`) S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants