
Configure the compiler ProgramDb once per project#11768

Open
sheaf wants to merge 1 commit into haskell:master from sheaf:conf-comp-progs

Conversation

@sheaf
Collaborator

@sheaf sheaf commented Apr 27, 2026

This PR ensures we configure the compiler program database (consisting of ghc and attendant programs such as ghc-pkg and haddock, plus toolchain programs such as ar, ld, etc.) ahead of time within cabal-install, so that we don't need to reconfigure programs such as ghc, ghc-pkg, and haddock once per package within Cabal.

This should be a net performance win without any other change in behaviour.
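As a rough sketch of the idea (purely illustrative Haskell, not the actual Cabal types or API), a project-wide program database can be filled on first use and then shared by every package, instead of being rebuilt once per package:

```haskell
import Data.IORef

-- Hypothetical sketch: all names here ('configureProgram', 'ProgramDb',
-- 'requireProgram') are illustrative stand-ins, not Cabal's real API.

type ProgramName = String
type ConfiguredProgram = FilePath  -- stand-in for Cabal's richer type

-- Pretend configuration is expensive (in Cabal it may involve running
-- the program, querying its version, probing the toolchain, etc.).
configureProgram :: ProgramName -> IO ConfiguredProgram
configureProgram name = pure ("/usr/bin/" ++ name)

-- A project-wide database, shared by every package in the build plan.
newtype ProgramDb = ProgramDb (IORef [(ProgramName, ConfiguredProgram)])

newProgramDb :: IO ProgramDb
newProgramDb = ProgramDb <$> newIORef []

-- Look up a program, configuring it at most once per project.
requireProgram :: ProgramDb -> ProgramName -> IO ConfiguredProgram
requireProgram (ProgramDb ref) name = do
  db <- readIORef ref
  case lookup name db of
    Just p  -> pure p            -- already configured: no rework
    Nothing -> do
      p <- configureProgram name -- first use: configure and remember
      modifyIORef' ref ((name, p) :)
      pure p
```

Every subsequent `requireProgram db "ghc"` then returns the cached result rather than re-running the discovery logic per package.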


Template A: This PR modifies behaviour or interface

  • Patches conform to the coding conventions.
  • Any changes that could be relevant to users have been recorded in the changelog.
  • The documentation has been updated, if necessary.
  • No manual QA notes necessary (no change to the command line interface).
  • Testing approach: the subtle tests like BuildToolPaths ensure we don't regress when users pass --with-alex (for example).

@sheaf sheaf force-pushed the conf-comp-progs branch 4 times, most recently from 7cfb145 to 83ef876 on April 29, 2026 12:03
@sheaf sheaf marked this pull request as ready for review April 29, 2026 12:04
@sheaf
Collaborator Author

sheaf commented Apr 29, 2026

This is a thorny area, see e.g. #10692 (commits 24f8395 and 2c19bf3) and #11373 (040a97d). So I would appreciate a careful review.

@sheaf
Collaborator Author

sheaf commented Apr 30, 2026

I have put up a spreadsheet here, compiling pandoc from scratch and using the --build-timings feature from #11769.

Base average time spent in configure: 2.9 seconds.
With this patch: it goes down to 1.2 seconds.
With #11767: it goes down to 1.7 seconds.
With both: it goes down to 0.15 seconds.

@Mikolaj
Member

Mikolaj commented Apr 30, 2026

This really looks tricky --- so few changes, but such a great effect.

Would it be possible to add some tests that try to break things by performing a few scenarios that previously took many configuration invocations and now take just one? Maybe do these twice for good measure?

A question: does the "once per project" mean once per project per cabal run, or until cabal clean happens or something else invalidates things?

@Mikolaj Mikolaj added the "intricate" label (potentially very hard to review, but worth it) Apr 30, 2026
Comment thread cabal-install/src/Distribution/Client/ProjectPlanning.hs
Comment thread cabal-install/src/Distribution/Client/ProjectPlanning.hs Outdated
Comment thread cabal-install/src/Distribution/Client/ProjectPlanning.hs Outdated
Comment thread cabal-install/src/Distribution/Client/SetupWrapper.hs
@sol
Member

sol commented May 3, 2026

@sheaf I think when I first saw this I misinterpreted your numbers. Are you saying that if you have 10 packages on the critical path that these two PRs will speed up your build by almost 30 seconds?

@sheaf
Collaborator Author

sheaf commented May 4, 2026

@sheaf I think when I first saw this I misinterpreted your numbers. Are you saying that if you have 10 packages on the critical path that these two PRs will speed up your build by almost 30 seconds?

In practice the impact is lesser because one is usually configuring a package while one is building another package, so running configure is very rarely on the critical path. For example, in one of my tests, building pandoc took 10m30s instead of 11m, even though the total time saved in configure is on the order of 7 minutes.

@sheaf
Collaborator Author

sheaf commented May 4, 2026

Would it be possible to add some tests that try to break things by performing a few scenarios that previously took many configuration invocations and now take just one? Maybe do these twice for good measure?

No, I don't think doing random scenarios is really what it takes to test this. It takes a very carefully constructed test case to trigger issues here, and that is precisely what Matthew did when he added tests such as ExtraProgPathLocal and BuildToolDependsExternal. I can't think of anything else worth testing.

A question: does the "once per project" mean once per project per cabal run, or until cabal clean happens or something else invalidates things?

The (unchanged) recompilation logic ensures that the result of configuring the compiler is cached, so it would be until a cabal clean runs or something else invalidates the cache.

@sheaf sheaf force-pushed the conf-comp-progs branch from 83ef876 to aa71936 on May 4, 2026 09:15
@sol
Member

sol commented May 4, 2026

With cabal doctest broken due to #11373, it's somewhat hard for me to muster excitement for this change.

For me personally, a net speed up of 5% is not impactful enough to buy into a more stateful design.

The only situation where I'm personally eager to do any sort of caching is if I can key the cache entries over their inputs (content addressable), so that you don't have to think about cache invalidation. This often also means that you can ignore the cache when reasoning about your code in general.
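The content-addressed idea can be sketched like so (illustrative only, not Cabal code; `cachedBy` is a hypothetical helper): entries are keyed by their inputs, so changed inputs simply produce a different key, and there is no invalidation step to get wrong.

```haskell
import Data.IORef

-- Illustrative sketch of an input-keyed ("content-addressed") cache.
-- A real implementation would hash the inputs; 'show' stands in here.
type Key = String

-- Memoise an IO action over (a rendering of) its input: a cache hit
-- can only ever occur for identical inputs, so stale results are
-- never returned and no explicit invalidation is needed.
cachedBy :: Show i => IORef [(Key, o)] -> (i -> IO o) -> i -> IO o
cachedBy ref f input = do
  let key = show input          -- stand-in for a digest of the inputs
  cache <- readIORef ref
  case lookup key cache of
    Just o  -> pure o           -- same inputs: reuse the result
    Nothing -> do
      o <- f input              -- new inputs: compute and store
      modifyIORef' ref ((key, o) :)
      pure o
```

The design benefit sol describes falls out directly: when reasoning about the code, the cache is observationally invisible, because a lookup can only succeed for the exact inputs that produced the entry.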

That said, I haven't looked at the code at any length.

In terms of priorities, again personally, I would much prefer to have a release that addresses existing regressions like #11373.

As a downstream consumer it just does not feel nice having your software broken by every other Cabal or GHC release.

(It also doesn't help my excitement that the other day I was spending an hour debugging a test suite order dependency caused exactly by GHC caching something in a global "variable".)

@sheaf
Collaborator Author

sheaf commented May 4, 2026

With cabal doctest broken due to #11373, it's somewhat hard for me to muster excitement for this change.

For me personally, a net speed up of 5% is not impactful enough to buy into a more stateful design.

I'm coming at this from an architectural perspective: when cabal-install orchestrates a build plan, it has decided ahead of time on the compiler and toolchain that will be used for the project. More abstractly, while the package author supplies the contents of the package and the way to build it, it is not the package author who controls what the package is built with; it is the user of the package, by way of the build system they are using (in this case cabal-install).

I'm only really interested in the benchmarking numbers insofar as they validate the approach: there is a lot of work being done repeatedly per package, and all this work does is rediscover information that was already known ahead of time. This is one of the central observations of Haskell Tech Proposal #60, and its implementation (finalised in 6867dd5) is what allows us to redesign how cabal-install works to achieve a more modular design.

In terms of priorities, again personally, I would much prefer to have a release that addresses existing regressions like #11373.

I'm not involved in release management, so whatever work I am doing here has little bearing on releases. Note as well that I worked with Matthew at the time to implement the fix for #11373, and am also committed to fixing any bugs that would arise from this PR. I'm more than happy to smoke test the change against any packages you are concerned might be affected to avoid any possibility of breakage.

@sheaf
Collaborator Author

sheaf commented May 7, 2026

I would like to correct my previous comment that the configure phase is rarely on the critical path. In fact, I think the opposite is true: as other parts of the build become parallelisable (in particular with -jsem allowing intra-package parallelism), the bottleneck will become the configure steps, which are completely serial. So, as per Amdahl's law, I think speeding up configure has the potential to bring quite significant benefits when compiling with -jsem.

I have done additional benchmarking, this time building aeson with -j1, and observe a 40s saving on average (a 3m45s build goes down to 3m5s), a 17% reduction. Admittedly this is the best possible benchmark for this change, but given the argument above I think it shows how much we stand to gain.
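To make the Amdahl's-law argument concrete (the formula is standard; the 40s/3m45s figures are the aeson -j1 measurement above, and treating configure as fully eliminated is an idealisation):

```haskell
-- Amdahl's law: if a fraction p of the total time is sped up by a
-- factor s, the overall speedup is 1 / ((1 - p) + p / s).
amdahl :: Double -> Double -> Double
amdahl p s = 1 / ((1 - p) + p / s)

-- aeson with -j1: 3m45s (225s) total, of which ~40s is serial configure.
configFraction :: Double
configFraction = 40 / 225

-- Upper bound if configure were eliminated entirely (s -> infinity):
bestCase :: Double
bestCase = 1 / (1 - configFraction)  -- about 1.22x, i.e. 225s -> 185s
```

The more the rest of the build is parallelised (e.g. with -jsem), the larger the serial configure fraction p becomes as a share of wall-clock time, and with it the payoff of this change.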


Labels

attention: needs-review · intricate (potentially very hard to review, but worth it)
