Configure the compiler ProgramDb once per project #11768
sheaf wants to merge 1 commit into haskell:master
This really looks tricky: so few changes, but such a great effect. Would it be possible to add some tests that try to break things by performing a few scenarios that previously took many configuration invocations and now take just one? Maybe do these twice for good measure? A question: does the "once per project" mean once per project per cabal run, or until
@sheaf I think when I first saw this I misinterpreted your numbers. Are you saying that if you have 10 packages on the critical path, these two PRs will speed up your build by almost 30 seconds?
In practice the impact is smaller, because one is usually configuring a package while building another package, so running
No, I don't think running random scenarios is really what it takes to test this. It takes a very carefully constructed test case to trigger issues here, and that is precisely what Matthew did when he added tests such as
The (unchanged) recompilation logic ensures that the result of configuring the compiler is cached, so it would be until a |
Having For me personally, a net speed up of 5% is not impactful enough to buy into a more stateful design. The only situation where I'm personally eager to do any sort of caching is if I can key the cache entries over their inputs (content addressable), so that you don't have to think about cache invalidation. This often also means that you can ignore the cache when reasoning about your code in general. That said, I haven't looked at the code at any length. In terms of priorities, again personally, I would much prefer to have a release that addresses existing regressions like #11373. As a downstream consumer it just does not feel nice having your software broken by every other Cabal or GHC release. (It also doesn't help my excitement that the other day I was spending an hour debugging a test suite order dependency caused exactly by GHC caching something in a global "variable".) |
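To make the "key the cache entries over their inputs" idea concrete, here is a minimal sketch in Haskell. It is not code from this PR or from Cabal: `ConfigInputs`, `configureWith` and `cached` are hypothetical names, and the "configuration result" is just a string. The point it illustrates is that when the key contains every input, changing an input yields a different key, so no separate invalidation step is needed.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical record of everything the configuration result depends on.
data ConfigInputs = ConfigInputs
  { compilerPath :: FilePath
  , extraFlags   :: [String]
  } deriving (Eq, Ord, Show)

-- The cache maps inputs directly to the result computed from them.
type Cache = Map.Map ConfigInputs String

-- Pure stand-in for an expensive configuration step (an assumption for
-- illustration; real configuration would run external programs).
configureWith :: ConfigInputs -> String
configureWith inputs =
  unwords (compilerPath inputs : extraFlags inputs)

-- Look the inputs up; on a miss, compute the result and store it under the
-- inputs themselves. The Bool reports whether this was a cache hit.
cached :: ConfigInputs -> Cache -> (String, Cache, Bool)
cached inputs cache =
  case Map.lookup inputs cache of
    Just result -> (result, cache, True)
    Nothing ->
      let result = configureWith inputs
      in (result, Map.insert inputs result cache, False)

main :: IO ()
main = do
  let inputs = ConfigInputs "ghc" ["-O1"]
      (_, cache1, hit1) = cached inputs Map.empty
      (_, cache2, hit2) = cached inputs cache1
      -- Changing an input changes the key, so the stale entry is never used.
      (_, _, hit3) = cached (inputs { extraFlags = ["-O2"] }) cache2
  print [hit1, hit2, hit3] -- prints [False,True,False]
```

The second lookup hits because the inputs are identical, and the third misses because the flags differ. The old entry is simply never consulted again.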
I'm coming at this from an architectural perspective: I'm only really interested in the benchmarking numbers insofar as they validate the approach. There is a lot of work being done repeatedly per-package, and all this work does is rediscover information that was already known ahead of time. This is one of the central observations of Haskell Tech Proposal #60, and its implementation (finalised in 6867dd5) is what allows us to redesign how
I'm not involved in release management, so whatever work I am doing here has little bearing on releases. Note as well that I worked with Matthew at the time to implement the fix for #11373, and am also committed to fixing any bugs that would arise from this PR. I'm more than happy to smoke test the change against any packages you are concerned might be affected to avoid any possibility of breakage. |
I would like to rectify my previous comment that the configure phase is rarely on the critical path. In fact, I think the opposite is true: as other parts of the build are parallelisable (in particular with I have done additional benchmarking, this time building
This PR ensures we configure the compiler program database (constituted of ghc and attendant programs such as ghc-pkg, haddock, toolchain programs such as ar, ld etc) ahead of time within cabal-install, so that we don't need to reconfigure programs such as ghc, ghc-pkg, haddock etc once per package within Cabal.

This should be a net performance win without any other change in behaviour.
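The difference between configuring per package and configuring once per project can be sketched as follows. This is a toy model under stated assumptions, not the PR's implementation: `ToyProgramDb`, `configureToolchain`, `buildWith`, `perPackage` and `oncePerProject` are hypothetical names, the paths are placeholders, and a counter stands in for the cost of probing the toolchain.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import qualified Data.Map.Strict as Map

-- Hypothetical stand-in for a configured program database
-- (this is NOT Cabal's ProgramDb type).
type ToyProgramDb = Map.Map String FilePath

-- Pretend to probe the toolchain; the counter records how often this runs.
-- The paths are placeholders, not real install locations.
configureToolchain :: IORef Int -> IO ToyProgramDb
configureToolchain counter = do
  modifyIORef' counter (+ 1)
  pure (Map.fromList [("ghc", "/path/to/ghc"), ("ghc-pkg", "/path/to/ghc-pkg")])

-- "Build" a package by looking up the compiler in an already configured db.
buildWith :: ToyProgramDb -> String -> Maybe FilePath
buildWith db _pkg = Map.lookup "ghc" db

-- Per-package shape: every package reconfigures the toolchain.
perPackage :: IORef Int -> [String] -> IO [Maybe FilePath]
perPackage counter pkgs = mapM buildOne pkgs
  where
    buildOne pkg = do
      db <- configureToolchain counter
      pure (buildWith db pkg)

-- Once-per-project shape: configure up front, then share the result.
oncePerProject :: IORef Int -> [String] -> IO [Maybe FilePath]
oncePerProject counter pkgs = do
  db <- configureToolchain counter
  pure (map (buildWith db) pkgs)

main :: IO ()
main = do
  let pkgs = ["pkg-a", "pkg-b", "pkg-c"]
  c1 <- newIORef 0
  _ <- perPackage c1 pkgs
  readIORef c1 >>= print -- prints 3: one configuration per package
  c2 <- newIORef 0
  _ <- oncePerProject c2 pkgs
  readIORef c2 >>= print -- prints 1: one configuration for the project
```

Both shapes hand every package the same compiler path. Only the number of configuration runs differs, which matches the claim that the change is a performance win without other behavioural changes.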
Template Α: This PR modifies behaviour or interface
BuildToolPaths ensure we don't regress when users pass --with-alex (for example).