Eliminate redundant associative container lookups across the compiler #16606

Open
k06a wants to merge 2 commits into argotorg:develop from k06a:feature/optimize-set-and-map-usage

@k06a k06a commented Apr 16, 2026

Summary

Bottom-up optimization of compiler internals. No compilation logic, code generation, or optimizer behavior changes. Bytecode is bit-for-bit identical to the baseline (verified with --metadata-hash none).

Commit 1 — Eliminate redundant associative container lookups

  • Replace ~40 double-lookup patterns (count/find + at/operator[]) with single find + iterator reuse, try_emplace, or insert().second across 42 files in libevmasm, libyul, libsolidity, libsolutil.
  • Switch membership-only std::set / std::map to std::unordered_set / std::unordered_map in 8 files where iteration order is never used (MultiUseYulFunctionCollector, FullInliner, InlinableExpressionFunctionFinder, EVMCodeTransform, AsmAnalysis, PathGasMeter, AST.cpp, TypeSystem.h).
  • Use std::erase_if (C++20) in KnownState.cpp.
  • Add section 13 "Associative Containers" to CODING_STYLE.md codifying these best practices.

Commit 2 — Convert hot YulName/FunctionHandle-keyed containers in the Yul optimiser pipeline

Extends the same refactor into the Yul optimiser's hot path, specifically the data-flow / semantics / control-flow analysis layers:

  • Infrastructure: add std::hash<BuiltinHandle> (in Builtins.h) and std::hash<FunctionHandle> (std::variant<YulName, BuiltinHandle>). Route Builtins.h through ASTForward.h so the specialisation is visible wherever FunctionHandle is used.
  • DataFlowAnalyzer: State::value, Environment::keccak (with a custom YulNamePairHash using boost::hash_combine), Scope::variables, m_functionSideEffects → unordered.
  • KnowledgeBase: m_offsets, m_lastKnownValue → unordered. m_groupMembers left as std::map<YulName, std::set<YulName>> on purpose — *group->begin() picks the minimum representative, which must stay deterministic.
  • Semantics / side-effect maps: std::map<FunctionHandle, SideEffects> → std::unordered_map end-to-end (SideEffectsPropagator, SideEffectsCollector, MovableChecker, DataFlowAnalyzer, CommonSubexpressionEliminator, EqualStoreEliminator, LoadResolver, UnusedStoreEliminator, UnusedPruner, LoopInvariantCodeMotion, plus FunctionSideEffects test). Same for std::map<YulName, ControlFlowSideEffects> in ControlFlowSideEffectsCollector::functionSideEffectsNamed() and all its consumers.
  • Reference counters: ReferencesCounter::countReferences() and VariableReferencesCounter::countReferences() now return std::unordered_map; all call sites updated.
  • Translation / substitution maps: Substitution, FunctionCopier::m_translations, NameSimplifier::m_translations, VarNameCleaner (m_namesToKeep, m_usedNames, m_translatedNames), SyntacticalEquality::m_identifiers{LHS,RHS}, ExpressionJoiner::m_references, EquivalentFunction{Combiner,Detector}::m_duplicates, Disambiguator::m_translations, AssignmentCounter::m_assignmentCounters, Rematerialiser (m_referenceCounts, m_varsToAlwaysRematerialize), SSAValueTracker::ssaVariables() → unordered.
  • String-keyed and local containers: EVMDialect::m_reserved → std::unordered_set with a transparent hasher (heterogeneous string_view lookup preserved), Object::{objectPaths, dataPaths, subIndexByName}, CompilerContext::m_externallyUsedYulFunctions, Assembly::m_namedTags → unordered. Local std::set<YulName> in CodeTransform::assignedFunctionNames and LoopInvariantCodeMotion::{ssaVars, varsDefinedInScope} → unordered.
  • Double lookups: fix the remaining count()+at() patterns in ControlFlowSideEffectsCollector, ObjectOptimizer, Semantics::containsNonContinuingFunctionCall, Assembly::namedTag, NameSimplifier::findSimplification.

Containers that require ordered iteration (NameCollector::m_names, AssignmentsSinceContinue::m_names, assignedVariableNames(), KnowledgeBase::m_groupMembers inner set, IRGenerationContext::m_usedSourceNames, NameDispenser::usedNames() public API, OptimiserStepContext::reservedIdentifiers, OptimiserSuite::run(..., _externallyUsedIdentifiers)) are deliberately left untouched.
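Why the ordered containers must stay ordered can be illustrated with a small sketch of the "minimum representative" idea mentioned for `m_groupMembers`. The helper `representative` and the string keys are hypothetical; the point is only that `*begin()` on a `std::set` is the smallest element regardless of insertion order, whereas for a `std::unordered_set` the first element depends on the hash function and bucket layout.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical helper: pick a group's representative as its smallest member.
// std::set iterates in sorted order, so *begin() is the minimum and the
// result does not depend on the order in which members were inserted.
inline std::string representative(std::set<std::string> const& _group)
{
	assert(!_group.empty());
	return *_group.begin();
}

// Small usage example with made-up data.
inline void representativeExample()
{
	std::set<std::string> first{"c", "a", "b"};
	std::set<std::string> second;
	second.insert("b");
	second.insert("c");
	second.insert("a");
	assert(representative(first) == "a");
	assert(representative(first) == representative(second));
}
```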

Motivation

Profiling via-IR compilation of OpenZeppelin 5.0.2 shows YulString::operator< at ~2.6% of samples and std::__tree operations on YulName/FunctionHandle at ~4.4% combined. This is pure overhead from using ordered containers for keys whose iteration order never mattered. Each lookup is O(log n) with O(key length) comparisons; unordered variants make it O(1) on average with a single hash of the 64-bit handle.

These are mechanical, behavior-preserving changes: the compiler works exactly the same way but uses its data structures more efficiently.

Benchmark (via-IR, --optimize, 10 runs each, first run discarded as cold, median of 9)

| Project | develop | this PR | Change |
|---|---|---|---|
| OpenZeppelin 5.0.2 | 11.62 s | 10.64 s | −8.43% |
| OpenZeppelin 4.9.0 | 15.83 s | 14.40 s | −9.03% |
| Uniswap V4 (2022) | 5.83 s | 5.33 s | −8.58% |

(Absolute times are lower than in the previous revision because this run was done on a cooler machine; the relative deltas are what matter. Compared to v1 of this PR, the second commit adds roughly −1.7 p.p. on OZ 5.0.2 and −7 p.p. on Uniswap V4 / OZ 4.9.0. The new optimizations target DataFlowAnalyzer / KnowledgeBase / Semantics, which are on the hot path of every via-IR compilation rather than project-specific hot paths.)

Run-to-run spread is also tight (≤ 0.5 s on all three projects), consistent with eliminating cache-miss-heavy tree traversals.
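For reference, the benchmark arithmetic (discard the cold first run, take the median of the rest, then compute the relative change against develop) can be sketched as below. The function names are hypothetical and the individual run times in the example are made up, since the PR only reports the medians; the percentage check uses the medians reported for OpenZeppelin 5.0.2.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: drop the first (cold) run and return the median of the rest.
inline double medianAfterWarmup(std::vector<double> _runs)
{
	assert(_runs.size() >= 2);
	_runs.erase(_runs.begin());
	std::sort(_runs.begin(), _runs.end());
	std::size_t n = _runs.size();
	return n % 2 == 1 ? _runs[n / 2] : (_runs[n / 2 - 1] + _runs[n / 2]) / 2.0;
}

// Relative change in percent; negative means the candidate is faster.
inline double relativeChangePercent(double _baseline, double _candidate)
{
	return (_candidate - _baseline) / _baseline * 100.0;
}

// Small usage example.
inline void benchmarkExample()
{
	// Made-up run times in seconds; the first one is the discarded cold run.
	assert(medianAfterWarmup({20.0, 3.0, 1.0, 2.0}) == 2.0);
	// Reported medians for OpenZeppelin 5.0.2: 11.62 s on develop, 10.64 s with this PR.
	assert(std::abs(relativeChangePercent(11.62, 10.64) - (-8.43)) < 0.01);
}
```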

Test plan

  • Bytecode identical (verified with --metadata-hash none on several OpenZeppelin contracts)
  • Build succeeds with no warnings (-Werror enabled)
  • Existing test suite passes (7916 tests, 0 errors)

@github-actions

Thank you for your contribution to the Solidity compiler! A team member will follow up shortly.

If you haven't read our contributing guidelines and our review checklist before, please do so now; this makes the review process and the acceptance of your contribution smoother.

If you have any questions or need our help, feel free to post them in the PR or talk to us directly on the #solidity-dev channel on Matrix.

…ser pipeline from std::map/set to std::unordered_map/set