Skip to content

All-LogUp auxiliary trace bus constraints#2962

Merged
adr1anh merged 107 commits intonextfrom
adr1anh/bus
May 6, 2026
Merged

All-LogUp auxiliary trace bus constraints#2962
adr1anh merged 107 commits intonextfrom
adr1anh/bus

Conversation

@adr1anh
Copy link
Copy Markdown
Contributor

@adr1anh adr1anh commented Apr 5, 2026

All-LogUp Auxiliary Trace

What this PR does

Every auxiliary-trace bus in the Miden VM used to be described by a bespoke constraint: some were running-product (multiset) checks written out by hand, others were partial LogUp arguments. Each lived in its own per-component file, and the aux-trace generation code was spread across processor/src/trace/{decoder,range,stack,chiplets}/aux_trace*.rs, duplicating the algebra row-by-row in imperative Rust.

This PR replaces all of that with a single unified LogUp formulation, described declaratively. Every bus — the block hash table, op-group table, stack overflow, range check, chiplet request bus, sibling table, hasher perm-link, ACE wiring, kernel ROM, and the rest — is now a branch of one descriptive AIR (MidenLookupAir) expressed against a small closure-based builder trait. The same description is consumed by the verifier's constraint evaluator, by the prover's aux-trace generator, by the recursive-verifier ACE circuit, and by two debug walkers.

The practical consequences:

  • The final layout is 4 extension columns on the main trace and 3 on the chiplet trace (seven total). The 4+3 split is deliberate: in a future multi-trace setup, the main-trace column count lands under the next power-of-two boundary without padding.
  • Only one committed boundary value per trace (two total, with a forced-zero placeholder in the second slot until trace-splitting lands — see NUM_LOGUP_COMMITTED_FINALS). Seven column accumulators collapse into one partial-sum column per trace.
  • The kernel ROM chiplet is now purely lookup-based. Previously it duplicated a row per callsite; now one row per digest carries a multiplicity. For realistic kernels this deletes thousands of chiplet rows.
  • The per-component aux-trace Rust code is gone, not dual-maintained. Aux-trace generation now walks the same LookupAir::eval description the constraint side does.

How the code is laid out

There are two halves, and it's worth keeping them separate in your head while reading.

The generic LogUp framework lives in air/src/lookup/. It is deliberately free of any Miden-specific types — the intent is to eventually extract it into its own crate. You'll find the closure-based builder stack (LookupBuilderLookupColumnLookupGroupLookupBatch), a LookupAir trait modelled on p3_air::Air, and three concrete walkers of a LookupAir description:

  • ConstraintLookupBuilder — symbolic evaluation for the verifier and the prover's quotient pass.
  • ProverLookupBuilder — concrete-row fraction collection, followed by batch inversion and a fused row-major chunk-parallel accumulator.
  • Two debug walkers under debug/: one that validates the shape of a LookupAir (column count, declared-vs-observed degrees per interaction, canonical/encoded equivalence for cached-encoding groups), and one that balances a real concrete trace, producing a human-readable report of unmatched adds/removes. The second is not yet wired into the prover/processor paths; that's one of the follow-ups.

The pattern is essentially Plonky3's AirBuilder pattern: a LookupAir describes itself once, against an interface, and each concrete builder interprets that description differently. It's also essentially a serializer pattern — the shape is written once, many things read it.

The Miden-specific wiring lives in air/src/constraints/lookup/. Here you'll find one file per extension column, under buses/. This is the place to read if you want to know "what bus is this row pushing into, and how does the interaction encode?" Start with miden_air.rs (the thin aggregator), then main_air.rs and chiplet_air.rs for the two halves, then messages.rs for the full catalogue of typed message structs (25 BusId variants). Every bus message is a concrete struct with its own encoding; each bus has a dedicated domain (BusId::*) so multiple linearly-independent buses can sit on one column via distinct bus_prefix prefixes.

The key conceptual unit inside a column is a group. Groups within a column are additively independent — their degrees add. Within a group, the branches are mutually exclusive: only one fires per row, so the group's degree is the max across branches, not the sum. A branch may be a single interaction or a batch of simultaneous interactions sharing one flag. Every add/remove/batch/insert call is annotated with its numerator and denominator degree, and those annotations are cross-checked by the validator against the symbolic degree actually accumulated. That annotation is what made the current packing (everything inside degree 9) tractable; the same metadata could drive an automated packer later.

Messages and domain separation

We used to derive bus labels by reverse-bitting operation codes. That's gone. Each distinct bus interaction type is now a named BusId variant, cast directly into the bus_prefix table. We've also split several messages that were previously sharing an id — for example, the hasher's various input/output paths now each carry their own variant rather than sharing a generic "hasher message". messages.rs is the centralised place to review domain separation; some of it was Claude-generated and there are still obvious simplifications we'd like to fold in.

Outer / boundary interactions

Public-input-driven terms (program hash seed into the block hash queue, the log-precompile transcript bookends, kernel ROM initialisation from the variable-length public inputs) are emitted through a separate eval_boundary hook on LookupAir that takes a BoundaryBuilder. The debug trace walker consumes this directly. The verifier's scalar check (ProcessorAir::reduced_aux_values) now also drives through eval_boundary via a BoundaryBuilder reducer — so this code path is unified. The MASM recursive verifier still has its own copy of these boundary formulas; unifying the third source is a follow-up.

ACE and MASM recursive verifier

The ACE circuit's LogUp boundary identity (0 = Σ aux_bound + c_block_hash + c_log_precompile + c_kernel_rom) is added in air/src/ace.rs. On the MASM side, aux_trace.masm now does a single Poseidon2 absorb of [COM, W0] (two boundary felts, second always zero), and public_inputs.masm has been reworked to sum the variable-length public inputs as LogUp fractions rather than running the old multiset reduction. The constraint evaluation MASM (constraints_eval.masm) is regenerated by a new regenerate-constraints binary in crates/lib/core/src/bin/.

Tests

The legacy per-component aux-trace unit tests (processor/src/trace/tests/{decoder,stack,range,hasher,chiplets/*}.rs) were deleted alongside their targets. In their place are two new files: lookup_harness.rs builds a column-blind, subset-based expectation DSL over (multiplicity, encoded_denominator) pairs, and lookup.rs is an end-to-end smoke test that drives a tiny real trace through the prover's collection pipeline. The existing tests were heavily rewritten through this harness, much of it Claude-assisted — that's one of the areas worth the most scrutiny. Restoring granular per-component coverage is a blocker before we flip off WIP.

Reviewing this PR

For an efficient walk, I'd suggest:

  1. air/src/constraints/lookup/miden_air.rs and messages.rs — establish the final shape and the bus catalogue.
  2. air/src/constraints/lookup/buses/*.rs — one file per column. Each is the authoritative description of one bus's semantics; read these instead of the per-component design docs (which haven't caught up).
  3. air/src/lookup/builder.rs — the four builder traits. Everything else in air/src/lookup/ is an implementation of these against a specific target (constraint eval, prover, debug).
  4. air/src/lookup/constraint.rs + prover.rs — the two production walkers.
  5. ACE + MASM: air/src/ace.rs, crates/lib/core/asm/sys/vm/{aux_trace,public_inputs,constraints_eval}.masm.
  6. Processor tests: processor/src/trace/tests/lookup{,_harness}.rs and the rewritten per-component files.
Deeper walkthrough — design rationale and nuances

Why a new description layer at all

The previous structure described each bus independently, and nothing enforced consistency between what the constraint code asserted and what the aux-trace builder computed. Different components used different conventions (multiset vs. LogUp), different column layouts, and different ad-hoc label schemes. After this PR, one description drives four concrete uses: the verifier's constraint evaluator, the prover's quotient pass, the prover's aux-trace generator, and the recursive ACE circuit. This unification is the point — not LogUp itself, which was only a means to get there.

Because the description is canonical, the current docs under docs/src/design/lookups/ and the per-chiplet bus pages no longer match the code. We have an issue open for a docs rewrite. The argument for not going through another round of exhaustive LaTeX is that the new code is much more self-describing: the per-bus file names, BusId enum, LookupMessage structs, and Deg annotations spell out the interaction inventory directly. Docs can lean on that and stop duplicating formulas.

How to read a bus file

Start at the top with MidenLookupAir. It defines the lookup argument both for the main trace and the chiplet trace. At the moment we have two sub-AIRs (MainLookupAir, ChipletLookupAir), each carrying its own declared shape: the number of columns, the max interactions per column (which hints the adapter how many fractions to buffer per row), and the bus-ID count required for deriving challenges and sizing the bus_prefix table.

Each sub-AIR also carries a lookup-specific context — the collection of flags that gate interactions. This is built once per eval call and shared between all the columns in that AIR. The flags are designed to express mutual exclusivity: if two flags in a group cannot be simultaneously active, the group's degree is the max of the branches rather than the sum. The context is very similar to the old OpFlags / ChipletFlags but has been re-implemented as its own type for two reasons. First, the constraint-path construction uses polynomial arithmetic (the default), but the prover-path adapter can override the constructor to use boolean arithmetic on the concrete 0/1 decoder bits, which is meaningfully cheaper. Second, it pins down the smaller surface the LogUp code actually needs, independently of the full OpFlags. The override is the hook for the follow-up where we want to hint to the aux-trace generator that all flags on a row are inactive so the whole row can be skipped — padding rows are free wins there.

Walking through a column: you start with a column, containing groups. Groups in the same column are independent — their degrees add. Within a group the branches are mutually exclusive; only one fires per row, and the group's degree is the max. A branch can be a single interaction or a batch of simultaneous interactions sharing the outer group's flag. Every interaction carries a name and a declared (n, d) degree pair, both of which are cross-checked by the validator. Naming and annotating is the reason we were able to find the current packing at all: we extracted the per-interaction degrees and ran a small optimisation pass to find a tighter column layout. That same metadata can drive a brute-force packer next time we need to add an interaction.

The builder pattern

The entry point for the framework is LookupAir — a structure that describes itself to an interface. That interface is the LookupBuilder stack (column → group → batch). This is the same pattern as Plonky3's AirBuilder, and morally the same as serialisation: a struct describes itself once, and different interpreters do different things with that description.

There are three interpretations:

  1. ConstraintLookupBuilder (air/src/lookup/constraint.rs) wraps a LiftedAirBuilder. As it walks the description, it keeps a single rational-function accumulator per scope (column, group, or batch). When a scope closes, its accumulator folds into its parent — the walk is linear, no intermediate storage. Because the target is a generic AirBuilder, the resulting expression can be fed to the prover's quotient computation, the verifier's OOD evaluation, and the ACE circuit evaluation — three consumers of the same description without code duplication. The formulas for how a group composes into its parent are documented in the file.

  2. ProverLookupBuilder (air/src/lookup/prover.rs) is the aux-trace path. It runs in three passes: first it walks the AIR to collect every (multiplicity, denominator) pair per column into a flat buffer (LookupFractions). Second it batch-inverts the denominators. Third it sums the inverses into per-column partial-sum accumulators. Only one auxiliary value needs to be sent per trace at the end — a trick enabled by the generic description, and the reason we went from 7+ committed finals down to 1 per trace (2 total, with the trailing zero for forward-compatibility). Only active branches (flag ≠ 0) get the closure evaluated, saving arithmetic on rows where the interaction isn't firing. The code hasn't been benchmarked carefully yet but it already parallelises; there's headroom. One follow-up on the to-do list is to change the aux-builder traits in our lifted-stark layer to take the LookupAir directly — a larger refactor that would eventually push these lookups purely onto the processor trait.

  3. Debug walkers (air/src/lookup/debug/). There are two, for two different kinds of check:

    • debug/trace/ balances a full concrete trace. It walks the AIR while summing all interactions, and emits a human-readable report of which rows have unmatched adds/removes. This is the replacement for the old bus-debugger infrastructure; the mechanism is in place but we haven't hooked it into the prover/processor paths yet. This also requires eval_boundary to be handled correctly, which it now is.
    • debug/validation/ validates the shape of the description: number of columns, declared degrees match observed symbolic degrees, encoding equivalence for cached-encoding groups (more on this below), scope discipline. Single entry point: MidenLookupAir::validate(layout).

The cached-encoding escape hatch

The biggest tension in this design was between two performance axes: aux-trace generation efficiency and constraint-evaluation efficiency. The canonical description is optimised more heavily for the former — it reads cleanly and only fires closures on active flags. Constraint evaluation, however, must evaluate every branch symbolically, and the canonical description has a lot of repetition. The hasher responses are the worst offender: each variant uses a linear combination of (α + β·addr + β²·node_idx)-shaped prefixes that would ideally be memoised across branches.

To address this, a group can optionally carry a second, arithmetic-form description alongside its canonical one. This is the group_with_cached_encoding API. The constraint-path adapter consumes the encoded form where supplied, giving the expression tree a much better chance at common-subexpression elimination (notably in the ACE codegen, which does aggressive CSE). Part of the validation walker's job is to ensure both representations are equivalent — it walks both paths and verifies the resulting expressions match using the Schwarz–Zippel lemma. We've only applied this to the hasher chiplet so far, where the per-variant repetition is worst; it's a follow-up to sweep the rest.

The MASM and ACE side

The ACE circuit was modified to remove support for multiset reductions and to match the new aux-value layout. We build it with only as many boundary values as there are traces — so 1 for the current single-trace setup, but we allocate 2 (with the second forced zero) as a forward-looking hack, because the MASM recursive verifier already needs to absorb 2 boundary felts in its forthcoming trace-split layout. Updating MASM later to drop the slot is cheaper than updating it twice.

On the MASM side, the layout now expects two extension-field elements for the bus boundary. The larger change was to public_inputs.masm: variable-length public inputs now only need to be summed up as LogUp fractions rather than run through a multiset reduction. That collapses a chunk of code. As a follow-up, we want to reduce all the outer-boundary values in MASM, since they all have the same shape now — much more practical than it used to be.

Despite the constraint-evaluation builder introducing a lot of structural duplication (each branch evaluates its own encoding), the number of inputs and eval gates in the ACE circuit is very similar to before. The ACE codegen's CSE pass absorbs the redundancy.

The test rewrite

Almost every test under processor/src/trace/tests/ was rewritten. The reason is blunt: the aux-trace format changed, so the old assertions were meaningless. The rewrite tried to shift tests from asserting on the final trace shape to asserting on the semantic interactions — each test now declares "for this row, I expect these (multiplicity, message) pairs" and the harness checks that as a multiset subset of the actual prover emissions. Tests are more descriptive as a result. The harness is processor/src/trace/tests/lookup_harness.rs.

Most of the rewriting was Claude-assisted under time pressure, and that's the section of the PR most in need of close review. On the positive side, the harness is column-blind and subset-based, so these tests should survive any future column repack.

Prover changes are mostly nil

The prover itself barely changes. Almost all the logic moved into air/: the aux-trace generator is now just a thin wrapper (MidenLookupAuxBuilder) that calls into build_lookup_fractions + accumulate. The reduction comes from deleting the per-component aux-trace code in the processor — it's all delegated to a single point now.

Follow-ups

Landed since the original PR description:

  • Kernel ROM chiplet reworked onto multiplicity-based LogUp.
  • Deg annotation (name + degree) on every builder call.
  • Bus labels replaced by the BusId enum with direct prefix indexing.
  • Single-column partial-sum accumulator per trace (7 → 2 committed finals).
  • Prover-side op-flags fast path (MainLookupBuilder::build_op_flags override hook).

Outstanding, in rough priority order:

  • Revive per-component processor aux-trace tests. Today only the end-to-end oracle survives, which doesn't localise regressions.
  • Rewrite the bus design docs. docs/src/design/lookups/{multiset,logup}.md and per-chiplet pages still describe the old running-product world and now contradict the code.
  • Wire the real-trace bus debugger into the prover/processor paths. The infrastructure is there; today a silently-imbalanced bus just fails verification.
  • Unify boundary emission. Rust, ACE and MASM still carry three copies of the boundary formulas.
  • Padding-row early-exit hint for trace generation.
  • Cached-encoding optimisation for constraint eval. The hook is there; the widest use case (hasher responses) would save ~58% per design estimates.
  • OpFlagsTrace variant to avoid building all 128 flags polynomially on the trace-generation path.

FG's web intro

https://gisthost.github.io/?10dbb2d98054b6b78bc462e39ecd1929

Test plan

  • make test-fast followed by a targeted make test-air and make test-processor.
  • cargo test -p miden-air for the lookup validator + bus degree inventory.
  • End-to-end proof generation — pending the remaining trace-side integration.

adr1anh and others added 6 commits April 14, 2026 14:21
Squashed snapshot of 24 commits introducing the new closure-based LogUp
lookup-argument API in air/src/constraints/lookup/, plus its supporting
infrastructure (logup, logup_msg, the LookupAir / LookupBuilder /
LookupColumn / LookupGroup / LookupBatch trait stack, ConstraintLookupBuilder
and ProverLookupBuilder adapters, LookupChallenges, MidenLookupAir) and the
seven per-bus emitters in lookup/buses/.

Squashed for clean rebase onto adr1anh/constraint-simplification — the
intermediate commits referenced types (MainTraceRow, LiftedAirBuilder, the
old trace::Challenges) that 2856 reshapes, so a per-commit replay is not
viable. Granular history preserved on adr1anh/bus.pre-rebase-backup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Post-rebase Phase 3 of the lookup-module adaptation:

- Replace LookupChallenges struct with a type alias for trace::Challenges (the
  unified 2856 challenge type with fixed-size bus_prefix / beta_powers arrays).
- Migrate LookupMessage trait to use trace::Challenges in place of the dropped
  LookupChallenges. ConstraintLookupBuilder / ProverLookupBuilder / DualBuilder
  drop the max_message_width / num_bus_ids constructor args (sizes are now
  fixed at compile time via NUM_BUS_TYPES / MAX_MESSAGE_WIDTH).
- bus_prefix() trait method takes usize instead of u16; lookup::bus_id is now
  a thin re-export shim over trace::bus_types.
- Rewrite lookup/buses/{block_stack,block_hash_and_op_group,chiplet_requests,
  range_logcap,wiring}.rs against the typed MainCols struct: replace
  MainTraceRow indexed access with DecoderCols / StackCols / RangeCols /
  SystemCols field accessors, switch OpFlags::new from the
  ExprDecoderAccess wrapper to the (decoder, stack, decoder_next) signature
  (which folds end_next/repeat_next/halt_next into a single struct), and
  recover the ACE chiplet flag with the new virtual s0 = 1 - s_ctrl - s_perm
  formula.
- Add bus parameters to every Challenges::encode call in logup_msg.rs (each
  inherent Msg::encode now passes its bus type from trace::bus_types).
- Stub lookup/buses/{chiplet_responses,hash_kernel}.rs with no-op emit_*
  functions: the original implementations referenced the pre-2856 hasher
  chiplet (32-row P_CYCLE_ROW_0/31, hasher::flags::f_mv etc.) and need to be
  rewritten against the new hasher_control + permutation split + 16-row
  HasherPeriodicCols. Originals are preserved on adr1anh/bus.pre-rebase-backup.
- Delete dead constraints/logup.rs (the legacy RationalSet / Batch / Column
  algebra is no longer referenced; superseded by lookup module).
- Disable constraints/degree_audit.rs tests with cfg(any()) until they're
  ported to MainCols / OpFlags::new(...) / Challenges::encode(bus, [...]).

The two MidenLookupAir canary tests pass: the degree budget (with the two
stubbed columns counted) and the cached-encoding equivalence check.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…56 hasher

Replace the two stubbed bus emitters with full implementations against the
2856 controller/permutation hasher split:

- hash_kernel.rs: sibling table (BUS_SIBLING_TABLE) now uses the
  s_ctrl-gated (s0*s1*s2 / s0*s1*!s2) MU/MV pair from 2856's running
  product instead of the old 32-row cycle-row filter; the four absorb-
  after MVA/MUA variants are gone. ACE memory reads (BUS_CHIPLETS) keep
  the same NUM_ACE_SELECTORS-relative offsets. SiblingMsgBitZero/BitOne
  pick up mrupdate_id at beta^1 for cross-MRUPDATE domain separation,
  matching the SIBLING_TABLE encoding.

- chiplet_responses.rs: the 7 hasher response variants now mirror 2856's
  compute_hasher_response — sponge_start/sponge_respan replace the old
  f_bp/f_abp pair and use the new is_boundary trace column instead of
  cycle_row_0/cycle_row_31; mp/mv/mu/hout/sout keep their previous
  semantics under the new (s_ctrl, hs0..hs2, is_boundary) gating. The
  bitwise/memory/ACE/kernel-ROM responses use the runtime-muxed
  encoding through the existing *ResponseMsg types so the C1 transition
  stays at degree 8.

Selector wiring across both files mirrors lookup/buses/wiring.rs:
s_ctrl = chiplets[0], s_perm = perm_seg, virtual s0 = 1 - s_ctrl - s_perm,
and s1..s4 = chiplets[1..5]. Bitwise k_transition is read via the typed
PeriodicCols::bitwise.k_transition borrow.

Tests:
- miden_lookup_air_degree_within_budget passes (max deg 9, hash_kernel
  lands at 7, chiplet_responses at 8).
- miden_lookup_air_cached_encoding_equivalence passes (no cached calls
  on the two ported columns; chiplet_requests still exercises the
  dual-path machinery).

Also picks up an unrelated rustfmt sweep + dead-import cleanup from
running `make lint`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Collapse duplicated per-bus precompute into two shared contexts built once
per `MidenLookupAir::eval`:

- `MainTraceContext` holds a single `OpFlags` instance for the 4 main-trace
  buses (was built independently in each bus).
- `ChipletTraceContext` holds a `ChipletActiveFlags` snapshot mirroring the
  active-flag block of `build_chiplet_selectors` for the 3 chiplet-trace
  buses (was recomputed manually in each).

Per-bus cleanups: swap raw `chiplets[OFFSET + FIELD_IDX]` indexing for the
typed `local.bitwise/memory/ace/kernel_rom/controller` borrow views; drop
premature `_e: LB::Expr` named bindings in favor of `Var` captures; use
`.not()` (via `BoolNot`) for boolean negation of known-boolean columns;
delete the `CreqCtx` struct in `chiplet_requests.rs` and inline everything
into a flat `col.group` body (matching `block_hash_and_op_group`'s pattern,
dropping the cached-encoding optimization).

No behavioral change: the degree-budget and cached-encoding-equivalence
tests in `miden_air.rs` both pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…cleanups

Split the monolithic MidenLookupAir into MainLookupAir (4 main-trace
columns) and ChipletLookupAir (3 chiplet-trace columns) behind per-air
extension traits (MainLookupBuilder / ChipletLookupBuilder). The existing
MidenLookupAir is now a thin aggregator delegating to both, so the
degree-budget and cached-encoding tests keep working unchanged while a
future enum-dispatch wrapper can reach for either sub-air independently.

Each sub-air owns a concrete context struct (MainBusContext /
ChipletBusContext) built via a single builder hook (build_op_flags /
build_chiplet_active), so the prover adapter can later override the
polynomial construction with a boolean fast path without touching any
emitter. ConstraintLookupBuilder, ProverLookupBuilder (gated on F = Felt),
and DualBuilder each pick up empty impls of both extension traits.

Bus emitter cleanups:
- Hoist every message-producing computation out of the struct literal
  into closure-local let bindings; each emit_* closure now ends with a
  flat field-punned constructor.
- Replace Into::<LB::Expr>::into(x) turbofishes with x.into() (via typed
  bindings where arithmetic would otherwise be ambiguous).
- Replace .map(Into::into) / .map(|v| v.into()) with .map(LB::Expr::from)
  for concrete From-function-pointer conversion.
- Drop _e / _var suffixes via shadowing; drop redundant : LB::Expr
  annotations wherever the use site already pins the type.
- Introduce h_0 / h_1 bindings for the decoder's [Var; 8] rate halves
  and rate_0 / rate_1 / cap bindings for the chiplet controller's
  [Var; 12] sponge state ([RATE0, RATE1, CAP] convention).
- Replace the 7-bit op_bits fold in block_hash_and_op_group with
  horner_eval_bits.
- Move payload computation *inside* message closures so the prover path
  skips dead work on flag-zero rows.

Drop the scattered #[expect(dead_code)] / #[expect(unused_imports)]
attributes across the lookup tree in favor of a single module-level
#![allow(dead_code, unused_imports)] in lookup/mod.rs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Removes the entire Task #9 cleanup block that the file has been flagging
for deletion since the LookupMessage-trait rewrite (Task #12 / Amendment B).
All consumers in lookup/buses/*.rs reach for the structs directly and
encode via the LookupMessage trait from lookup/message.rs; the legacy
LogUpMessage trait plus its blanket impls had no live callers.

Deleted items (all flagged dead_code by clippy, all verified via grep to
have zero external references):

- `pub trait LogUpMessage` and all of its inherent `encode` method
  delegates on every *Msg type (HasherMsg, MemoryMsg, BitwiseMsg,
  BlockStackMsg, BlockHashMsg, OpGroupMsg, KernelRomMsg, AceInitMsg,
  RangeMsg, LogCapacityMsg, AceWireMsg, MemoryResponseMsg,
  KernelRomResponseMsg, BitwiseResponseMsg).
- `impl_logup_message!` macro + all 15 invocations.
- `struct SiblingMsg` (replaced by `SiblingMsgBitZero` / `SiblingMsgBitOne`
  in the new bus path).
- `struct OverflowMsg` + its LookupMessage impl (stack overflow bus is
  not emitted by the current 7-column LogUp argument).
- `KernelRomMsg::INIT_LABEL` const and `KernelRomMsg::init` constructor
  (legacy naming from before the header-as-builder pattern).
- Four legacy per-label memory response variants
  (MemoryResponseReadElementMsg, MemoryResponseWriteElementMsg,
  MemoryResponseReadWordMsg, MemoryResponseWriteWordMsg), replaced by
  the runtime-muxed MemoryResponseMsg.
- Two legacy per-op kernel ROM response variants
  (KernelRomResponseCallMsg, KernelRomResponseInitMsg), replaced by the
  muxed KernelRomResponseMsg.
- Two legacy per-op bitwise response variants (BitwiseResponseAndMsg,
  BitwiseResponseXorMsg), replaced by the muxed BitwiseResponseMsg.

Drops the now-unused `bus_types` import (all surviving code uses
`super::lookup::bus_id::BUS_*` directly).

File shrinks 700 lines. `cargo check -p miden-air` finishes with zero
warnings (down from 18). `miden_lookup_air_degree_within_budget` and
`miden_lookup_air_cached_encoding_equivalence` both pass unchanged.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
adr1anh and others added 11 commits April 14, 2026 21:05
Demote 3 orphan doc comments (above section banners, documenting nothing)
to plain comments to silence clippy::empty_line_after_doc_comments, and
let rustfmt collapse empty `impl<E> Msg<E> {}` blocks and trim trailing
newlines. No behavioural change.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Wires the prover-side LogUp aux-trace collection phase onto a single flat
buffer shared across all rows and columns, ready to feed a fused
batch-inversion + partial-sum pass as follow-up.

LookupAir trait gains `column_shape() -> &[usize]` — the
mutual-exclusion-aware upper bound on fractions any one row can push per
column. Each bus emitter declares its own `MAX_INTERACTIONS_PER_ROW`
const alongside the `emit_*` fn, with a comment explaining the max over
its mutually exclusive branches:

  M1 (block_stack + range_table)        = 3
  M_2+5 (block_hash_queue ∪ op_group)   = 7
  M3 (chiplet_requests)                 = 4
  M4 (range_stack + logpre_capacity)    = 4
  C1 (chiplet_responses)                = 1
  C2 (hash_kernel)                      = 1
  C3 (ace_wiring)                       = 3

Main/Chiplet/Miden LookupAirs expose the per-column shapes via static
arrays assembled from the per-emitter consts.

New `fractions.rs` introduces `LookupFractions<F, EF>`: a single flat
`fractions: Vec<(F, EF)>` and a single flat `counts: Vec<usize>` of
length `num_rows * num_cols`, row-major. No padding. Row r's contribution
to column c is `fractions[prefix .. prefix + counts[r * num_cols + c]]`,
where `prefix` is the running sum of earlier counts. `Vec::with_capacity`
sizes everything up front (`num_rows * Σ shape` for fractions,
`num_rows * num_cols` for counts) so the hot row loop never
re-allocates. `accumulate_slow` is the reference oracle — one
`try_inverse()` per fraction, walks the flat buffers with one cursor and
`counts.chunks(num_cols)` in lockstep, which matches the memory-access
pattern the fused fast path will use.

ProverLookupBuilder refactored to own `&mut LookupFractions`. `column(f)`
split-borrows `fractions` and `counts` disjointly, snapshots the vec
length before/after the closure, debug-asserts `pushed <= shape[col]`,
and appends the count. ProverColumn/Group/Batch push logic is unchanged —
they still thread a `&mut Vec<(F, EF)>` and call `.push((m, v))`.

Tests: two unit tests in `fractions.rs` (handcrafted `accumulate_slow`
round-trip + capacity reservation), one end-to-end smoke test in
`prover.rs` using a custom `SmokeAir` (2 columns, handcrafted eval body
that respects its own shape on every row — random row-data trips the
mutex-aware shape check on the real MidenLookupAir, so the smoke test
avoids that path).

Out of scope: fused batch-inversion + partial-sum fast path, rayon
row-loop parallelism.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds `build_lookup_fractions`, the top-level prover-side entry point that walks a
real `RowMajorMatrix<Felt>` through `ProverLookupBuilder` and returns a populated
`LookupFractions` buffer. Per row it builds a zero-copy two-row `RowWindow` over
the flat matrix storage with wraparound on the last row, composes the per-row
periodic slice by indexing each periodic column at `r mod its period` (reusing a
single `Vec<Felt>` across rows — no per-iteration allocation), and runs
`air.eval(&mut ProverLookupBuilder)`. The `for<'a> A: LookupAir<...>` HRTB lets
one call site work with any `'a` the row loop needs.

Public API surface for downstream crates:
- `miden_air::lookup` — new re-export module exposing `LookupChallenges`,
  `LookupFractions`, `MidenLookupAir`, `ProverLookupBuilder`, `accumulate_slow`,
  and `build_lookup_fractions`.
- `MidenLookupAir` relaxed from `pub(crate)` to `pub`.
- `constraints` stays private; only the items re-exported through `lookup` leak.

Integration test in `processor/src/trace/tests/lookup.rs`: runs a 6-op basic
block (`Pad/Add/Pad/Mul/Drop/...`) through `build_trace_from_ops`, pipes the
materialized main trace into `build_lookup_fractions` + `accumulate_slow`, and
asserts shape bookkeeping, non-degenerate collection, and zero-panic
accumulation. Observed on the tiny span: columns M1, M4, C2, C3 close to zero;
columns M_2+5, M3, C1 don't close individually because they're paired across
different columns in their respective buses. Terminal values are printed to
stderr for manual inspection — hardening into `assert_eq!` checks is deferred
until the follow-up round-trip oracle lands.

Not yet covered by the test: symbolic AIR-constraint validation via
`miden_crypto::stark::debug::*`. That needs a full aux-column matrix in the
legacy shape, which sits behind the still-pending fused batch-inversion +
partial-sum fast path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds two independent computations of the per-row aux-column delta
`Σ m_i · d_i^{-1}` and a test that asserts they agree bit-exactly on a
real program's trace, closing the loop from `build_lookup_fractions`
through to the constraint-path algebra.

**Fused accumulator** (`fractions.rs::accumulate`): replaces the
per-fraction `accumulate_slow` with a single-inversion Montgomery-trick
path via `miden_core::field::batch_multiplicative_inverse` (already
chunked internally for ILP). One field inversion amortized over the
whole trace instead of N. Memory-access pattern is otherwise identical
to `accumulate_slow`, so both paths can coexist as cross-checks.

Unit test `accumulate_matches_accumulate_slow_random` drives a
deterministic LCG to build a random fixture (32 rows × shape [2,1,3])
and asserts `accumulate` produces output bit-exactly equal to
`accumulate_slow` per (col, row).

**Constraint-path oracle** (`column_oracle.rs`, new ~340 lines):
`ColumnOracleBuilder` implements `LookupBuilder` by running the same
`(U_g, V_g)` per-group algebra used by `ConstraintLookupBuilder` /
`DualGroup`, and folds groups into per-column `(U_col, V_col)` via the
standard cross-multiplication rule from
`ConstraintColumn::fold_group`:

  V_col_new = V_col · U_g + V_g · U_col
  U_col_new = U_col · U_g

starting from `(ONE, ZERO)`. The paired `collect_column_oracle_folds`
driver walks a `RowMajorMatrix<Felt>` row by row (mirroring
`build_lookup_fractions`) and returns per-row folded pairs for
downstream comparison. `OracleColumn` / `OracleGroup` / `OracleBatch`
mirror the `DualGroup` / `DualBatch` math verbatim, with `OracleGroup`
also implementing `EncodedLookupGroup` so the cached-encoding path
type-aligns (only the canonical closure runs — cached-encoding
equivalence is already covered by `miden_lookup_air_cached_encoding_equivalence`).

**Cross-check integration test**
(`processor/src/trace/tests/lookup.rs::build_lookup_fractions_matches_constraint_path_oracle`):
runs the same Pad/Add/Pad/Mul/Drop span as the plumbing smoke test
through both paths and asserts
`aux[col][r+1] - aux[col][r] == V_col · U_col^{-1}` for every
(row, col). Catches divergence anywhere in the pipeline — bus
emitters, flag handling, message encoding, collection, batched
inversion, running-sum accumulation, constraint-path group fold. On
the tiny span the prover path and the oracle agree bit-exactly across
all 7 columns and every row.

**Public API**: `miden_air::lookup` re-exports `accumulate`,
`ColumnOracleBuilder`, `collect_column_oracle_folds`, and `LookupAir`
alongside the existing collection-phase surface.

Out of scope (follow-up milestones):
- Wiring `MidenLookupAir::eval` into `ProcessorAir::eval` and flipping
  `AUX_TRACE_WIDTH` to retire the legacy multiset path.
- End-to-end prove/verify test against a real program.

Verification: `cargo test -p miden-air --lib constraints::lookup` (6/6),
`cargo test -p miden-processor --lib trace::tests::lookup` (2/2),
`make test-fast` 2890 passed, `make lint` clean,
`cargo clippy -p miden-air -p miden-processor --all-targets` zero
warnings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replaces `fractions.rs::accumulate` with a row-major, chunk-parallel,
fused-inversion variant. The old path ran three disjoint full sweeps
(extract denominators, global `batch_multiplicative_inverse`, per-column
walk) into a fragmented `Vec<Vec<EF>>`, with the accumulation walk fully
sequential even under `concurrent`. On realistic traces the walk is the
dominant wall-clock cost, so parallelizing it was the real win.

**New shape**: one contiguous `RowMajorMatrix<EF>` of `(num_rows + 1) *
num_cols` elements. Row 0 is the zero initial condition; row `r + 1`
column `c` holds the running sum of `m_i · d_i⁻¹` through main-trace row
`r`. Within a row all columns live in adjacent memory, so the per-row
inner loop is a tight sequential write.

**Algorithm**:

1. *Prepass* — sequential O(num_rows · num_cols) `usize` scan builds
   `row_frac_offsets` so any row range `[lo, hi)` can locate its flat
   fraction slice in O(1).
2. *Phase 1 — chunked fused walk* (parallel under `concurrent`, else
   serial): rows are split into groups of `ACCUMULATE_ROWS_PER_CHUNK =
   512` (mirroring `trace::main_trace::ROW_MAJOR_CHUNK_SIZE`). Each chunk
   runs an inline Montgomery inversion into a chunk-local scratch `Vec`
   via the new `invert_denoms_in_place` helper — forward prefix-product
   pass, one real `try_inverse`, backward sweep that overwrites scratch
   with per-element inverses — then walks its rows forward maintaining a
   `num_cols`-wide running register, writing local prefix sums (starting
   from ZERO) directly into its output slice and emitting per-column
   totals.
3. *Phase 2 — sequential exclusive prefix scan* over
   `num_chunks · num_cols` chunk totals produces the per-chunk global
   offset each local prefix needs. Cheap (summary data only, no
   inversion) and the single serial-dependency point in the whole
   algorithm.
4. *Phase 3 — parallel offset add* (parallel under `concurrent`): second
   pass over the same row-chunked output slices in lockstep with
   `chunk_offsets`, adding the per-column offset into every row. Chunk 0
   is skipped (zero offset). Memory-bandwidth limited, trivially
   parallel, no allocation, no inversion.

Because partial sums are additive (not multiplicative), phases 2 and 3
collapse to a prefix scan + vector add instead of the global rescale a
product-prefix formulation would need.

**Allocation discipline**:

- Output: one `Vec<EF>` of length `(num_rows + 1) * num_cols`.
- Per-chunk scratch `Vec<EF>` sized to that chunk's fraction count
  (~1.5 K EFs ≈ 24 KiB at the default tuning, L1-resident). Allocated
  once per chunk inside the closure; rayon's work stealing amortizes
  across threads.
- `row_frac_offsets` / `chunk_totals` / `chunk_offsets`: small summary
  buffers, `usize` or `num_chunks · num_cols` EFs.
- No more `denoms: Vec<EF>` or `d_inv: Vec<EF>` intermediates, and no
  call into `batch_multiplicative_inverse` (which allocates a fresh
  `Vec` and internally parallelizes, which would nest poorly inside our
  own `par_chunks_mut`).

**Feature gating**: follows `air/src/trace/main_trace.rs::to_row_major`
exactly — `#[cfg(feature = "concurrent")]` swaps `par_chunks_mut` for
`chunks_mut`, both phase 1 and phase 3 bodies are the same. Serial
fallback is still chunked + fused + bounded-scratch, so strictly better
than the old global-Vec path even without rayon.

`accumulate_slow` is unchanged — it stays the `Vec<Vec<EF>>` reference
oracle.

**Tests** (`air/src/constraints/lookup/fractions.rs`):

- `accumulate_matches_accumulate_slow_random` — single-chunk random
  fixture (32 rows), exercises only phase 1.
- `accumulate_multi_chunk_matches_accumulate_slow` *(new)* — random
  fixture at `ACCUMULATE_ROWS_PER_CHUNK * 3 + 7` rows, deliberately
  picking a trailing short chunk so phase 2's prefix scan and phase 3's
  offset add are both exercised and any off-by-one in last-chunk bounds
  is caught. This is the key regression test — the single-chunk path
  can't cover phases 2 or 3.
- `accumulate_empty_trace` *(new)* — `num_rows = 0` returns a 1-row,
  `num_cols`-wide zero matrix without touching the inversion path.
- Shared `Lcg` / `random_fixture` / `assert_matrix_matches_slow` helpers
  factored out of the original single-chunk test.

`processor/src/trace/tests/lookup.rs::build_lookup_fractions_matches_constraint_path_oracle`
is migrated to read deltas from the matrix (`aux.values[(r + 1) * width
+ col] - aux.values[r * width + col]`) instead of the old
`aux[col][r + 1] - aux[col][r]` Vec-of-Vec indexing.

**Out of scope** (follow-up milestones from commit 64de7a9):
- Wiring the matrix output into `ProcessorAir::eval` and flipping
  `AUX_TRACE_WIDTH` to retire the legacy multiset path.

Verification: `cargo test -p miden-air --lib constraints::lookup` (8/8,
includes `accumulate_multi_chunk_matches_accumulate_slow`);
`cargo test -p miden-processor --lib trace::tests::lookup` (2/2);
`make test-fast` 2892/2892; `cargo clippy -p miden-air --all-targets`
zero warnings with and without `--features concurrent`; `make lint`
clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Wire ProcessorAir through the closure-based MidenLookupAir/LogUp lookup
argument by introducing a stateless `MidenLookupAuxBuilder` ZST and
swapping the legacy multiset `AuxTraceBuilders` out of `prove_stark`.

The new builder runs `build_lookup_fractions` + `accumulate` per call
and slices the (num_rows + 1)-row accumulator into the AuxBuilder return
shape (aux trace = first num_rows rows starting at ZERO; committed
finals = last row holding the full running sum). Periodic columns are
recomputed from `PeriodicCols::periodic_columns()` and challenges are
constructed via `LookupChallenges::new(α, β)`, mirroring the
constraint-path adapter exactly.

ProcessorAir::eval now invokes `MidenLookupAir.eval` through a
`ConstraintLookupBuilder` instead of the legacy `enforce_bus`, and
`aux_width()` / `num_aux_values()` report the new 7-column LogUp width.

Milestone B intentionally stubs three things to land the integration
plumbing without touching the open-bus algebra:

- LogUp boundary checks (when_first_row / when_last_row) are commented
  out in `ConstraintColumn::column`; the per-row transition constraint
  Δ·U − V = 0 stays live.
- `ProcessorAir::reduced_aux_values` returns the verifier identity
  (prod = ONE, sum = ZERO) — the only safe stub since the verifier
  asserts `is_identity()` on the accumulated reduction.
- `ace::reduced_aux_batch_config()` returns an empty config, collapsing
  the batched ACE circuit to `constraint_check + γ·(1−1) + γ²·0 =
  constraint_check` (verified `batch_reduced_aux_values` handles the
  empty case cleanly).

`ExecutionTrace::check_constraints` is rewired to the new builder so
the debug constraint checker stops feeding the AIR an 8-column legacy
aux trace, and `test_poseidon2_prove_verify` now skips the recursive
verify step (the recursive verifier MASM still depends on the legacy
boundary terms; restoration is the follow-up milestone). `RELATION_DIGEST`
is regenerated alongside `constraints_eval.masm` so the snapshot test
and AIR↔MASM transcript binding stay consistent.

Verified with the cross-check oracle test, full miden-air + miden-processor
lib suites, and `test_blake3_256_prove_verify` end-to-end.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rip out ~3,000 lines of legacy multiset bus / aux trace code now that
`MidenLookupAuxBuilder` and `MidenLookupAir` are wired through
`prove_stark` and `ProcessorAir::eval`.

Air side:
- Delete `air/src/constraints/{decoder,stack,range,chiplets/bus*}.rs`
  — the per-component bus constraint files. Drop `pub mod bus`
  declarations from each parent module and the orphaned
  `air/src/constraints/bus.rs` index.
- Delete `enforce_bus`, `enforce_bus_first_row`, `enforce_bus_last_row`
  from `air/src/constraints/mod.rs`.
- Delete the orphaned `air/src/trace/aux_trace.rs` (an unused
  dependency-inversion trait that never had a real implementor).
- Collapse the legacy `AUX_TRACE_*_OFFSET` / `AUX_TRACE_WIDTH = 8`
  layout in `air/src/trace/mod.rs` down to
  `AUX_TRACE_WIDTH = LOGUP_AUX_TRACE_WIDTH (= 7)`. Strip dead column-index
  constants (`P1_COL_IDX`, `P2_COL_IDX`, `P3_COL_IDX`, `B_RANGE_COL_IDX`,
  hasher `P1_COL_IDX`) from the per-component trace modules.
- Delete the legacy `AuxCols<T>` struct + `NUM_AUX_COLS` + `AUX_COL_MAP`
  + `aux_col_map` test in `air/src/constraints/columns.rs`. Remove the
  unused `F_5` / `F_6` field constants left behind by the bus deletion.

Processor side:
- Delete the four legacy aux trace builders:
  `processor/src/trace/{decoder,stack,range,chiplets}/aux_trace*`
  (decoder block stack/hash/op group tables, stack overflow, range checker,
  chiplets hash kernel / bitwise / memory / ACE / virtual table / wiring bus)
- Delete `processor/src/debug.rs` (`BusMessage` trait + `BusDebugger`
  struct, only used by the deleted aux builders) and the dead
  `AuxColumnBuilder` trait in `processor/src/trace/utils.rs`.
- Delete `AuxTraceBuilders` struct, its `build_aux_columns` impl, and
  its `AuxBuilder` trait impl from `processor/src/trace/mod.rs`. Drop
  the field, accessor, and constructor parameter from `ExecutionTrace`.
  Delete the orphaned `ExecutionTrace::build_aux_trace` helper.
- Strip the dead `AceHints` / `EvaluatedCircuitsMetadata` infrastructure
  from `processor/src/trace/chiplets/ace/mod.rs` (the wiring-bus
  metadata + divisor-construction loop only existed for the legacy
  ACE wiring bus aux builder). The ACE chiplet `fill_trace` is now a
  pure trace filler with no side-channel return, and runs in the same
  parallel rayon scope as the other chiplets.
- Delete the legacy `processor/src/trace/tests/{decoder,stack,range,
  hasher,chiplets/}` aux trace builder tests. The cross-check oracle
  test in `processor/src/trace/tests/lookup.rs` stays — it remains the
  algebraic safety net for the LogUp collection pipeline.

Verified with `cargo check --workspace --all-targets`, the cross-check
oracle test, full miden-air + miden-processor lib suites (46 + 1466
passing), and `test_blake3_256_prove_verify` end-to-end.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sweep up the leftover dead code from the multiset deletion: a duplicate
`build_op_group` helper in the decoder test surface (shadowed by a
local copy in `core_trace_fragment/tests.rs`), the unused
`build_span_with_respan_ops` and `build_trace_from_ops_with_inputs`
test helpers, and a handful of imports left behind in `parallel/tests`,
`trace/utils`, and `trace/tests/mod.rs` after the legacy aux trace
tests were removed. The aux-trace-builder comparison block in
`parallel/tests::test_run_with_fragments_matches_single_fragment` is
also dropped since `ExecutionTrace::build_aux_trace` no longer exists
and the deterministic-trace snapshot still covers the same equality.

Workspace is now warning-clean apart from the pre-existing unused
`mstream` / `pipe` op-flag getter warning in `air/src/constraints/op_flags`,
which is unrelated to this milestone.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Start re-populating the `processor/src/trace/tests/` tree that commit
974cde5 cleared alongside the legacy multiset bus. Introduces a shared
`LookupHarness` that wraps `build_lookup_fractions` + `accumulate` with
per-row delta and terminal accessors, keyed by `aux_col::*` column
constants. Tests hand-construct expected `LookupMessage` instances via
the `logup_msg` structs and assert `aux[r+1][col] - aux[r][col]` matches
`-1 / msg.encode(&challenges)` at each request row — a strictly stronger
successor to the old `b_chip[r]` step-by-step walk.

`chiplets/bitwise.rs` is the representative restoration, covering
U32and/U32xor on `CHIPLET_REQUESTS` (M3). Layer 3 (column closure)
is intentionally deferred: the `diagnostic_multi_batch_terminals` test
in `tests/lookup.rs` documents that columns M_2+5, M3, and C1 carry
open boundary contributions until the LogUp boundary wiring lands.

Air-side: re-exports `LookupMessage` + the `logup_msg` module from the
previously-internal `constraints::lookup` / `constraints::logup_msg`
paths so the processor tests can encode messages without leaking other
internals. Test helpers `build_span_with_respan_ops` and
`build_trace_from_ops_with_inputs` are restored as well since the
follow-up decoder / hasher / memory / range / sibling-table
restorations depend on them.

A `TODO(logup-overflow)` note in `trace/tests/mod.rs` points at
`git show 974cde5^:processor/src/trace/tests/stack.rs` for the
deferred stack-overflow-table test.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds the two missing buses to MidenLookupAir, completing the migration
of legacy multiset arguments to LogUp:

- M5 / BUS_STACK_OVERFLOW_TABLE — three interactions on (clk, val, prev):
  right_shift add, left_shift∧overflow remove, dyncall∧overflow remove
  (DYNCALL uses hasher_state[5] for the new overflow pointer since b1'
  is reset by the call). Matches the legacy stack overflow bus exactly.

- BUS_HASHER_PERM_LINK on the shared C3 v_wiring column — binds hasher
  controller rows to permutation sub-chiplet rows via four interactions
  on (label, state[0..12]): controller input/output (+1, label 0/1),
  perm row 0 / row 15 (-m, label 0/1) where m is read from
  PermutationCols.multiplicity. Closes the structural gap that let the
  permutation segment run independently from the controller.

Sharing v_wiring with ACE wiring is sound because the two buses use
distinct bus_prefix[bus] additive bases — their fractions are linearly
independent in the extension field and cannot interfere on the shared
running accumulator. Renames emit_ace_wiring → emit_v_wiring and bumps
the column's MAX_INTERACTIONS_PER_ROW from 3 to 4.

ChipletActiveFlags gains a permutation field (= s_perm = perm_seg) so
the perm-link emitter can gate its perm-row interactions without
reaching past the chiplet active-flag abstraction.

LOGUP_AUX_TRACE_WIDTH grows 7 → 8 (one new M5 column; the perm-link
adds no new column thanks to the v_wiring share). MAIN_COLUMN_SHAPE
grows to 5 entries; MIDEN_COLUMN_SHAPE to 8.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Restore the LogUp boundary identity check on top of the Milestone B
stateless `MidenLookupAuxBuilder` integration, so `prove_verify` round-
trips close algebraically rather than by stub. Three threads of work:

**LogUp boundary constraints.** Re-enable the `when_first_row` and
`when_transition` assertions inside
`ConstraintColumn::column` (the transition was already active; the
first-row `acc == 0` is new). The symmetric `when_last_row` (binding
`aux[N-1]` to the committed final) stays commented out — it exceeds
the degree-9 budget on M_2+5 even though it mirrors the transition
shape; soundness of the verifier-side boundary check now relies on the
committed finals being honestly derived from the aux trace, to be
tightened in a follow-up.

**`ProcessorAir::reduced_aux_values`.** Restore the real boundary
formula: `sum = Σ aux_finals[col] + total_correction` where
`total_correction` cancels the three open buses:

- `c_block_hash = +1 / encode(BLOCK_HASH_TABLE, [0, ph[0..4], 0, 0])`
  — the decoder's END on the root block emits one unmatched `remove`.
- `c_log_precompile = 1/d_initial − 1/d_final` — the LOGPRECOMPILE
  transcript chain telescopes to one unmatched remove (initial state)
  and one unmatched add (final state).
- `c_kernel_rom = −Σ 1 / d_kernel_proc_msg_i over VLPI[0]` — the
  kernel ROM chiplet emits an `add` per kernel proc init, matched by
  the verifier via variable-length public inputs.

The four helper functions (`program_hash_message`,
`transcript_messages`, `kernel_proc_message`,
`kernel_logup_correction_from_var_len`) come out of their
`#[allow(dead_code)]` wrapper and get live callers again.

**Bus closure bug fixes** (found via parallel planning sessions that
added `BUSDBG` instrumentation to the prover and a Python groupby-d
analysis):

- Memory chiplet word-index range checks (`w0`, `w1`, `w1<<2` from
  `memory/mod.rs:291`) now have consumers on M4 via
  `block_stack_and_range_logcap.rs`, closing the M1↔M4 range_check
  gap. The file merges the legacy `block_stack.rs` + `range_logcap.rs`
  emitters into one so the range-check response and request sides live
  together.
- The RESPAN request in `chiplet_requests.rs` used `addr_next - 1` for
  the hasher parent; fixed to `addr_next` directly (matches the
  controller/perm split — the next row's decoder `addr` already points
  at the continuation input row).
- MSTREAM and PIPE chiplet bus requests were missing entirely; added
  two-word memory reads/writes and dropped the `TODO` / `dead_code`
  markers on `op_flags::{mstream,pipe}`.

**Width bump + test updates.** `LOGUP_AUX_TRACE_WIDTH` is 8 (5 main +
3 chiplet); test harnesses and shape assertions in
`processor/src/trace/tests/lookup.rs`, `lookup_harness.rs`, and the
bitwise chiplet bus test get the new value imported from `miden_air`.
The `aux_col` module in `lookup_harness.rs` collapses `HASHER_PERM_LINK`
into the shared `V_WIRING` slot (col 7), matching the 2b81a86c1 decision
to ride the perm-link bus on the ACE-wiring column via distinct
`bus_prefix` bases.

**Snapshot regeneration.** `RELATION_DIGEST` in `config.rs` and
`mod.masm`, plus the insta snapshot and `constraints_eval.masm`, are
regenerated to reflect the new constraint system
(`num_eval_gates: 5520`).

Verified on `test_blake3_256_prove_verify` end-to-end (debug print
shows `residual = [0, 0]`), the full non-recursive `prove_verify` suite
(10 / 10 passing across Blake3/Keccak/Rpo/Poseidon2/Rpx + both proptests
+ fast-parallel variants), the cross-check oracle, and
`miden-air --lib` (including the snapshot test and the degree-budget
test).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@adr1anh adr1anh changed the base branch from next to adr1anh/constraint-simplification April 15, 2026 03:23
adr1anh and others added 5 commits April 15, 2026 11:51
Two small CI fixups surfaced by the `check all feature combinations`
and `Build Documentation` jobs on `adr1anh/bus`:

- `air/src/constraints/lookup/miden_air.rs::SeededRng::next_felt` used
  `miden_crypto::rand::test_utils::prng_value`, which lives behind a
  feature flag not enabled under `cargo check --all-targets
  --no-default-features --manifest-path air/Cargo.toml`. Replaced the
  call with an inline SplitMix64 mixer on `(seed, counter)` — still
  reproducible across runs, still sufficient for the cached-encoding
  equivalence test, no external dep on miden-crypto's test-utils.

- `docs/src/design/bus_constraint_inventory.md` lines 993 and 1034 had
  bare-text formulas `deg(...) <= D_max` and
  `current_column_degree + cost(...) <= D_max`. Docusaurus' MDX parser
  sees the `<` and starts looking for a JSX tag, then chokes on the
  bare `=`. Wrapped both in inline backticks so MDX treats them as
  code and ignores the `<`.

Verified with `cargo check -p miden-air --all-targets --no-default-features`
(clean) and `cargo test -p miden-air --lib miden_lookup_air_cached_encoding_equivalence`
(passes with the new PRNG).
Move the four internal design docs authored during the all-LogUp bus
refactor out of `docs/src/design/` — they're working notes, not
user-facing documentation, and Docusaurus auto-indexes the folder so
they'd end up shipped on the site if left in place:

- `bus_api_design_notes.md` — closure-based `LookupBuilder` API
  rationale and trait-stack layout sketch.
- `bus_constraint_inventory.md` — per-bus constraint audit from the
  legacy multiset model, plus the column-packing degree-budget
  worksheet that drove the `MidenLookupAir` shape.
- `bus_packing_summary.md` — distilled packing decisions per column.
- `lookup_air_plan.md` — task-by-task execution log for the refactor.

The files live on locally under
`~/.claude/projects/-Users-adrian-Developer-miden-vm/memory/pr_bus_notes/`
for future reference; this commit only removes them from the tree.

Also drops the two dangling doc-comment references to
`docs/src/design/lookup_air_plan.md` in
`air/src/constraints/lookup/{mod,builder}.rs`.
Replace the Milestone-B empty stub in `reduced_aux_batch_config` with a
real LogUp boundary identity check in the ACE circuit DAG:

    0 = Σ aux_bound[0..7] + c_block_hash + c_log_precompile + c_kernel_rom

The two fixed-length corrections (block_hash, log_precompile) are rebuilt
inside the DAG as rational fractions `(±1, d)` and folded into a running
`(N, D)` pair — no in-circuit inversion needed. The kernel-ROM correction
is computed in MASM via `ext2inv + ext2add` accumulation and handed in as
a scalar via `VlpiReduction(0)`. The final boundary check is the quadratic
identity `(Σ aux_bound + c_kr) · D + N = 0`, batched with γ against the
constraint evaluation.

Key changes:

- **air/src/ace.rs**: new `LogUpBoundaryConfig` / `batch_logup_boundary` /
  `logup_boundary_config` replacing the old `ReducedAuxBatchConfig` stub.
  Reuses existing `encode_bus_message` for in-DAG message construction.

- **crates/ace-codegen/src/circuit.rs**: encoder now appends a trailing
  `add(root, 0)` when the DAG root isn't the last emitted op, fixing a
  Rust/MASM evaluator disagreement (the ACE chiplet reads "last wire" as
  output, not a logical root pointer).

- **crates/lib/core/asm/sys/vm/public_inputs.masm**: `reduce_kernel_digests`
  changed from multiplicative product (`ext2mul`) to sum-of-reciprocals
  (`ext2inv + ext2add + ext2neg`), matching `kernel_logup_correction_from_var_len`
  in the Rust verifier.

- Regenerated `constraints_eval.masm`, `RELATION_DIGEST`, and insta snapshot.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Merge the three encoding-primitive methods (beta_powers, bus_prefix,
insert_encoded) from the separate EncodedLookupGroup trait into
LookupGroup with panicking default implementations. This simplifies the
trait hierarchy by removing the EncodedGroup GAT from LookupColumn and
using a single Group type for both closures of
group_with_cached_encoding.

The prover path now unconditionally panics on encoding-primitive calls
(previously used debug_assert which was silent in release builds).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Base automatically changed from adr1anh/constraint-simplification to next April 16, 2026 15:51
Nashtare added a commit that referenced this pull request May 7, 2026
The block-hash bus boundary correction emits `Child { parent: 0,
child_hash: program_hash }` while the in-trace removal at the root END
row uses `End { parent, child_hash, is_first_child, is_loop_body }`.
Per the PR author's algebra, the two collapse to identical denominators
because at the root END row:

  - the next op is HALT, forcing addr_next = 0 ⇒ parent = addr_next = 0;
  - halt_next() = 1 ⇒ is_first_child = 1 - end_next - repeat_next - halt_next = 0;
  - the root block is not a loop body ⇒ is_loop_body = 0;
  - child_hash = h_0 = program_hash by the decoder's program-hash boundary.

With those four substitutions, `End` and `Child` produce the same
`(parent, child_hash, 0, 0)` payload tuple in `BlockHashMsg::encode`.

Document the algebra inline at the boundary emission site and add a
regression test that constructs both messages with the matching values
and asserts their encoded denominators are equal.

Refs: items.md A3, audit-report.md H-3 (PR #2962 close-out hardening)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nashtare added a commit that referenced this pull request May 7, 2026
The MASM recursive verifier hardcodes `gamma = beta^16` via 4 sequential
squarings in `crates/lib/core/asm/sys/vm/public_inputs.masm:239-251`. If
the Rust constant `MIDEN_MAX_MESSAGE_WIDTH` ever moves off 16, the MASM
needs a matching change (e.g. 5 squarings for beta^32) — but until now
nothing forced both to move together.

Add a `const _: () = assert!(MIDEN_MAX_MESSAGE_WIDTH == 16)` tripwire
that fails the build if the Rust side changes alone, with the MASM file
path called out in the message so the next person knows where to look.

Refs: items.md A20, audit-report.md L-5 (PR #2962 close-out hardening)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nashtare added a commit that referenced this pull request May 7, 2026
…ecks

Two single-line debug_assert additions flagged in the PR #2962 audit:

- `Challenges::new` already asserted max_message_width > 0 but not
  num_bus_ids > 0; a zero-length bus_prefix table would panic with an
  obscure index-out-of-bounds on first encode. Add the matching assert.

- `build_logup_aux_trace` reads challenges[0]/challenges[1] by index;
  a future short slice would panic with the same opaque message. Add a
  debug_assert with a clear message.

Both are cheap defensive checks. No behavior change for correct input.

Refs: items.md A6, A7 (PR #2962 close-out hardening)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nashtare added a commit that referenced this pull request May 7, 2026
Today only `BusId::HasherPermLinkOutput` (the high watermark) is pinned.
That catches gaps but not reorders: swapping two variants with adjacent
discriminants would silently swap which `bus_prefix[i]` each emitter
resolves to, breaking domain separation across every bus consumer.

Replace the single assertion with one `const _: () = assert!(...)` per
BusId variant, pinning the whole layout. Any future reorder now fails
at compile time at the line of the moved variant.

Refs: items.md A14, audit-report.md L-4 (PR #2962 close-out hardening)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nashtare added a commit that referenced this pull request May 7, 2026
…inants

The `bus: BusId` field on `MemoryMsg::Element` and `MemoryMsg::Word` is set
by the typed constructors `read_element` / `write_element` / `read_word` /
`write_word` to one of MemoryReadElement / MemoryWriteElement /
MemoryReadWord / MemoryWriteWord. Until now, external callers could
construct these variants directly with an arbitrary BusId — e.g.
`MemoryMsg::Word { bus: BusId::Bitwise, ... }` — silently breaking the
domain-separation guarantee.

Rust enum variants don't allow per-field visibility, so the standard fix
is `#[non_exhaustive]` on each variant: same-crate code can still match
and construct freely (preserving today's call sites), but external
callers are forced through the typed helpers.

Refs: items.md A4, audit-report.md M-6 (PR #2962 close-out hardening)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nashtare added a commit that referenced this pull request May 7, 2026
The prover path short-circuits on `flag == F::ZERO`, while the constraint
path evaluates the encode unconditionally. They agree only when
`flag ∈ {0, 1}`. Every Miden bus emitter today drives `flag` as a product
of decoder/op selectors that the AIR pins boolean, but the invariant is
implicit — a future emitter that introduces an arithmetic flag (e.g.
`s0 - s01` style subtractions producing a non-boolean intermediate) would
silently fork prover and constraint paths.

Add `debug_assert!(flag == F::ZERO || flag == F::ONE)` on both
ProverGroup::insert and ProverGroup::batch so any future regression is
caught at test time.

Refs: items.md A5, audit-report.md M-8 (PR #2962 close-out hardening)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nashtare added a commit that referenced this pull request May 7, 2026
The block-hash bus boundary correction emits `Child { parent: 0,
child_hash: program_hash }` while the in-trace removal at the root END
row uses `End { parent, child_hash, is_first_child, is_loop_body }`.
Per the PR author's algebra, the two collapse to identical denominators
because at the root END row:

  - the next op is HALT, forcing addr_next = 0 ⇒ parent = addr_next = 0;
  - halt_next() = 1 ⇒ is_first_child = 1 - end_next - repeat_next - halt_next = 0;
  - the root block is not a loop body ⇒ is_loop_body = 0;
  - child_hash = h_0 = program_hash by the decoder's program-hash boundary.

With those four substitutions, `End` and `Child` produce the same
`(parent, child_hash, 0, 0)` payload tuple in `BlockHashMsg::encode`.

Document the algebra inline at the boundary emission site and add a
regression test that constructs both messages with the matching values
and asserts their encoded denominators are equal.

Refs: items.md A3, audit-report.md H-3 (PR #2962 close-out hardening)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nashtare added a commit that referenced this pull request May 7, 2026
The MASM recursive verifier hardcodes `gamma = beta^16` via 4 sequential
squarings in `crates/lib/core/asm/sys/vm/public_inputs.masm:239-251`. If
the Rust constant `MIDEN_MAX_MESSAGE_WIDTH` ever moves off 16, the MASM
needs a matching change (e.g. 5 squarings for beta^32) — but until now
nothing forced both to move together.

Add a `const _: () = assert!(MIDEN_MAX_MESSAGE_WIDTH == 16)` tripwire
that fails the build if the Rust side changes alone, with the MASM file
path called out in the message so the next person knows where to look.

Refs: items.md A20, audit-report.md L-5 (PR #2962 close-out hardening)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nashtare added a commit that referenced this pull request May 7, 2026
…ecks

Two single-line debug_assert additions flagged in the PR #2962 audit:

- `Challenges::new` already asserted max_message_width > 0 but not
  num_bus_ids > 0; a zero-length bus_prefix table would panic with an
  obscure index-out-of-bounds on first encode. Add the matching assert.

- `build_logup_aux_trace` reads challenges[0]/challenges[1] by index;
  a future short slice would panic with the same opaque message. Add a
  debug_assert with a clear message.

Both are cheap defensive checks. No behavior change for correct input.

Refs: items.md A6, A7 (PR #2962 close-out hardening)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nashtare added a commit that referenced this pull request May 7, 2026
Today only `BusId::HasherPermLinkOutput` (the high watermark) is pinned.
That catches gaps but not reorders: swapping two variants with adjacent
discriminants would silently swap which `bus_prefix[i]` each emitter
resolves to, breaking domain separation across every bus consumer.

Replace the single assertion with one `const _: () = assert!(...)` per
BusId variant, pinning the whole layout. Any future reorder now fails
at compile time at the line of the moved variant.

Refs: items.md A14, audit-report.md L-4 (PR #2962 close-out hardening)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nashtare added a commit that referenced this pull request May 7, 2026
…inants

The `bus: BusId` field on `MemoryMsg::Element` and `MemoryMsg::Word` is set
by the typed constructors `read_element` / `write_element` / `read_word` /
`write_word` to one of MemoryReadElement / MemoryWriteElement /
MemoryReadWord / MemoryWriteWord. Until now, external callers could
construct these variants directly with an arbitrary BusId — e.g.
`MemoryMsg::Word { bus: BusId::Bitwise, ... }` — silently breaking the
domain-separation guarantee.

Rust enum variants don't allow per-field visibility, so the standard fix
is `#[non_exhaustive]` on each variant: same-crate code can still match
and construct freely (preserving today's call sites), but external
callers are forced through the typed helpers.

Refs: items.md A4, audit-report.md M-6 (PR #2962 close-out hardening)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nashtare added a commit that referenced this pull request May 7, 2026
The prover path short-circuits on `flag == F::ZERO`, while the constraint
path evaluates the encode unconditionally. They agree only when
`flag ∈ {0, 1}`. Every Miden bus emitter today drives `flag` as a product
of decoder/op selectors that the AIR pins boolean, but the invariant is
implicit — a future emitter that introduces an arithmetic flag (e.g.
`s0 - s01` style subtractions producing a non-boolean intermediate) would
silently fork prover and constraint paths.

Add `debug_assert!(flag == F::ZERO || flag == F::ONE)` on both
ProverGroup::insert and ProverGroup::batch so any future regression is
caught at test time.

Refs: items.md A5, audit-report.md M-8 (PR #2962 close-out hardening)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nashtare added a commit that referenced this pull request May 7, 2026
The block-hash bus boundary correction emits `Child { parent: 0,
child_hash: program_hash }` while the in-trace removal at the root END
row uses `End { parent, child_hash, is_first_child, is_loop_body }`.
Per the PR author's algebra, the two collapse to identical denominators
because at the root END row:

  - the next op is HALT, forcing addr_next = 0 ⇒ parent = addr_next = 0;
  - halt_next() = 1 ⇒ is_first_child = 1 - end_next - repeat_next - halt_next = 0;
  - the root block is not a loop body ⇒ is_loop_body = 0;
  - child_hash = h_0 = program_hash by the decoder's program-hash boundary.

With those four substitutions, `End` and `Child` produce the same
`(parent, child_hash, 0, 0)` payload tuple in `BlockHashMsg::encode`.

Document the algebra inline at the boundary emission site and add a
regression test that constructs both messages with the matching values
and asserts their encoded denominators are equal.

Refs: items.md A3, audit-report.md H-3 (PR #2962 close-out hardening)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nashtare added a commit that referenced this pull request May 7, 2026
The MASM recursive verifier hardcodes `gamma = beta^16` via 4 sequential
squarings in `crates/lib/core/asm/sys/vm/public_inputs.masm:239-251`. If
the Rust constant `MIDEN_MAX_MESSAGE_WIDTH` ever moves off 16, the MASM
needs a matching change (e.g. 5 squarings for beta^32) — but until now
nothing forced both to move together.

Add a `const _: () = assert!(MIDEN_MAX_MESSAGE_WIDTH == 16)` tripwire
that fails the build if the Rust side changes alone, with the MASM file
path called out in the message so the next person knows where to look.

Refs: items.md A20, audit-report.md L-5 (PR #2962 close-out hardening)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nashtare added a commit that referenced this pull request May 8, 2026
…ecks

Two single-line debug_assert additions flagged in the PR #2962 audit:

- `Challenges::new` already asserted max_message_width > 0 but not
  num_bus_ids > 0; a zero-length bus_prefix table would panic with an
  obscure index-out-of-bounds on first encode. Add the matching assert.

- `build_logup_aux_trace` reads challenges[0]/challenges[1] by index;
  a future short slice would panic with the same opaque message. Add a
  debug_assert with a clear message.

Both are cheap defensive checks. No behavior change for correct input.

Refs: items.md A6, A7 (PR #2962 close-out hardening)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nashtare added a commit that referenced this pull request May 8, 2026
Today only `BusId::HasherPermLinkOutput` (the high watermark) is pinned.
That catches gaps but not reorders: swapping two variants with adjacent
discriminants would silently swap which `bus_prefix[i]` each emitter
resolves to, breaking domain separation across every bus consumer.

Replace the single assertion with one `const _: () = assert!(...)` per
BusId variant, pinning the whole layout. Any future reorder now fails
at compile time at the line of the moved variant.

Refs: items.md A14, audit-report.md L-4 (PR #2962 close-out hardening)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nashtare added a commit that referenced this pull request May 8, 2026
…inants

The `bus: BusId` field on `MemoryMsg::Element` and `MemoryMsg::Word` is set
by the typed constructors `read_element` / `write_element` / `read_word` /
`write_word` to one of MemoryReadElement / MemoryWriteElement /
MemoryReadWord / MemoryWriteWord. Until now, external callers could
construct these variants directly with an arbitrary BusId — e.g.
`MemoryMsg::Word { bus: BusId::Bitwise, ... }` — silently breaking the
domain-separation guarantee.

Rust enum variants don't allow per-field visibility, so the standard fix
is `#[non_exhaustive]` on each variant: same-crate code can still match
and construct freely (preserving today's call sites), but external
callers are forced through the typed helpers.

Refs: items.md A4, audit-report.md M-6 (PR #2962 close-out hardening)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nashtare added a commit that referenced this pull request May 8, 2026
The prover path short-circuits on `flag == F::ZERO`, while the constraint
path evaluates the encode unconditionally. They agree only when
`flag ∈ {0, 1}`. Every Miden bus emitter today drives `flag` as a product
of decoder/op selectors that the AIR pins boolean, but the invariant is
implicit — a future emitter that introduces an arithmetic flag (e.g.
`s0 - s01` style subtractions producing a non-boolean intermediate) would
silently fork prover and constraint paths.

Add `debug_assert!(flag == F::ZERO || flag == F::ONE)` on both
ProverGroup::insert and ProverGroup::batch so any future regression is
caught at test time.

Refs: items.md A5, audit-report.md M-8 (PR #2962 close-out hardening)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nashtare added a commit that referenced this pull request May 8, 2026
The block-hash bus boundary correction emits `Child { parent: 0,
child_hash: program_hash }` while the in-trace removal at the root END
row uses `End { parent, child_hash, is_first_child, is_loop_body }`.
Per the PR author's algebra, the two collapse to identical denominators
because at the root END row:

  - the next op is HALT, forcing addr_next = 0 ⇒ parent = addr_next = 0;
  - halt_next() = 1 ⇒ is_first_child = 1 - end_next - repeat_next - halt_next = 0;
  - the root block is not a loop body ⇒ is_loop_body = 0;
  - child_hash = h_0 = program_hash by the decoder's program-hash boundary.

With those four substitutions, `End` and `Child` produce the same
`(parent, child_hash, 0, 0)` payload tuple in `BlockHashMsg::encode`.

Document the algebra inline at the boundary emission site and add a
regression test that constructs both messages with the matching values
and asserts their encoded denominators are equal.

Refs: items.md A3, audit-report.md H-3 (PR #2962 close-out hardening)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nashtare added a commit that referenced this pull request May 8, 2026
The MASM recursive verifier hardcodes `gamma = beta^16` via 4 sequential
squarings in `crates/lib/core/asm/sys/vm/public_inputs.masm:239-251`. If
the Rust constant `MIDEN_MAX_MESSAGE_WIDTH` ever moves off 16, the MASM
needs a matching change (e.g. 5 squarings for beta^32) — but until now
nothing forced both to move together.

Add a `const _: () = assert!(MIDEN_MAX_MESSAGE_WIDTH == 16)` tripwire
that fails the build if the Rust side changes alone, with the MASM file
path called out in the message so the next person knows where to look.

Refs: items.md A20, audit-report.md L-5 (PR #2962 close-out hardening)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nashtare added a commit that referenced this pull request May 8, 2026
…ecks

Two single-line debug_assert additions flagged in the PR #2962 audit:

- `Challenges::new` already asserted max_message_width > 0 but not
  num_bus_ids > 0; a zero-length bus_prefix table would panic with an
  obscure index-out-of-bounds on first encode. Add the matching assert.

- `build_logup_aux_trace` reads challenges[0]/challenges[1] by index;
  a future short slice would panic with the same opaque message. Add a
  debug_assert with a clear message.

Both are cheap defensive checks. No behavior change for correct input.

Refs: items.md A6, A7 (PR #2962 close-out hardening)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nashtare added a commit that referenced this pull request May 8, 2026
Today only `BusId::HasherPermLinkOutput` (the high watermark) is pinned.
That catches gaps but not reorders: swapping two variants with adjacent
discriminants would silently swap which `bus_prefix[i]` each emitter
resolves to, breaking domain separation across every bus consumer.

Replace the single assertion with one `const _: () = assert!(...)` per
BusId variant, pinning the whole layout. Any future reorder now fails
at compile time at the line of the moved variant.

Refs: items.md A14, audit-report.md L-4 (PR #2962 close-out hardening)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nashtare added a commit that referenced this pull request May 8, 2026
…inants

The `bus: BusId` field on `MemoryMsg::Element` and `MemoryMsg::Word` is set
by the typed constructors `read_element` / `write_element` / `read_word` /
`write_word` to one of MemoryReadElement / MemoryWriteElement /
MemoryReadWord / MemoryWriteWord. Until now, external callers could
construct these variants directly with an arbitrary BusId — e.g.
`MemoryMsg::Word { bus: BusId::Bitwise, ... }` — silently breaking the
domain-separation guarantee.

Rust enum variants don't allow per-field visibility, so the standard fix
is `#[non_exhaustive]` on each variant: same-crate code can still match
and construct freely (preserving today's call sites), but external
callers are forced through the typed helpers.

Refs: items.md A4, audit-report.md M-6 (PR #2962 close-out hardening)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nashtare added a commit that referenced this pull request May 8, 2026
The prover path short-circuits on `flag == F::ZERO`, while the constraint
path evaluates the encode unconditionally. They agree only when
`flag ∈ {0, 1}`. Every Miden bus emitter today drives `flag` as a product
of decoder/op selectors that the AIR pins boolean, but the invariant is
implicit — a future emitter that introduces an arithmetic flag (e.g.
`s0 - s01` style subtractions producing a non-boolean intermediate) would
silently fork prover and constraint paths.

Add `debug_assert!(flag == F::ZERO || flag == F::ONE)` on both
ProverGroup::insert and ProverGroup::batch so any future regression is
caught at test time.

Refs: items.md A5, audit-report.md M-8 (PR #2962 close-out hardening)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nashtare added a commit that referenced this pull request May 8, 2026
The block-hash bus boundary correction emits `Child { parent: 0,
child_hash: program_hash }` while the in-trace removal at the root END
row uses `End { parent, child_hash, is_first_child, is_loop_body }`.
Per the PR author's algebra, the two collapse to identical denominators
because at the root END row:

  - the next op is HALT, forcing addr_next = 0 ⇒ parent = addr_next = 0;
  - halt_next() = 1 ⇒ is_first_child = 1 - end_next - repeat_next - halt_next = 0;
  - the root block is not a loop body ⇒ is_loop_body = 0;
  - child_hash = h_0 = program_hash by the decoder's program-hash boundary.

With those four substitutions, `End` and `Child` produce the same
`(parent, child_hash, 0, 0)` payload tuple in `BlockHashMsg::encode`.

Document the algebra inline at the boundary emission site and add a
regression test that constructs both messages with the matching values
and asserts their encoded denominators are equal.

Refs: items.md A3, audit-report.md H-3 (PR #2962 close-out hardening)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nashtare added a commit that referenced this pull request May 8, 2026
The MASM recursive verifier hardcodes `gamma = beta^16` via 4 sequential
squarings in `crates/lib/core/asm/sys/vm/public_inputs.masm:239-251`. If
the Rust constant `MIDEN_MAX_MESSAGE_WIDTH` ever moves off 16, the MASM
needs a matching change (e.g. 5 squarings for beta^32) — but until now
nothing forced both to move together.

Add a `const _: () = assert!(MIDEN_MAX_MESSAGE_WIDTH == 16)` tripwire
that fails the build if the Rust side changes alone, with the MASM file
path called out in the message so the next person knows where to look.

Refs: items.md A20, audit-report.md L-5 (PR #2962 close-out hardening)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants