[mono-move] Add gas benchmarks by calintat · Pull Request #19470 · aptos-labs/aptos-core

calintat · 2026-04-16T14:21:12Z

Description

Add gas-instrumented variants to all micro-op benchmarks (fib, bst, merge_sort, nested_loop) and a new match_sum benchmark with a wide-diamond CFG shape. Each benchmark now runs both a plain and a gas-instrumented version, making it easy to measure gas metering overhead going forward.
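For readers unfamiliar with the term, a wide-diamond CFG has one dispatch point that fans out into many arms, all of which rejoin at a single merge block. The sketch below only illustrates that shape in plain Rust. The function name `match_sum_sketch`, the arm count, and the constants are assumptions made for illustration and are not taken from the PR's actual match_sum program.

```rust
/// Hypothetical stand-in for a match-heavy loop body: one dispatch block,
/// eight arms, and a single merge point where the accumulator is updated.
fn match_sum_sketch(n: u64) -> u64 {
    let mut sum = 0u64;
    for i in 0..n {
        // Fan-out: each arm is its own basic block.
        let term = match i % 8 {
            0 => 1,
            1 => 3,
            2 => 5,
            3 => 7,
            4 => 11,
            5 => 13,
            6 => 17,
            _ => 19,
        };
        // Fan-in: every arm rejoins here.
        sum = sum.wrapping_add(term);
    }
    sum
}

fn main() {
    println!("{}", match_sum_sketch(16));
}
```

With per-block gas charging, a shape like this would produce one charge per arm plus the dispatch and merge blocks, which is presumably why it is useful for stressing block-boundary instrumentation.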

How Has This Been Tested?

Key Areas to Review

Type of Change

Which Components or Systems Does This Change Impact?

Checklist

I have read and followed the CONTRIBUTING doc
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I identified and added all stakeholders and component owners affected by this change as reviewers
I tested both happy and unhappy path of the functionality
I have made corresponding changes to the documentation

Note

Low Risk
Low risk: changes are limited to benchmarking/test utilities and a new synthetic program, with minimal impact on runtime behavior outside benches.

Overview
Adds gas-measurement coverage to the micro-op benchmarks by running each benchmark in two modes: a plain execution using the new NoOpGasMeter, and a gas-instrumented execution that replays the same program after GasInstrumentor has inserted Charge ops.

Introduces a shared bench helper, gas_instrument, that clones micro-op Function tables into a fresh arena and instruments them. Adds a new match_sum synthetic program and Criterion bench (with correctness tests) whose wide-diamond CFG stresses basic-block boundary instrumentation. Also derives Copy and Clone for MicroOp to simplify handling in instrumentation and bench code.
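To make the two modes concrete, here is a toy sketch of block-level gas instrumentation. It is not the PR's GasInstrumentor: the `Op` enum, the one-unit-per-op cost model, and the `NoOpMeter` / `CountingMeter` types are all simplified assumptions. The sketch inserts one `Charge` op at the start of each basic block, remaps jump targets to account for the inserted ops, and runs the same program with and without metering.

```rust
use std::collections::BTreeSet;

/// Toy micro-op set (hypothetical; the real `MicroOp` is richer).
#[derive(Copy, Clone, Debug, PartialEq)]
enum Op {
    AddAcc(u64),
    DecCounter,
    JumpIfCounterZero(usize),
    Jump(usize),
    Return,
    Charge(u64),
}

trait GasMeter {
    fn charge(&mut self, amount: u64) -> Result<(), &'static str>;
}

struct NoOpMeter;
impl GasMeter for NoOpMeter {
    fn charge(&mut self, _amount: u64) -> Result<(), &'static str> {
        Ok(())
    }
}

struct CountingMeter {
    balance: u64,
}
impl GasMeter for CountingMeter {
    fn charge(&mut self, amount: u64) -> Result<(), &'static str> {
        self.balance = self.balance.checked_sub(amount).ok_or("out of gas")?;
        Ok(())
    }
}

/// Prefix every basic block with `Charge(block_len)` and fix up jump targets.
fn instrument(ops: &[Op]) -> Vec<Op> {
    // Block leaders: the entry, every jump target, and every fall-through
    // position after a control transfer.
    let mut leaders = BTreeSet::from([0usize]);
    for (i, op) in ops.iter().enumerate() {
        match *op {
            Op::Jump(t) | Op::JumpIfCounterZero(t) => {
                leaders.insert(t);
                leaders.insert(i + 1);
            }
            Op::Return => {
                leaders.insert(i + 1);
            }
            _ => {}
        }
    }
    leaders.retain(|&l| l < ops.len());
    let starts: Vec<usize> = leaders.into_iter().collect();
    // The k-th block has k Charge ops inserted before it, so its Charge op
    // lands at old_start + k. Jumps are redirected to that Charge op.
    let remap = |old: usize| old + starts.binary_search(&old).expect("target is a leader");
    let mut out = Vec::with_capacity(ops.len() + starts.len());
    for (k, &start) in starts.iter().enumerate() {
        let end = starts.get(k + 1).copied().unwrap_or(ops.len());
        out.push(Op::Charge((end - start) as u64));
        for op in &ops[start..end] {
            out.push(match *op {
                Op::Jump(t) => Op::Jump(remap(t)),
                Op::JumpIfCounterZero(t) => Op::JumpIfCounterZero(remap(t)),
                other => other,
            });
        }
    }
    out
}

/// Tiny interpreter, generic over the meter like the benchmarks' two modes.
fn run<G: GasMeter>(ops: &[Op], mut counter: u64, meter: &mut G) -> Result<u64, &'static str> {
    let (mut pc, mut acc) = (0usize, 0u64);
    loop {
        match ops[pc] {
            Op::AddAcc(v) => acc += v,
            Op::DecCounter => counter -= 1,
            Op::JumpIfCounterZero(t) => {
                if counter == 0 {
                    pc = t;
                    continue;
                }
            }
            Op::Jump(t) => {
                pc = t;
                continue;
            }
            Op::Return => return Ok(acc),
            Op::Charge(c) => meter.charge(c)?,
        }
        pc += 1;
    }
}

fn main() {
    // loop { if counter == 0 { break }; acc += 2; counter -= 1 }
    let plain = [Op::JumpIfCounterZero(4), Op::AddAcc(2), Op::DecCounter, Op::Jump(0), Op::Return];
    let gas = instrument(&plain);
    let plain_result = run(&plain, 3, &mut NoOpMeter);
    let mut meter = CountingMeter { balance: 1_000 };
    let gas_result = run(&gas, 3, &mut meter);
    println!("plain: {:?}", plain_result);
    println!("gas:   {:?}, used {}", gas_result, 1_000 - meter.balance);
}
```

Both runs compute the same result; the difference in wall-clock time between them is the kind of overhead the paired benchmarks are meant to expose.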

Reviewed by Cursor Bugbot for commit 54cf9d7. Bugbot is set up for automated code reviews on this repo.

cursor

Differential Security Review — [mono-move] Add gas benchmarks (PR #19470)

Date: 2026-04-16
Scope: 48c6902d0bb1fc2b04d5a89da07e1a3d2942698b...2bd66122499f5fc664346f13a353a94fa863d7c8
Reviewer: Automated differential review

Executive Summary

| Severity | Count |
| --- | --- |
| CRITICAL | 0 |
| HIGH | 0 |
| MEDIUM | 0 |
| LOW | 2 |

Overall risk: Low. The change is scoped entirely to benchmark infrastructure and test programs inside third_party/move/mono-move/programs/; no production execution path, no consensus, no VM dispatch logic, and no storage are touched.

Key metrics: 12 files changed (12 Rust, 0 Move), 1 module touched (mono-move/programs + mono-move/gas), 2 findings.

Recommendation: APPROVE WITH NOTES

What Changed

Files changed: 12 (all Rust) | Lines: +545 / -8

| Module | Files Changed | Risk Level |
| --- | --- | --- |
| `mono-move/gas/src/lib.rs` | 1 | Low |
| `mono-move/core/src/instruction/mod.rs` | 1 | Low |
| `mono-move/programs/benches/` | 5 (3 modified, 2 new) | Low |
| `mono-move/programs/src/` | 2 (1 modified, 1 new) | Low |
| `mono-move/programs/tests/` | 1 (new) | Low |
| `mono-move/programs/Cargo.toml` | 1 | Trivial |

Findings

[LOW] Finding 1 — `NoOpGasMeter` exported from production crate without a feature or `#[cfg]` guard

File: third_party/move/mono-move/gas/src/lib.rs:82–92
Test coverage: Not tested (intended as bench/test helper)

Description: NoOpGasMeter is defined as a top-level pub struct in mono_move_gas::lib, the same crate and visibility level as SimpleGasMeter. Its doc comment says "for testing," but there is no #[cfg(any(test, feature = "testing"))] gate, no #[doc(hidden)], and no module-level barrier preventing its use in production execution paths.

InterpreterContext in the runtime is generic over G: GasMeter. Any future code that passes NoOpGasMeter where a real meter is expected would silently bypass all gas enforcement, with balance() always returning u64::MAX.

Concrete impact: No current exploit — the type is only consumed by bench binaries in this PR. The risk is that this API footgun grows into a future misuse as the codebase matures.

Why here: The type was introduced specifically to support the micro_op (plain, no-instrumentation) benchmark variants. A #[cfg(test)] guard or placement inside a testing feature gate would match the pattern already used elsewhere (e.g. #[cfg(feature = "testing")] in programs/src/lib.rs).
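A minimal sketch of the suggested gating follows. The trait shape, the `OutOfGas` error type, and the `SimpleGasMeter` body are assumptions for illustration; only the names `GasMeter`, `SimpleGasMeter`, `NoOpGasMeter`, and the `u64::MAX` balance come from the review text above. The `testing` feature is assumed to be declared in the crate's Cargo.toml.

```rust
/// Assumed error type; the real crate may report exhaustion differently.
#[derive(Debug, PartialEq)]
pub struct OutOfGas;

/// Assumed minimal shape of the gas meter trait.
pub trait GasMeter {
    fn charge(&mut self, amount: u64) -> Result<(), OutOfGas>;
    fn balance(&self) -> u64;
}

/// Always available: the meter that actually enforces a budget.
pub struct SimpleGasMeter {
    balance: u64,
}

impl SimpleGasMeter {
    pub fn new(balance: u64) -> Self {
        Self { balance }
    }
}

impl GasMeter for SimpleGasMeter {
    fn charge(&mut self, amount: u64) -> Result<(), OutOfGas> {
        // On failure the balance is left untouched.
        self.balance = self.balance.checked_sub(amount).ok_or(OutOfGas)?;
        Ok(())
    }

    fn balance(&self) -> u64 {
        self.balance
    }
}

/// Compiled only for this crate's unit tests or when the `testing`
/// feature is enabled, so default builds cannot name the type at all.
#[cfg(any(test, feature = "testing"))]
pub struct NoOpGasMeter;

#[cfg(any(test, feature = "testing"))]
impl GasMeter for NoOpGasMeter {
    fn charge(&mut self, _amount: u64) -> Result<(), OutOfGas> {
        Ok(())
    }

    fn balance(&self) -> u64 {
        u64::MAX
    }
}

fn main() {
    let mut meter = SimpleGasMeter::new(10);
    println!("{:?}, balance {}", meter.charge(4), meter.balance());
    println!("{:?}, balance {}", meter.charge(7), meter.balance());
}
```

One design point worth noting: `cfg(test)` is only set when a crate is compiled for its own unit tests, so bench targets in another crate (such as the programs crate) would not see a type gated on `test` alone. Under this sketch they would need to enable the `testing` feature on the gas crate dependency.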

[LOW] Finding 2 — `gas_instrument` bench helper silently drops GC root metadata

File: third_party/move/mono-move/programs/benches/helpers.rs:228–238
Test coverage: Bench-only, not part of any test target

Description: gas_instrument builds gas-instrumented copies of functions but replaces both frame_layout and safe_point_layouts with empty stubs:

frame_layout: FrameLayoutInfo::empty(&arena),
safe_point_layouts: SortedSafePointEntries::empty(&arena),

The function's doc comment acknowledges this: "Frame layouts are re-created as empty; these benchmark programs do not trigger GC, so the omission has no effect on execution." That is correct for the four benchmark programs today (all scalar-only, no heap pointer slots).

However, the function signature accepts any &[Option<ExecutableArenaPtr<Function>>] with no type-level or debug_assert! enforcement of the "no heap pointer locals" precondition. If a future program with GC-managed pointer slots is passed to this helper, the GC would fail to scan pointer-holding frame slots, causing silent corruption. There is no #[cfg(test)] or #[cfg(feature = "testing")] guard on the function itself.

Concrete impact: No current exploit — all four programs passed to gas_instrument today are scalar-only. The risk is a latent copy-paste footgun for any future benchmark program that allocates heap objects.
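The precondition could be made explicit along the following lines. This is a sketch under stated assumptions: `FrameLayout`, `Function`, the `heap_ptr_offsets` field, and `gas_instrument_sketch` are simplified, hypothetical stand-ins, not the real arena-backed types or the actual helper.

```rust
/// Hypothetical, simplified frame layout: byte offsets of the frame slots
/// that hold heap pointers (the GC roots).
#[derive(Clone, Debug, Default)]
struct FrameLayout {
    heap_ptr_offsets: Vec<u32>,
}

/// Hypothetical, simplified function record.
#[derive(Clone, Debug)]
struct Function {
    name: &'static str,
    frame_layout: FrameLayout,
}

/// True when the function has no heap-pointer slots to scan.
fn is_scalar_only(f: &Function) -> bool {
    f.frame_layout.heap_ptr_offsets.is_empty()
}

/// Copies functions with empty frame layouts, as the bench helper is
/// described as doing, but checks the "no GC roots" precondition first.
fn gas_instrument_sketch(funcs: &[Function]) -> Vec<Function> {
    funcs
        .iter()
        .map(|f| {
            debug_assert!(
                is_scalar_only(f),
                "gas_instrument: `{}` has heap-pointer slots; an empty frame layout would hide GC roots",
                f.name
            );
            Function {
                name: f.name,
                frame_layout: FrameLayout::default(),
            }
        })
        .collect()
}

fn main() {
    let fib = Function {
        name: "fib",
        frame_layout: FrameLayout::default(),
    };
    let out = gas_instrument_sketch(&[fib]);
    println!("{} function(s) instrumented", out.len());
}
```

Because `debug_assert!` is compiled out when debug assertions are disabled, and Cargo's bench profile disables them by default, a plain `assert!` may be the better fit for one-time benchmark setup code like this.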

Test Coverage Analysis

Changed Function / Path	Coverage	Notes
`NoOpGasMeter`	Bench-only	No unit test asserting the no-op contract
`gas_instrument` helper	Bench-only	No test asserting correctness of instrumented output
`micro_op_match_sum` + bench	Well-tested	`tests/match_sum.rs` covers `native`, `micro_op`, `move_bytecode`
`MicroOp: Copy + Clone` derive	Implicit	Exercised by `raw.to_vec()` in `gas_instrument`

Blast Radius

All changed functions are confined to benchmark and test code; none are callable from the production execution pipeline. The NoOpGasMeter type is the only change reachable from outside test/bench contexts (as a pub export), but no production code currently imports it.

Correctness Notes (Not Findings)

- Arena lifetime in match_sum / nested_loop gas setup: let (functions, _, _arena) = micro_op_match_sum() followed by let (functions_gas, _arena) = unsafe { helpers::gas_instrument(&wrapped) } shadows the original _arena. This is correct: shadowing a binding does not drop the earlier value, so the original arena stays alive until the end of the enclosing scope and the functions/wrapped pointers into it remain valid during and after the gas_instrument call. Those pointers are also never used in any closure after the shadowing.
- Hardcoded functions[6] in bst.rs: Pre-existing pattern; the comment // Function 6 — run_ops in src/bst.rs:315 confirms the index is stable. Not introduced by this PR.
- Missing Function::resolve_calls for match_sum / nested_loop gas variants: Correct omission. These programs contain no CallFunc ops (no inter-function calls), so no patching is needed.
- descriptors reuse across bst.rs bench setups: The gas benchmark closure captures descriptors from the first micro_op_bst() call while using functions_gas from the second. This is safe because ObjectDescriptor is plain data with no arena pointers.
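The arena-lifetime note relies on how Rust treats shadowed bindings: a shadowed value is not dropped at the point of shadowing but at the end of its scope, in reverse declaration order. The small demonstration below records drop events to show this. The `Arena` type here is a hypothetical logging stand-in, unrelated to the real arena.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for an arena that logs when it is dropped.
struct Arena {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Arena {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Returns the ordered list of events observed around a shadowed binding.
fn shadowing_demo() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _arena = Arena { name: "original", log: Rc::clone(&log) };
        // Shadows the name, but the first Arena is still alive.
        let _arena = Arena { name: "instrumented", log: Rc::clone(&log) };
        log.borrow_mut().push("after shadowing".to_string());
        // Both arenas are dropped here, newest first.
    }
    let events = log.borrow().clone();
    events
}

fn main() {
    for event in shadowing_demo() {
        println!("{event}");
    }
}
```

Note that this holds for a named binding such as `_arena`; the bare `_` pattern does not bind at all, so a value assigned to `_` would be dropped immediately.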

Recommendations

Before Production Use

- Gate NoOpGasMeter behind #[cfg(any(test, feature = "testing"))] or move it to a dedicated test-helper module; this prevents accidental use in execution paths as the codebase grows.
- Add a debug_assert! to gas_instrument that all input functions have empty frame_layout (or document the precondition more explicitly), to catch future benchmark programs with heap pointer locals before they produce GC bugs.

Sent by Cursor Automation: Security Review Bot

cursor · 2026-04-16T14:43:53Z

 }

+/// A no-op gas meter for testing.
+pub struct NoOpGasMeter;


[LOW] NoOpGasMeter is exported as a top-level pub struct with no #[cfg(test)] or feature gate. Since InterpreterContext is generic over G: GasMeter, this type is silently usable in any execution context. The doc comment says "for testing" but there is no compile-time enforcement. Consider #[cfg(any(test, feature = "testing"))], consistent with the pattern already used in programs/src/lib.rs.

cursor · 2026-04-16T14:43:53Z

+        })
+        .collect();
+    (new_fns, arena)
+}


[LOW] frame_layout and safe_point_layouts are silently dropped (replaced with empty stubs). The doc comment explains this is safe for programs that don't trigger GC, but the function accepts any input without enforcing that precondition. If a future benchmark program has heap-pointer locals, the GC will silently miss those roots. A debug_assert! on func.frame_layout.heap_ptr_offsets being empty, or a note in the # Safety section, would make the precondition explicit.

github-actions · 2026-04-21T23:17:04Z

✅ Forge suite `compat` success on `ca049383dd80675149ef2d0042668964f9f9107a` ==> `54cf9d73b4701bd54beecd529abf8022309f80fc`

Compatibility test results for ca049383dd80675149ef2d0042668964f9f9107a ==> 54cf9d73b4701bd54beecd529abf8022309f80fc (PR)
1. Check liveness of validators at old version: ca049383dd80675149ef2d0042668964f9f9107a
compatibility::simple-validator-upgrade::liveness-check : committed: 14215.43 txn/s, latency: 2417.81 ms, (p50: 2400 ms, p70: 2700, p90: 3100 ms, p99: 3500 ms), latency samples: 473320
2. Upgrading first Validator to new version: 54cf9d73b4701bd54beecd529abf8022309f80fc
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 6235.94 txn/s, latency: 5423.42 ms, (p50: 5900 ms, p70: 6000, p90: 6100 ms, p99: 6200 ms), latency samples: 218740
3. Upgrading rest of first batch to new version: 54cf9d73b4701bd54beecd529abf8022309f80fc
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 6354.84 txn/s, latency: 5359.96 ms, (p50: 5900 ms, p70: 6000, p90: 6100 ms, p99: 6200 ms), latency samples: 220680
4. upgrading second batch to new version: 54cf9d73b4701bd54beecd529abf8022309f80fc
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 11053.36 txn/s, latency: 2928.14 ms, (p50: 3100 ms, p70: 3200, p90: 3400 ms, p99: 3500 ms), latency samples: 363700
5. check swarm health
Compatibility test for ca049383dd80675149ef2d0042668964f9f9107a ==> 54cf9d73b4701bd54beecd529abf8022309f80fc passed
Test Ok

github-actions · 2026-04-21T23:19:07Z

✅ Forge suite `realistic_env_max_load` success on `54cf9d73b4701bd54beecd529abf8022309f80fc`

Forge report malformed: Expecting property name enclosed in double quotes: line 2 column 1 (char 2)
'{\n[2026-04-21T23:19:03Z INFO  aptos_forge::report] Test Ok\n  "metrics": [\n    {\n      "test_name": "two traffics test: inner traffic",\n      "metric": "submitted_txn",\n      "value": 5933420.0\n    },\n    {\n      "test_name": "two traffics test: inner traffic",\n      "metric": "expired_txn",\n      "value": 0.0\n    },\n    {\n      "test_name": "two traffics test: inner traffic",\n      "metric": "avg_tps",\n      "value": 15886.23278806848\n    },\n    {\n      "test_name": "two traffics test: inner traffic",\n      "metric": "avg_latency",\n      "value": 1083.840425926363\n    },\n    {\n      "test_name": "two traffics test: inner traffic",\n      "metric": "p50_latency",\n      "value": 1000.0\n    },\n    {\n      "test_name": "two traffics test: inner traffic",\n      "metric": "p90_latency",\n      "value": 1200.0\n    },\n    {\n      "test_name": "two traffics test: inner traffic",\n      "metric": "p99_latency",\n      "value": 1600.0\n    },\n    {\n      "test_name": "two traffics test",\n      "metric": "submitted_txn",\n      "value": 42600.0\n    },\n    {\n      "test_name": "two traffics test",\n      "metric": "expired_txn",\n      "value": 0.0\n    },\n    {\n      "test_name": "two traffics test",\n      "metric": "avg_tps",\n      "value": 99.98399980554233\n    },\n    {\n      "test_name": "two traffics test",\n      "metric": "avg_latency",\n      "value": 836.9186046511628\n    },\n    {\n      "test_name": "two traffics test",\n      "metric": "p50_latency",\n      "value": 800.0\n    },\n    {\n      "test_name": "two traffics test",\n      "metric": "p90_latency",\n      "value": 1000.0\n    },\n    {\n      "test_name": "two traffics test",\n      "metric": "p99_latency",\n      "value": 1100.0\n    }\n  ],\n  "text": "two traffics test: inner traffic : committed: 15886.23 txn/s, latency: 1083.84 ms, (p50: 1000 ms, p70: 1100, p90: 1200 ms, p99: 1600 ms), latency samples: 5933420\\ntwo traffics test : committed: 99.98 txn/s, 
latency: 836.92 ms, (p50: 800 ms, p70: 900, p90: 1000 ms, p99: 1100 ms), latency samples: 1720\\nLatency breakdown for phase 0: [\\"MempoolToBlockCreation: max: 0.278, avg: 0.260\\", \\"ConsensusProposalToOrdered: max: 0.118, avg: 0.114\\", \\"ConsensusOrderedToCommit: max: 0.204, avg: 0.175\\", \\"ConsensusProposalToCommit: max: 0.315, avg: 0.289\\"]\\nMax non-epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.46s no progress at version 6012009 (avg 0.06s) [limit 15].\\nMax epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.33s no progress at version 2808488 (avg 0.33s) [limit 16].\\nTest Ok"\n}'
Trailing Log Lines:
[2026-04-21T23:18:58Z INFO  ureq::unit] sending request POST http://vmagent-victoria-metrics-agent.victoria-metrics.svc:8429/api/v1/import/prometheus
test CompositeNetworkTest ... ok
Test Statistics: 
two traffics test: inner traffic : committed: 15886.23 txn/s, latency: 1083.84 ms, (p50: 1000 ms, p70: 1100, p90: 1200 ms, p99: 1600 ms), latency samples: 5933420
two traffics test : committed: 99.98 txn/s, latency: 836.92 ms, (p50: 800 ms, p70: 900, p90: 1000 ms, p99: 1100 ms), latency samples: 1720
Latency breakdown for phase 0: ["MempoolToBlockCreation: max: 0.278, avg: 0.260", "ConsensusProposalToOrdered: max: 0.118, avg: 0.114", "ConsensusOrderedToCommit: max: 0.204, avg: 0.175", "ConsensusProposalToCommit: max: 0.315, avg: 0.289"]
Max non-epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.46s no progress at version 6012009 (avg 0.06s) [limit 15].
Max epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.33s no progress at version 2808488 (avg 0.33s) [limit 16].
Test Ok

=== BEGIN JUNIT ===
<?xml version="1.0" encoding="UTF-8"?>
<testsuites name="forge" tests="1" failures="0" errors="0" uuid="b27fea43-186b-4bd6-8533-d5acee2fb3d4">
    <testsuite name="local" tests="1" disabled="0" errors="0" failures="0">
        <testcase name="CompositeNetworkTest(network:multi-region-network-emulation(two traffics test)) with ">
        </testcase>
    </testsuite>
</testsuites>
=== END JUNIT ===
[2026-04-21T23:19:03Z INFO  aptos_forge::backend::k8s::cluster_helper] Deleting namespace forge-e2e-pr-19470: Some(NamespaceStatus { conditions: None, phase: Some("Terminating") })
[2026-04-21T23:19:03Z INFO  aptos_forge::backend::k8s::cluster_helper] aptos-node resources for Forge removed in namespace: forge-e2e-pr-19470
[2026-04-21T23:19:03Z INFO  ureq::unit] sending request POST http://vmagent-victoria-metrics-agent.victoria-metrics.svc:8429/api/v1/import/prometheus

test result: ok. 1 passed; 0 soft failed; 0 hard failed; 0 filtered out

Debugging output:
NAME                                         READY   STATUS      RESTARTS   AGE
aptos-node-0-validator-0                     1/1     Running     0          12m
aptos-node-1-validator-0                     1/1     Running     0          12m
aptos-node-2-validator-0                     1/1     Running     0          12m
aptos-node-3-validator-0                     1/1     Running     0          12m
aptos-node-4-validator-0                     1/1     Running     0          12m
aptos-node-5-validator-0                     1/1     Running     0          12m
aptos-node-6-validator-0                     1/1     Running     0          12m
forge-pfn-deployer-bsgx7                     0/1     Completed   0          13m
forge-testnet-deployer-fpbx8                 0/1     Completed   0          13m
genesis-aptos-genesis-eforge4af72268-pmhmb   0/1     Completed   0          12m
pfn-0-0                                      1/1     Running     0          12m
pfn-1-0                                      1/1     Running     0          12m
pfn-2-0                                      1/1     Running     0          12m

calintat marked this pull request as ready for review April 16, 2026 14:29

cursor Bot reviewed Apr 16, 2026

calintat requested review from georgemitenkov, vgao1996 and vineethk April 20, 2026 17:01

vgao1996 approved these changes Apr 20, 2026

georgemitenkov approved these changes Apr 21, 2026

[mono-move] Add gas benchmarks

54cf9d7

calintat enabled auto-merge (squash) April 21, 2026 16:43

calintat force-pushed the calin/gas-benchmarks branch from 2bd6612 to 54cf9d7 Compare April 21, 2026 16:43

calintat merged commit e48def4 into main Apr 21, 2026
74 of 76 checks passed

calintat deleted the calin/gas-benchmarks branch April 21, 2026 23:19
