
test(sae): Address legacypool race #5292

Merged
StephenButtolph merged 9 commits into master from alarso16/legacypool-race-tests
Apr 27, 2026


@alarso16 alarso16 commented Apr 16, 2026

Why this should be merged

There is a bug in the legacypool that isn't worth fixing. Specifically, at the end of block execution, an event is sent to the TxPool to indicate a reorg. This is expected to update nonces so that transactions already in the mempool are either evicted or promoted based on the new state. However, the pool's internal nonce tracking is updated after the promotions, so if the pool already holds a pending tx for some account that is still executable, a dependent queued transaction is incorrectly treated as nonce-gapped and stays queued. If the pending tx is later repriced or executed (and possibly in some other cases), the queued transaction can then be moved to pending.
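To make the ordering problem concrete, here is a toy single-account model of it. This is only a sketch of the behaviour described above, not libevm or go-ethereum code: `toyPool`, `promote`, `refresh`, and `onNewHead` are hypothetical names, and the real pool tracks far more state.

```go
package main

import "sort"

// toyPool is a hypothetical single-account model; it is not the legacypool.
type toyPool struct {
	tracked uint64   // next nonce the pool believes is executable
	pending []uint64 // nonces of executable transactions
	queued  []uint64 // nonces of transactions waiting for a gap to close
}

// promote moves queued nonces to pending while they continue from tracked.
func (p *toyPool) promote() {
	sort.Slice(p.queued, func(i, j int) bool { return p.queued[i] < p.queued[j] })
	for len(p.queued) > 0 && p.queued[0] == p.tracked {
		p.pending = append(p.pending, p.queued[0])
		p.queued = p.queued[1:]
		p.tracked++
	}
}

// refresh advances tracked past every nonce already in pending.
func (p *toyPool) refresh() {
	for _, n := range p.pending {
		if n >= p.tracked {
			p.tracked = n + 1
		}
	}
}

// onNewHead resets tracked to the chain's nonce, then runs the two steps in
// the order selected by promoteFirst.
func (p *toyPool) onNewHead(stateNonce uint64, promoteFirst bool) {
	p.tracked = stateNonce
	if promoteFirst {
		// Promotion sees the freshly reset nonce, which still points at the
		// pending transaction, so the dependent queued nonce looks gapped.
		p.promote()
		p.refresh()
		return
	}
	p.refresh()
	p.promote()
}
```

In this model, starting from `pending = [5]`, `queued = [6]`, and a state nonce of 5, calling `onNewHead(5, true)` leaves nonce 6 queued, while `onNewHead(5, false)` promotes it.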

This is a rather complicated problem that still exists upstream in go-ethereum. There is no simple way to inject a couple of lines in libevm to address it, so we should probably just deal with it here. Most importantly, it doesn't significantly affect any production processes, since the transaction will just be included in a later block. It only affects testing, because tests want to rely on transactions being added to a particular block.

How this works

This race requires a few conditions:

  1. A block is currently executing (i.e. the NewHead event could come anytime)
  2. A transaction for an address is marked as pending
  3. A transaction for the same address is marked as queued, but a reorg is scheduled

These conditions are pretty specific, so there are several general ways to avoid the race. I only know of two flakes of this kind, and both are fixed in this PR. The first flake, TestWorstCase, relies on execution in the background, so we just have to mark it as flaky. The second flake is easily avoided by not issuing transactions while a block is executing.
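For illustration, one way a test could enforce "don't issue transactions while a block is executing" is a small gate around the issuing code. This is a generic sketch and not the helper used in this PR: `execGate`, `begin`, `end`, and `whenIdle` are hypothetical names, and it assumes the test can observe when execution starts and finishes.

```go
package main

import "sync"

// execGate is a hypothetical helper, not part of the SAE test harness. It
// counts in-flight block executions and runs callbacks only when there are none.
type execGate struct {
	mu        sync.Mutex
	idle      *sync.Cond
	executing int
}

func newExecGate() *execGate {
	g := &execGate{}
	g.idle = sync.NewCond(&g.mu)
	return g
}

// begin records that a block has started executing.
func (g *execGate) begin() {
	g.mu.Lock()
	g.executing++
	g.mu.Unlock()
}

// end records that a block has finished executing and wakes any waiters.
func (g *execGate) end() {
	g.mu.Lock()
	g.executing--
	g.mu.Unlock()
	g.idle.Broadcast()
}

// whenIdle blocks until no block is executing, then calls issue while still
// holding the lock, so that begin cannot interleave with issue.
func (g *execGate) whenIdle(issue func()) {
	g.mu.Lock()
	defer g.mu.Unlock()
	for g.executing > 0 {
		g.idle.Wait()
	}
	issue()
}
```

Holding the lock while `issue` runs is a deliberate choice in this sketch: it closes the window in which a new execution could start between the idle check and the transaction being issued.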

I manually reviewed all tests in sae/saevm. Wherever a block expected multiple transactions from a single account, I ensured that the test is either one of those listed above or that the block is the first one created (so the race is avoided).

How this was tested

Lots and lots of debugging. If the flakes come back, I'm happy to investigate further.

Need to be documented in RELEASES.md?

No

@alarso16 alarso16 force-pushed the alarso16/legacypool-race-tests branch from 2d03433 to 194ad23 Compare April 16, 2026 14:50
@alarso16 alarso16 self-assigned this Apr 16, 2026
@alarso16 alarso16 added labels Apr 16, 2026: testing (This primarily focuses on testing), evm (Related to EVM functionality), flaky-test (Test failing occasionally)
@alarso16 alarso16 removed this from avalanchego Apr 16, 2026
Comment thread vms/saevm/sae/worstcase_test.go
@alarso16 alarso16 marked this pull request as ready for review April 16, 2026 15:41
Copilot AI review requested due to automatic review settings April 16, 2026 15:41

Copilot AI left a comment


Pull request overview

This PR reduces flakes in SAE VM tests caused by a known race in libevm’s legacy txpool nonce tracking during concurrent block execution/reorg signaling.

Changes:

  • Mark TestWorstCase as explicitly flaky (opt-in via env var) and move it into a Bazel flaky-only test target.
  • Ensure receipt-related tests wait for block execution before asserting/using execution-dependent state.
  • Document the “pending tx” race caveat on helper methods that wait for mempool pending state.
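As a rough sketch of what opt-in gating via an environment variable can look like in a Go test: the variable name `SAEVM_TEST_FLAKY` comes from this PR, but the helper names `flakyEnabled` and `skipUnlessFlaky` are hypothetical, and treating any non-empty value as "enabled" is an assumption rather than the PR's actual rule.

```go
package main

import (
	"os"
	"testing"
)

// flakyEnvVar is the opt-in variable named in this PR.
const flakyEnvVar = "SAEVM_TEST_FLAKY"

// flakyEnabled reports whether flaky tests are opted in. Assumption: any
// non-empty value enables them. getenv is injected so the check can be
// exercised without touching the process environment.
func flakyEnabled(getenv func(string) string) bool {
	return getenv(flakyEnvVar) != ""
}

// skipUnlessFlaky (hypothetical) skips the calling test unless opted in.
func skipUnlessFlaky(t *testing.T) {
	t.Helper()
	if !flakyEnabled(os.Getenv) {
		t.Skipf("flaky test; set %s to run it", flakyEnvVar)
	}
}
```

A flaky test would call `skipUnlessFlaky(t)` as its first statement, so a plain `go test` run skips it and only an explicit opt-in runs it.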

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| vms/saevm/sae/worstcase_test.go | Makes TestWorstCase opt-in flaky and removes CLI-flag plumbing in favor of fixed parameters. |
| vms/saevm/sae/vm_test.go | Removes flag usage from TestMain; adds warnings about pending-state wait helpers under concurrent execution. |
| vms/saevm/sae/rpc_test.go | Waits for blocks to finish executing before continuing receipt-based assertions. |
| vms/saevm/sae/accept_block_test.go | Unifies flaky gating under SAEVM_TEST_FLAKY. |
| vms/saevm/sae/BUILD.bazel | Moves flaky tests into a dedicated flaky_test Bazel target and excludes them from the main test target. |


Comment thread vms/saevm/sae/rpc_test.go Outdated
@alarso16 alarso16 force-pushed the alarso16/legacypool-race-tests branch 3 times, most recently from 8e69e95 to 8b20881 Compare April 16, 2026 16:28
@alarso16 alarso16 force-pushed the alarso16/legacypool-race-tests branch from 8b20881 to fb6db1f Compare April 16, 2026 16:29

@JonathanOppenheimer JonathanOppenheimer left a comment


This looks pretty good to me. You clearly did your research.

Comment thread vms/saevm/sae/rpc_test.go
Comment thread vms/saevm/sae/worstcase_test.go
Comment thread vms/saevm/sae/BUILD.bazel Outdated

@alarso16 alarso16 left a comment


Symbolic approval of Stephen's edits


@JonathanOppenheimer JonathanOppenheimer left a comment


Restoring the fuzz testing is good IMO.

@StephenButtolph StephenButtolph added this pull request to the merge queue Apr 27, 2026
Merged via the queue into master with commit d005d8c Apr 27, 2026
60 checks passed
@StephenButtolph StephenButtolph deleted the alarso16/legacypool-race-tests branch April 27, 2026 22:16