Fix problematic transaction handling by brice-stacks · Pull Request #7183 · stacks-network/stacks-core

brice-stacks · 2026-05-04T15:51:30Z

This branch does two things:

When a transaction hits the execution time limit, mark it as problematic
When a transaction is problematic, revert the cost accumulation from that transaction

coveralls · 2026-05-04T16:44:19Z

Coverage Report for CI Build 25801484257

Coverage decreased (-0.06%) to 85.648%

Details

Coverage decreased (-0.06%) from the base build.
Patch coverage: 20 uncovered changes across 4 files (110 of 130 lines covered, 84.62%).
5464 coverage regressions across 98 files.

Uncovered Changes

File	Changed	Covered	%
stackslib/src/chainstate/stacks/miner.rs	14	7	50.0%
stackslib/src/net/api/postblock_proposal.rs	14	7	50.0%
clarity/src/vm/clarity.rs	5	1	20.0%
stackslib/src/chainstate/stacks/mod.rs	3	1	33.33%

Coverage Regressions

5464 previously-covered lines in 98 files lost coverage.

Top 10 Files by Coverage Loss	Lines Losing Coverage	Coverage
stackslib/src/config/mod.rs	358	69.09%
stackslib/src/chainstate/stacks/index/storage.rs	262	80.55%
stackslib/src/chainstate/stacks/miner.rs	238	83.06%
stackslib/src/net/inv/epoch2x.rs	221	79.44%
stackslib/src/chainstate/stacks/db/transactions.rs	203	97.16%
stackslib/src/net/chat.rs	201	93.01%
stackslib/src/chainstate/stacks/db/mod.rs	196	86.23%
stackslib/src/chainstate/stacks/index/trie_sql.rs	190	69.7%
stackslib/src/core/mempool.rs	170	86.87%
stackslib/src/chainstate/stacks/index/node.rs	161	87.28%

Coverage Stats


Relevant Lines:	219809
Covered Lines:	188263
Line Coverage:	85.65%
Coverage Strength:	18940648.33 hits per line

💛 - Coveralls

benjamin-stacks

Based on my reading of the code, I think there's something problematic with this change (pun not intended but taken anyway), unless I've missed something.

When the miner selects from the mempool, it applies a 30s execution time budget per transaction:

stacks-core/stackslib/src/chainstate/stacks/miner.rs

Line 2689 in 53ffba0

settings.max_execution_time,

When the signer's node validates a block proposal, it applies a 60s execution time budget for the whole block:

stacks-core/stackslib/src/net/api/postblock_proposal.rs

Line 725 in 53ffba0

let block_deadline = Instant::now() + Duration::from_secs(timeout_secs);

Three scenarios:

1) The miner is using its default setting of max 5 seconds per mempool sweep. In that case, it seems your budget-reset change seems to be no-op, because if we've blown past 30 seconds, we've certainly blown past 5.

2) The miner has a high enough max_miner_time_ms for the budget reset to matter, but it's below the signer's block time budget. In this case, I think we're fine (although a miner second might not be a signer second, so the line is blurry).

3) The max_miner_time_ms is even higher. Let's say it's 70 seconds. In that case, the miner might create a block with these transactions:

Tx 1: 29 seconds
Tx 2: 29 seconds
Tx 3: 3 seconds

It would then propose the block to the signers, which would identify Tx 3 as problematic and, via #6964, the miner will drop Tx 3 even though it's the only nice transaction in the bunch.

If I'm right, we might still say that that's fine, I just want us to be explicit about it. If I'm wrong, at least I'll learn something 😅

benjamin-stacks · 2026-05-05T11:28:58Z

 pub fn convert_clarity_error_to_transaction_result(
    clarity_tx: &mut ClarityTx,
    tx: &StacksTransaction,
+    cost_before: &ExecutionCost,
    error: Error,


I really don't like the fact that a function called convert_x_to_y modifies global(-ish) execution state (namely the clarity_tx). Granted, you didn't add this problem, it already existed before your change.

What you did add though was a second way to pass in the cost_before to which to reset this state. In case of CostOverflowError, that information is part of the error; in your case, it's passed in as an argument to the function.

Neither of the two are particularly pretty (although I like yours a little better, under the premise that we rename the function to make clearer what it does), but we really shouldn't have both at the same time.

Hmm, yeah this is a good point. Let me see if I can come up with a better refactoring.

I renamed it to finalize_failed_transaction in ffc2334

brice-stacks · 2026-05-05T14:57:31Z

The miner is using its default setting of max 5 seconds per mempool sweep. In that case, it seems your budget-reset change seems to be no-op, because if we've blown past 30 seconds, we've certainly blown past 5.

The 5 second mempool sweep is only checked in between transactions, while the 30 second limit is checked during execution, so it can interrupt the processing of a single transaction.

The miner has a high enough max_miner_time_ms for the budget reset to matter, but it's below the signer's block time budget. In this case, I think we're fine (although a miner second might not be a signer second, so the line is blurry).

This is a kind of implicit dance between the miner and signers. The wall-clock time limit is not a part of consensus, though it should be approximately enforced by the cost limits. A miner could potentially mine a valid block that takes 10 minutes to process if there was an issue in the cost tracking, but the signers would reject it because of their own wall-clock time limits. I think this is reasonable.

The max_miner_time_ms is even higher. Let's say it's 70 seconds. In that case, the miner might create a block with these transactions:

Tx 1: 29 seconds

Tx 2: 29 seconds

Tx 3: 3 seconds

It would then propose the block to the signers, which would identify Tx 3 as problematic and, via #6964, the miner will drop Tx 3 even though it's the only nice transaction in the bunch.

A transaction should only be marked as problematic if the time for that individual transaction exceeds the limit, not the total time spent on the block. The path for feedback from the signer might be different. I'll take a closer look at that again now.

brice-stacks · 2026-05-05T15:20:09Z

Yeah, so you're right. The signer will mark the transaction as problematic if it just happened to be the one executing while the deadline was reached. I'm thinking we should change that signer behavior, so that a transaction is only marked problematic if it alone exceeds a timeout.

brice-stacks · 2026-05-05T16:54:56Z

miner_rejection_by_contract_publish_execution_time_expired is a legit failure, not flakiness. I'll update this test as needed.

brice-stacks · 2026-05-05T17:56:51Z

I changed the signer behavior (and fixed the integration test) in f928a96.

brice-stacks added 2 commits May 4, 2026 11:48

fix: transactions that timeout are marked as problematic

29dc462

fix: revert costs for problematic transactions

b8fd439

brice-stacks requested review from aaronb-stacks and benjamin-stacks May 4, 2026 15:51

benjamin-stacks reviewed May 5, 2026

View reviewed changes

feat: signer marks tx exceeding timeout as problematic

f928a96

refactor: clarify finalize_failed_transaction

ffc2334

brice-stacks requested a review from benjamin-stacks May 5, 2026 18:37

Merge branch 'develop' into fix/problematic-txs

420cb59

aaronb-stacks approved these changes May 13, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix problematic transaction handling#7183

Fix problematic transaction handling#7183
brice-stacks wants to merge 5 commits into
stacks-network:developfrom
brice-stacks:fix/problematic-txs

brice-stacks commented May 4, 2026

Uh oh!

coveralls commented May 4, 2026 •

edited

Loading

Uh oh!

benjamin-stacks left a comment •

edited

Loading

Uh oh!

benjamin-stacks May 5, 2026

Uh oh!

brice-stacks May 5, 2026

Uh oh!

brice-stacks May 5, 2026

Uh oh!

brice-stacks commented May 5, 2026

Uh oh!

brice-stacks commented May 5, 2026

Uh oh!

brice-stacks commented May 5, 2026

Uh oh!

brice-stacks commented May 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

brice-stacks commented May 4, 2026

Uh oh!

coveralls commented May 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Coverage Report for CI Build 25801484257

Coverage decreased (-0.06%) to 85.648%

Details

Uncovered Changes

Coverage Regressions

Coverage Stats

💛 - Coveralls

Uh oh!

benjamin-stacks left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

benjamin-stacks May 5, 2026

Choose a reason for hiding this comment

Uh oh!

brice-stacks May 5, 2026

Choose a reason for hiding this comment

Uh oh!

brice-stacks May 5, 2026

Choose a reason for hiding this comment

Uh oh!

brice-stacks commented May 5, 2026

Uh oh!

brice-stacks commented May 5, 2026

Uh oh!

brice-stacks commented May 5, 2026

Uh oh!

brice-stacks commented May 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

coveralls commented May 4, 2026 •

edited

Loading

benjamin-stacks left a comment •

edited

Loading