Skip to content

Guard auto-readahead against exhausted index iter (#14631)#14631

Closed
archang19 wants to merge 1 commit intofacebook:mainfrom
archang19:export-D101217429
Closed

Guard auto-readahead against exhausted index iter (#14631)#14631
archang19 wants to merge 1 commit intofacebook:mainfrom
archang19:export-D101217429

Conversation

@archang19
Copy link
Copy Markdown
Contributor

@archang19 archang19 commented Apr 17, 2026

Summary:

Observed crash from ZippyDB iterate scans:

onFatalError: unexpected error : Invariant SignalFatal:
fbcode/zippydb/server/Main.cpp:handleSigToFatalCommon:281:FATAL false
failed: Signal 11 (SIGSEGV) at address 0x0 ... method: MultiIterate ...
RocksDB Activity: iterate

Key stack frames:
rocksdb::BlockBasedTableIterator::IsNextBlockOutOfReadaheadBound()
fbcode/rocksdb/src/table/block_based/block_based_table_iterator.h:471
rocksdb::BlockBasedTableIterator::BlockCacheLookupForReadAheadSize()
fbcode/rocksdb/src/table/block_based/block_based_table_iterator.cc:878
rocksdb::FinalizeAsyncRead()
rocksdb::FilePrefetchBuffer::PollIfNeeded()
rocksdb::BlockFetcher::ReadBlockContents()
rocksdb::BlockBasedTableIterator::InitDataBlock()
rocksdb::DBIter::Next()

Reviewed By: xingbowang

Differential Revision: D101217429

@meta-cla meta-cla Bot added the CLA Signed label Apr 17, 2026
@meta-codesync
Copy link
Copy Markdown

meta-codesync Bot commented Apr 17, 2026

@archang19 has exported this pull request. If you are a Meta employee, you can view the originating Diff in D101217429.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 17, 2026

✅ clang-tidy: No findings on changed lines

Completed in 169.8s.

Summary:

Observed crash from ZippyDB iterate scans:

```
onFatalError: unexpected error : Invariant SignalFatal:
fbcode/zippydb/server/Main.cpp:handleSigToFatalCommon:281:FATAL false
failed: Signal 11 (SIGSEGV) at address 0x0 ... method: MultiIterate ...
RocksDB Activity: iterate
```

Key stack frames:
  rocksdb::BlockBasedTableIterator::IsNextBlockOutOfReadaheadBound()
    fbcode/rocksdb/src/table/block_based/block_based_table_iterator.h:471
  rocksdb::BlockBasedTableIterator::BlockCacheLookupForReadAheadSize()
    fbcode/rocksdb/src/table/block_based/block_based_table_iterator.cc:878
  rocksdb::FinalizeAsyncRead()
  rocksdb::FilePrefetchBuffer::PollIfNeeded()
  rocksdb::BlockFetcher::ReadBlockContents()
  rocksdb::BlockBasedTableIterator::InitDataBlock()
  rocksdb::DBIter::Next()

Reviewed By: xingbowang

Differential Revision: D101217429
@meta-codesync meta-codesync Bot changed the title Guard auto-readahead against exhausted index iter Guard auto-readahead against exhausted index iter (#14631) Apr 17, 2026
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 17, 2026

✅ Claude Code Review

Auto-triggered after CI passed — reviewing commit ee54c15


Code Review: Guard auto-readahead against exhausted index iter (#14631)

Summary

This PR fixes a production SIGSEGV crash in BlockBasedTableIterator::IsNextBlockOutOfReadaheadBound() caused by dereferencing an exhausted index_iter_. The fix adds a validity guard in BlockCacheLookupForReadAheadSize() and a debug assert in IsNextBlockOutOfReadaheadBound().

Verdict: APPROVE — The fix is correct, minimal, and addresses a real production crash with negligible performance impact.


Root Cause Analysis

The crash occurs through this chain:

  1. SeekImpl() enables readahead_cache_lookup_ and binds BlockCacheLookupForReadAheadSize as a callback (block_based_table_iterator.cc:447-450).
  2. During InitDataBlock(), BlockPrefetcher::PrefetchIfNeeded() triggers FilePrefetchBuffer::ReadAheadSizeTuning() which invokes the callback.
  3. Inside the callback, the while loop at line 897 advances index_iter_ via Next() (line 954), potentially exhausting it.
  4. On a subsequent callback invocation, line 878 calls IsNextBlockOutOfReadaheadBound() without checking index_iter_->Valid().
  5. IsNextBlockOutOfReadaheadBound() dereferences index_iter_->user_key() on an invalid iterator → SIGSEGV.

Findings

Finding 1: Fix is Correct and Sufficient — POSITIVE

The guard at line 876 prevents the crash at the only vulnerable call site. The early return with unmodified offsets is correct: ReadAheadSizeTuning() proceeds with the original calculated readahead range. FindBlockForward() already guards at line 713 before subsequent index_iter_ access.

Finding 2: Consider UNLIKELY Hint — SUGGESTION

The condition !index_iter_->Valid() is rare (<1% of invocations). if (UNLIKELY(!index_iter_->Valid())) return; would be consistent with RocksDB patterns.

Finding 3: Silent Handling of Error-State Iterators — LOW

The guard treats exhaustion and error states identically. This is safe because: errors are already handled at lines 962-964 in debug, and BlockBasedTableIterator::status() propagates errors to callers.

Finding 4: Test Coverage is Adequate — LOW

The new test reproduces the crash scenario correctly. Optional improvements: add prefix-based and multi-SST variants.

Finding 5: No API/Serialization/Compatibility Impact — NONE


Debate Resolution

Several initially-raised concerns were disproven:

  • "InitDataBlock() second crash": data_block_handle is captured at line 420 BEFORE the callback runs at line 459.
  • "Offsets past EOF": ReadAheadSizeTuning() initializes offsets from validated values; the callback only narrows them.
  • "Async state corruption": Callback is invoked synchronously; no partial async state.
  • "Missing ResetPreviousBlockOffset()": Harmless when iterator is exhausted.

Full review written to review-findings.md.


ℹ️ About this response

Generated by Claude Code.
Review methodology: claude_md/code_review.md

Limitations:

  • Claude may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /claude-review [context] — Request a code review
  • /claude-query <question> — Ask about the PR or codebase

@meta-codesync meta-codesync Bot closed this in 3ef8b1b Apr 17, 2026
@meta-codesync
Copy link
Copy Markdown

meta-codesync Bot commented Apr 17, 2026

This pull request has been merged in 3ef8b1b.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant