Fix FileNotFoundException due to race in NativeFileSorter #5921

Open
subhramit wants to merge 1 commit into spotify:main from
subhramit:fix-nativefilesorter-race

subhramit commented Apr 8, 2026

Closes #5765

The bug appears to be a race caused by a lazy iterator escaping its DoFn's lifecycle.
The flow is:

  1. SortValues internally writes spilled data to temp files.
  2. Beam's SortValuesDoFn deletes those temp files when it finishes processing the element (end of @ProcessElement).
  3. The KvToTuple map returns a lazy Iterable[(K2, V)] that wraps the DecodingIterable.
  4. Downstream, List.prependedAll calls .iterator on the lazy wrapper after the DoFn has already cleaned up the temp files, causing the FileNotFoundException:

List.prependedAll          <- forces the iterator here (lazy consumer)
  -> anon Iterable.iterator
    -> NativeFileSorter.mergeSortedFiles  <- file already deleted
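
The failure mode can be reproduced in isolation, without Beam or Scio. The following is a minimal sketch, not the actual code path: `LazyEscapeDemo` and `processElement` are hypothetical stand-ins for the DoFn, and a plain temp file plays the role of the sorter's spill file.

```scala
import java.io.{BufferedReader, File, FileNotFoundException, FileReader, PrintWriter}
import java.nio.file.Files

object LazyEscapeDemo {
  // Reads every line of the file. FileReader throws FileNotFoundException
  // if the file no longer exists.
  private def readLines(file: File): Iterator[String] = {
    val in = new BufferedReader(new FileReader(file))
    try Iterator.continually(in.readLine()).takeWhile(_ != null).toVector.iterator
    finally in.close()
  }

  // Hypothetical stand-in for a DoFn's processElement: spills sorted values
  // to a temp file, returns a lazy view over it, then deletes the file.
  def processElement(values: Seq[String]): Iterable[String] = {
    val spill = Files.createTempFile("spill", ".txt").toFile
    val out = new PrintWriter(spill)
    try values.sorted.foreach(v => out.println(v)) finally out.close()

    // Nothing is read here: the file is opened only when .iterator is called.
    val lazyView = new Iterable[String] {
      override def iterator: Iterator[String] = readLines(spill)
    }

    spill.delete() // cleanup at the end of "processElement"
    lazyView       // the lazy view escapes with a dangling file reference
  }

  def main(args: Array[String]): Unit = {
    val escaped = processElement(Seq("b", "a", "c"))
    try {
      escaped.toList // first real read happens here, after cleanup
    } catch {
      case e: FileNotFoundException => println(s"file already deleted: ${e.getMessage}")
    }
  }
}
```

The call that returns the Iterable succeeds; the exception surfaces only at the downstream consumer, which matches the shape of the trace above.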

The fix is simply to force full consumption of the DecodingIterable while the DoFn is still alive and the temp files still exist.

Now by the time .map returns a (K1, Iterable[(K2, V)]), the Iterable is a plain Scala Vector with no references to Beam internals or temp files.
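
The idea can be sketched outside Beam as follows. This is an illustration of eager materialization under assumed names (`EagerMaterializeDemo` and `processElement` are hypothetical), not the diff in this PR: the lazy view is copied into a Vector before the backing file is removed.

```scala
import java.io.{BufferedReader, File, FileReader, PrintWriter}
import java.nio.file.Files

object EagerMaterializeDemo {
  // Reads every line of the file. FileReader throws FileNotFoundException
  // if the file no longer exists.
  private def readLines(file: File): Iterator[String] = {
    val in = new BufferedReader(new FileReader(file))
    try Iterator.continually(in.readLine()).takeWhile(_ != null).toVector.iterator
    finally in.close()
  }

  // Hypothetical stand-in for a DoFn's processElement: spills sorted values
  // to a temp file, then copies them into memory before deleting the file.
  def processElement(values: Seq[String]): Iterable[String] = {
    val spill = Files.createTempFile("spill", ".txt").toFile
    val out = new PrintWriter(spill)
    try values.sorted.foreach(v => out.println(v)) finally out.close()

    val lazyView = new Iterable[String] {
      override def iterator: Iterator[String] = readLines(spill)
    }

    // Force full consumption while the file still exists, then clean up.
    try lazyView.toVector
    finally spill.delete()
  }

  def main(args: Array[String]): Unit = {
    val safe = processElement(Seq("b", "a", "c"))
    println(safe.toList) // safe to iterate any number of times
  }
}
```

The trade-off is memory: the values for one key are held in memory at once instead of being streamed from disk.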

Signed-off-by: subhramit <subhramit.bb@live.in>
@subhramit subhramit changed the title Fix FileNotFoundException due to race in NativeFileSorter Fix FileNotFoundException due to race in NativeFileSorter Apr 8, 2026