Parquet RleDecoder::get_batch_with_dict panics on oob dictionary indices #9434

@jhorstmann

Describe the bug

The RleDecoder::get_batch_with_dict function panics when it encounters dictionary indices that are out of bounds.

I created two sample files that trigger this, one with an RLE-encoded dictionary key and one with a bit-packed key:

oob_bitpacked_value.zip
oob_rle_value.zip

To Reproduce

$ parquet-read oob_bitpacked_value.parquet

thread 'main' (228557) panicked at parquet/src/encodings/rle.rs:500:58:
index out of bounds: the len is 1 but the index is 18446744073709551487
$ parquet-read oob_rle_value.parquet

thread 'main' (228807) panicked at parquet/src/encodings/rle.rs:468:34:
index out of bounds: the len is 1 but the index is 4294967167

Expected behavior

Reading these invalid files should return a Result::Err instead of panicking.
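
To illustrate the expected behavior, here is a minimal, self-contained sketch of a bounds-checked dictionary lookup. It is not the code from parquet/src/encodings/rle.rs: the function name lookup_with_dict and the DecodeError type are hypothetical and used only for illustration (the parquet crate has its own error type). The idea is simply to replace direct indexing, which panics on an out-of-bounds index, with slice::get, which returns None and can be turned into an error.

```rust
/// Hypothetical error type for this sketch only.
#[derive(Debug, PartialEq)]
struct DecodeError(String);

/// Hypothetical helper: copies `dict[idx]` into `out` for every decoded index,
/// returning an error for any index outside the dictionary instead of panicking.
fn lookup_with_dict<T: Copy>(
    dict: &[T],
    indices: &[usize],
    out: &mut Vec<T>,
) -> Result<usize, DecodeError> {
    for &idx in indices {
        // `dict[idx]` would panic here; `get` returns `None` when out of bounds.
        let value = dict.get(idx).ok_or_else(|| {
            DecodeError(format!(
                "dictionary index {} out of bounds for dictionary of length {}",
                idx,
                dict.len()
            ))
        })?;
        out.push(*value);
    }
    // Number of values written, assuming one output value per index.
    Ok(indices.len())
}
```

With a one-entry dictionary, an index such as 4294967167 (the value from the second panic message above) would yield an Err from this helper rather than aborting the process.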

Additional context

A fix is already in progress in #9365; these files could be added there as unit tests.
