[regression] Error with adaptive predicate pushdown: "Invalid offset in sparse column chunk data: 754, no matching page found." #9239

@erratic-pattern

Describe the bug

Error:

Parquet error: Invalid offset in sparse column chunk data: 754, no matching page found.
If you are using a `SelectionStrategyPolicy::Mask`, ensure that the OffsetIndex is provided when creating the InMemoryRowGroup.

Occurs when:

  1. A predicate uses RowSelectionStrategy::Selectors with a RowSelector list that skips an entire page.
  2. Another predicate uses RowSelectionStrategy::Mask because its selection triggers the mask run-length threshold.
  3. The column read with RowSelectionStrategy::Mask is not in the output projection, so should_force_selectors does not force it to use RowSelectionStrategy::Selectors.
  4. The mask strategy then attempts to fetch the pages that were skipped, resulting in the error above.
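
To make the sequence above concrete, here is a std-only toy model of it. It is not the parquet crate's code: `Run`, `pages_to_fetch`, and `mask_style_read` are hypothetical names, pages are assumed to hold a fixed number of rows, and the mask-style reader is assumed to walk every page between the first and last selected row and filter afterwards.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for one selector run: `rows` rows that are
/// either selected or skipped.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Run {
    pub rows: usize,
    pub skip: bool,
}

/// Indices of the fixed-size pages that contain at least one selected row.
/// Pages covered only by skipped rows are never requested.
pub fn pages_to_fetch(runs: &[Run], rows_per_page: usize) -> Vec<usize> {
    let mut pages = Vec::new();
    let mut row = 0;
    for run in runs {
        if !run.skip && run.rows > 0 {
            let first = row / rows_per_page;
            let last = (row + run.rows - 1) / rows_per_page;
            for p in first..=last {
                // Runs are visited in row order, so checking the tail is
                // enough to avoid pushing the same page twice.
                if pages.last() != Some(&p) {
                    pages.push(p);
                }
            }
        }
        row += run.rows;
    }
    pages
}

/// Assumed mask-style read: visit every page from the first selected row
/// to the last selected row. Returns the number of pages visited, or an
/// error if one of them was never fetched.
pub fn mask_style_read(
    fetched: &BTreeSet<usize>,
    runs: &[Run],
    rows_per_page: usize,
) -> Result<usize, String> {
    let mut row = 0;
    let mut first_selected: Option<usize> = None;
    let mut end_selected = 0; // exclusive end of the last selected run
    for run in runs {
        if !run.skip && run.rows > 0 {
            if first_selected.is_none() {
                first_selected = Some(row);
            }
            end_selected = row + run.rows;
        }
        row += run.rows;
    }
    let Some(start) = first_selected else {
        return Ok(0); // nothing selected, nothing to read
    };
    let mut visited = 0;
    for page in (start / rows_per_page)..=((end_selected - 1) / rows_per_page) {
        if !fetched.contains(&page) {
            return Err(format!("no matching page found for page index {page}"));
        }
        visited += 1;
    }
    Ok(visited)
}
```

With runs of [select 100, skip 100, select 100] and an assumed 50 rows per page, `pages_to_fetch` leaves out the two pages that lie entirely inside the skipped range, and `mask_style_read` over that sparse set fails on the first of them. The same call succeeds when all six pages are present.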

To Reproduce

A minimal reproducer is available at: https://github.com/erratic-pattern/parquet_mask_strategy_missing_pages

git clone https://github.com/erratic-pattern/parquet_mask_strategy_missing_pages
cd parquet_mask_strategy_missing_pages
cargo test

The test uses a Parquet file with:

  • 2 row groups, 300 rows each
  • Tag column with values 'a', 'b', 'c' sorted (100 rows each)
  • Time column with alternating in-range/out-of-range values
  • Page size set so tag='b' section contains at least one full page

The test simulates a query like SELECT tag WHERE tag IN ('a', 'c') AND time >= X AND time < Y with three predicates:

  1. tag IN ('a', 'c') - creates initial selection [select 100, skip 100, select 100]
  2. time >= X - creates sparse selection, pages fetched as Sparse
  3. time < Y - triggers Mask strategy due to sparse selection from predicate 2
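
As a rough illustration of why predicate 1 stays on selectors while the selection after predicate 2 tips over to the mask strategy, the sketch below rebuilds the per-row selection for one 300-row row group and compares average run lengths. This is a std-only model, not the reproducer's code. The function names, the rule "prefer a mask when the average run length is below a threshold", the threshold value used in the usage note, and the choice that even rows are the in-range ones are all assumptions.

```rust
/// Selection after predicate 1 only: tag IN ('a', 'c') over sorted tags,
/// 100 rows each of 'a', 'b', 'c'.
pub fn selection_after_tag_predicate() -> Vec<bool> {
    (0..300).map(|row| row / 100 != 1).collect()
}

/// Selection after predicates 1 and 2. Assumption: even rows hold the
/// in-range time values, odd rows the out-of-range ones.
pub fn selection_after_time_predicate() -> Vec<bool> {
    selection_after_tag_predicate()
        .into_iter()
        .enumerate()
        .map(|(row, tag_ok)| tag_ok && row % 2 == 0)
        .collect()
}

/// Collapse a per-row selection into (selected, length) runs.
pub fn to_runs(selected: &[bool]) -> Vec<(bool, usize)> {
    let mut runs: Vec<(bool, usize)> = Vec::new();
    for &s in selected {
        if let Some(last) = runs.last_mut() {
            if last.0 == s {
                last.1 += 1;
                continue;
            }
        }
        runs.push((s, 1));
    }
    runs
}

/// Hypothetical heuristic: prefer a mask when the average run length
/// (total rows divided by number of runs) is below `threshold`.
pub fn prefers_mask(runs: &[(bool, usize)], threshold: usize) -> bool {
    if runs.is_empty() {
        return false;
    }
    let total_rows: usize = runs.iter().map(|r| r.1).sum();
    total_rows / runs.len() < threshold
}
```

Under these assumptions the first selection collapses to three runs of 100 rows, so a threshold of, say, 32 keeps it on selectors. The second selection breaks into 200 runs, mostly of length 1, with one 101-row skip spanning the tag='b' section, so the same threshold picks the mask.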

Additional context
