Conversation

@sdf-jkl
Contributor

@sdf-jkl sdf-jkl commented Jan 9, 2026

Which issue does this PR close?

Rationale for this change

Check issue.

What changes are included in this PR?

  • Made next_mask_chunk page aware by adding page_offsets to ParquetRecordBatchReader.

Are these changes tested?

Should be covered by existing tests from #8733

Are there any user-facing changes?

@github-actions github-actions bot added the parquet Changes to the parquet crate label Jan 9, 2026
@sdf-jkl sdf-jkl changed the title Bitmask skip page [Parquet] Support skipping pages with mask based evaluation Jan 9, 2026
@sdf-jkl sdf-jkl marked this pull request as ready for review January 9, 2026 20:41
@sdf-jkl
Contributor Author

sdf-jkl commented Jan 9, 2026

@alamb, @hhhizzz, @tustvold please review when available.

Contributor

@alamb alamb left a comment

Thank you @sdf-jkl -- this actually makes a lot of sense to me 👏

I have a few concerns:

  1. I am worried about the performance overhead of this approach (copying the page index and the loop for each batch) -- I will run some benchmarks to assess this
  2. I do wonder if we have test coverage for this entire situation -- in particular, do we have tests that repeatedly call next_mask_chunk after the first page and make sure we get the right rows?

If the performance looks good, I think we should add some more tests -- maybe @hhhizzz has some ideas on how to do this (or I think I can try and find some time to help out / work with codex to do so)

/// Using the row selection to skip(4), page2 won't be read at all, so in this
/// case we can't decode all the rows and apply a mask. To correctly apply the
/// bit mask, we need all 6 values to be read, but page2 is not in memory.
fn override_selector_strategy_if_needed(
Contributor

nice -- the idea is to avoid this function 👍

array_reader,
schema: Arc::new(schema),
read_plan,
page_offsets: page_offsets.map(|slice| Arc::new(slice.to_vec())),
Contributor

So I think this will effectively copy the entire OffsetIndexMetaData structure (which I worry could be quite large)

I wonder if we need to find a way to avoid this (e.g. making the entire thing Arc'd in https://github.com/apache/arrow-rs/blob/67e04e758f1e62ec3d78d2f678daf433a4c54e30/parquet/src/file/metadata/mod.rs#L197-L196 somehow 🤔 )

Contributor Author

@sdf-jkl sdf-jkl Jan 12, 2026

We could store only the &Vec<PageLocation> instead of the entire OffsetIndexMetaData df9a493

while cursor < mask.len() && selected_rows < batch_size {
    let mut page_end = mask.len();
    if let Some(pages) = page_locations {
        for loc in pages {
Contributor

I am also a little worried that this loop will take too long (it is O(N^2) in the number of pages, as each time it looks through all pages)

Maybe we could potentially add a PageLocationIterator to the cursor itself (so we know where to pick up)

Contributor Author

@sdf-jkl sdf-jkl Jan 11, 2026

Maybe a binary search through a vec of page offsets? Would have to construct the vec once beforehand to keep us from rebuilding it.
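For illustration, a minimal sketch of that binary-search idea, assuming the page start offsets have been collected once into a sorted vec; the names and page layout here are made up, not the PR's actual code:

fn next_page_end(page_starts: &[usize], cursor: usize, total_rows: usize) -> usize {
    // partition_point returns the index of the first start greater than cursor,
    // i.e. the beginning of the next page (or total_rows if we are in the last page).
    let idx = page_starts.partition_point(|&start| start <= cursor);
    page_starts.get(idx).copied().unwrap_or(total_rows)
}

fn main() {
    let page_starts = [0, 4, 8]; // three pages of a 10-row column chunk
    assert_eq!(next_page_end(&page_starts, 5, 10), 8); // cursor in the middle page
    assert_eq!(next_page_end(&page_starts, 9, 10), 10); // cursor in the last page
}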

Contributor Author

Fixed in df9a493

@alamb
Contributor

alamb commented Jan 10, 2026

run benchmark arrow_reader_clickbench arrow_reader_row_filter

@alamb-ghbot

🤖 ./gh_compare_arrow.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing bitmask-skip-page (5395dbf) to 964daec diff
BENCH_NAME=arrow_reader_clickbench
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench arrow_reader_clickbench
BENCH_FILTER=
BENCH_BRANCH_NAME=bitmask-skip-page
Results will be posted here when complete

@apache apache deleted a comment from alamb-ghbot Jan 10, 2026
@alamb-ghbot

🤖: Benchmark completed

Details

group                                bitmask-skip-page                      main
-----                                -----------------                      ----
arrow_reader_clickbench/async/Q1     1.02      2.4±0.05ms        ? ?/sec    1.00      2.3±0.02ms        ? ?/sec
arrow_reader_clickbench/async/Q10    1.00     12.7±0.40ms        ? ?/sec    1.02     12.9±0.30ms        ? ?/sec
arrow_reader_clickbench/async/Q11    1.00     14.6±0.34ms        ? ?/sec    1.00     14.7±0.21ms        ? ?/sec
arrow_reader_clickbench/async/Q12    1.00     25.9±0.67ms        ? ?/sec    1.00     25.9±0.32ms        ? ?/sec
arrow_reader_clickbench/async/Q13    1.00     31.2±0.53ms        ? ?/sec    1.00     31.2±0.75ms        ? ?/sec
arrow_reader_clickbench/async/Q14    1.01     28.5±0.83ms        ? ?/sec    1.00     28.4±0.43ms        ? ?/sec
arrow_reader_clickbench/async/Q19    1.02      5.4±0.12ms        ? ?/sec    1.00      5.3±0.10ms        ? ?/sec
arrow_reader_clickbench/async/Q20    1.18    145.8±1.75ms        ? ?/sec    1.00    123.1±0.81ms        ? ?/sec
arrow_reader_clickbench/async/Q21    1.06    166.9±1.86ms        ? ?/sec    1.00    157.3±2.95ms        ? ?/sec
arrow_reader_clickbench/async/Q22    1.03   324.0±13.97ms        ? ?/sec    1.00    313.1±8.51ms        ? ?/sec
arrow_reader_clickbench/async/Q23    1.00    407.8±3.02ms        ? ?/sec    1.00    408.5±5.27ms        ? ?/sec
arrow_reader_clickbench/async/Q24    1.00     34.2±0.38ms        ? ?/sec    1.00     34.1±0.40ms        ? ?/sec
arrow_reader_clickbench/async/Q27    1.00    100.6±1.00ms        ? ?/sec    1.00    100.2±1.29ms        ? ?/sec
arrow_reader_clickbench/async/Q28    1.01     99.0±0.92ms        ? ?/sec    1.00     98.3±0.69ms        ? ?/sec
arrow_reader_clickbench/async/Q30    1.01     30.7±0.58ms        ? ?/sec    1.00     30.5±0.64ms        ? ?/sec
arrow_reader_clickbench/async/Q36    1.01    109.3±0.58ms        ? ?/sec    1.00    108.5±0.67ms        ? ?/sec
arrow_reader_clickbench/async/Q37    1.00     85.0±1.17ms        ? ?/sec    1.00     84.8±0.39ms        ? ?/sec
arrow_reader_clickbench/async/Q38    1.00     32.9±0.47ms        ? ?/sec    1.00     32.8±0.30ms        ? ?/sec
arrow_reader_clickbench/async/Q39    1.00     46.2±0.47ms        ? ?/sec    1.00     46.4±0.40ms        ? ?/sec
arrow_reader_clickbench/async/Q40    1.02     27.8±0.60ms        ? ?/sec    1.00     27.2±0.49ms        ? ?/sec
arrow_reader_clickbench/async/Q41    1.00     22.4±0.72ms        ? ?/sec    1.00     22.5±0.35ms        ? ?/sec
arrow_reader_clickbench/async/Q42    1.01     11.1±0.09ms        ? ?/sec    1.00     11.1±0.30ms        ? ?/sec
arrow_reader_clickbench/sync/Q1      1.01      2.1±0.04ms        ? ?/sec    1.00      2.1±0.01ms        ? ?/sec
arrow_reader_clickbench/sync/Q10     1.00      9.8±0.07ms        ? ?/sec    1.01      9.9±0.06ms        ? ?/sec
arrow_reader_clickbench/sync/Q11     1.00     11.4±0.25ms        ? ?/sec    1.00     11.4±0.08ms        ? ?/sec
arrow_reader_clickbench/sync/Q12     1.03     35.0±1.40ms        ? ?/sec    1.00     34.0±0.45ms        ? ?/sec
arrow_reader_clickbench/sync/Q13     1.00     38.8±0.55ms        ? ?/sec    1.24     48.0±1.00ms        ? ?/sec
arrow_reader_clickbench/sync/Q14     1.00     36.2±0.60ms        ? ?/sec    1.22     44.4±0.59ms        ? ?/sec
arrow_reader_clickbench/sync/Q19     1.02      4.4±0.02ms        ? ?/sec    1.00      4.3±0.03ms        ? ?/sec
arrow_reader_clickbench/sync/Q20     1.00    177.0±2.01ms        ? ?/sec    1.01    177.9±0.95ms        ? ?/sec
arrow_reader_clickbench/sync/Q21     1.00    234.5±2.00ms        ? ?/sec    1.01    236.3±2.21ms        ? ?/sec
arrow_reader_clickbench/sync/Q22     1.00    479.1±3.47ms        ? ?/sec    1.01    483.5±4.64ms        ? ?/sec
arrow_reader_clickbench/sync/Q23     1.02   444.0±13.03ms        ? ?/sec    1.00   436.1±13.35ms        ? ?/sec
arrow_reader_clickbench/sync/Q24     1.00     44.8±0.67ms        ? ?/sec    1.03     46.1±0.41ms        ? ?/sec
arrow_reader_clickbench/sync/Q27     1.00    155.2±1.67ms        ? ?/sec    1.00    155.1±1.40ms        ? ?/sec
arrow_reader_clickbench/sync/Q28     1.00    149.6±1.04ms        ? ?/sec    1.00    149.8±1.04ms        ? ?/sec
arrow_reader_clickbench/sync/Q30     1.01     31.4±0.43ms        ? ?/sec    1.00     31.1±0.56ms        ? ?/sec
arrow_reader_clickbench/sync/Q36     1.00    153.8±1.14ms        ? ?/sec    1.00    154.0±1.38ms        ? ?/sec
arrow_reader_clickbench/sync/Q37     1.00     87.9±1.08ms        ? ?/sec    1.01     88.9±0.72ms        ? ?/sec
arrow_reader_clickbench/sync/Q38     1.00     29.1±0.39ms        ? ?/sec    1.01     29.4±0.24ms        ? ?/sec
arrow_reader_clickbench/sync/Q39     1.00     33.9±0.42ms        ? ?/sec    1.00     34.0±0.26ms        ? ?/sec
arrow_reader_clickbench/sync/Q40     1.00     26.6±0.33ms        ? ?/sec    1.00     26.6±0.60ms        ? ?/sec
arrow_reader_clickbench/sync/Q41     1.01     29.1±0.61ms        ? ?/sec    1.00     28.8±0.33ms        ? ?/sec
arrow_reader_clickbench/sync/Q42     1.00     12.7±0.10ms        ? ?/sec    1.00     12.6±0.08ms        ? ?/sec

@alamb-ghbot

🤖 ./gh_compare_arrow.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing bitmask-skip-page (5395dbf) to 964daec diff
BENCH_NAME=arrow_reader_row_filter
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench arrow_reader_row_filter
BENCH_FILTER=
BENCH_BRANCH_NAME=bitmask-skip-page
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

group                                                                                bitmask-skip-page                      main
-----                                                                                -----------------                      ----
arrow_reader_row_filter/float64 <= 99.0/all_columns/async                            1.00  1711.1±12.59µs        ? ?/sec    1.01  1725.2±10.04µs        ? ?/sec
arrow_reader_row_filter/float64 <= 99.0/all_columns/sync                             1.00  1807.9±10.76µs        ? ?/sec    1.03  1865.4±19.96µs        ? ?/sec
arrow_reader_row_filter/float64 <= 99.0/exclude_filter_column/async                  1.00  1571.9±23.10µs        ? ?/sec    1.02  1607.7±31.74µs        ? ?/sec
arrow_reader_row_filter/float64 <= 99.0/exclude_filter_column/sync                   1.00  1548.3±27.08µs        ? ?/sec    1.02  1575.1±27.17µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/all_columns/async              1.00   1528.8±7.49µs        ? ?/sec    1.01  1548.4±19.21µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/all_columns/sync               1.00  1673.9±17.01µs        ? ?/sec    1.02  1699.9±13.19µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/exclude_filter_column/async    1.00  1354.8±20.42µs        ? ?/sec    1.01  1361.8±13.37µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/exclude_filter_column/sync     1.00  1348.2±12.04µs        ? ?/sec    1.01   1363.6±9.07µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/all_columns/async                             1.00  1719.1±21.06µs        ? ?/sec    1.00   1712.1±8.82µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/all_columns/sync                              1.00  1808.9±15.74µs        ? ?/sec    1.01  1829.3±13.16µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/exclude_filter_column/async                   1.00  1571.7±13.46µs        ? ?/sec    1.00  1572.2±11.69µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/exclude_filter_column/sync                    1.00  1534.2±12.04µs        ? ?/sec    1.01  1552.9±10.49µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/all_columns/async                              1.00    918.8±8.99µs        ? ?/sec    1.03   943.7±38.25µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/all_columns/sync                               1.00   864.1±17.34µs        ? ?/sec    1.02   878.3±16.94µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/exclude_filter_column/async                    1.00    839.9±5.83µs        ? ?/sec    1.02    855.1±8.58µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/exclude_filter_column/sync                     1.00    848.7±5.97µs        ? ?/sec    1.03    870.4±9.58µs        ? ?/sec
arrow_reader_row_filter/int64 > 90/all_columns/async                                 1.27      3.5±0.06ms        ? ?/sec    1.00      2.8±0.03ms        ? ?/sec
arrow_reader_row_filter/int64 > 90/all_columns/sync                                  1.00      3.6±0.04ms        ? ?/sec    1.00      3.6±0.04ms        ? ?/sec
arrow_reader_row_filter/int64 > 90/exclude_filter_column/async                       1.02      2.6±0.02ms        ? ?/sec    1.00      2.6±0.03ms        ? ?/sec
arrow_reader_row_filter/int64 > 90/exclude_filter_column/sync                        1.00      2.3±0.03ms        ? ?/sec    1.00      2.3±0.06ms        ? ?/sec
arrow_reader_row_filter/ts < 9000/all_columns/async                                  1.00  1957.2±24.32µs        ? ?/sec    1.01  1972.9±32.45µs        ? ?/sec
arrow_reader_row_filter/ts < 9000/all_columns/sync                                   1.00      2.0±0.02ms        ? ?/sec    1.01      2.1±0.04ms        ? ?/sec
arrow_reader_row_filter/ts < 9000/exclude_filter_column/async                        1.00  1809.3±44.74µs        ? ?/sec    1.00  1807.9±16.81µs        ? ?/sec
arrow_reader_row_filter/ts < 9000/exclude_filter_column/sync                         1.00  1797.3±13.53µs        ? ?/sec    1.01  1810.9±31.64µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/all_columns/async                                 1.00  1249.1±15.21µs        ? ?/sec    1.03  1283.2±13.68µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/all_columns/sync                                  1.00   1252.4±9.50µs        ? ?/sec    1.03  1289.5±11.20µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/exclude_filter_column/async                       1.00  1128.0±13.41µs        ? ?/sec    1.03   1157.9±7.78µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/exclude_filter_column/sync                        1.00   1138.9±9.11µs        ? ?/sec    1.03  1172.3±40.63µs        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/all_columns/async                             1.02      3.3±0.04ms        ? ?/sec    1.00      3.2±0.03ms        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/all_columns/sync                              1.02      3.6±0.02ms        ? ?/sec    1.00      3.6±0.04ms        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/exclude_filter_column/async                   1.01      2.8±0.01ms        ? ?/sec    1.00      2.8±0.01ms        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/exclude_filter_column/sync                    1.00      2.5±0.02ms        ? ?/sec    1.00      2.5±0.07ms        ? ?/sec

@Dandandan
Contributor

run benchmark arrow_reader_clickbench arrow_reader_row_filter

@alamb-ghbot

🤖 ./gh_compare_arrow.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing bitmask-skip-page (6919196) to 964daec diff
BENCH_NAME=arrow_reader_clickbench
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench arrow_reader_clickbench
BENCH_FILTER=
BENCH_BRANCH_NAME=bitmask-skip-page
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

group                                bitmask-skip-page                      main
-----                                -----------------                      ----
arrow_reader_clickbench/async/Q1     1.01      2.3±0.03ms        ? ?/sec    1.00      2.3±0.02ms        ? ?/sec
arrow_reader_clickbench/async/Q10    1.00     12.7±0.48ms        ? ?/sec    1.00     12.7±0.19ms        ? ?/sec
arrow_reader_clickbench/async/Q11    1.00     14.3±0.39ms        ? ?/sec    1.01     14.4±0.35ms        ? ?/sec
arrow_reader_clickbench/async/Q12    1.02     25.7±0.75ms        ? ?/sec    1.00     25.2±0.50ms        ? ?/sec
arrow_reader_clickbench/async/Q13    1.02     31.3±0.57ms        ? ?/sec    1.00     30.8±0.52ms        ? ?/sec
arrow_reader_clickbench/async/Q14    1.03     28.4±0.63ms        ? ?/sec    1.00     27.7±0.24ms        ? ?/sec
arrow_reader_clickbench/async/Q19    1.00      5.3±0.06ms        ? ?/sec    1.00      5.3±0.12ms        ? ?/sec
arrow_reader_clickbench/async/Q20    1.00    142.9±2.30ms        ? ?/sec    1.03    147.1±6.53ms        ? ?/sec
arrow_reader_clickbench/async/Q21    1.00    164.3±1.11ms        ? ?/sec    1.08    177.8±2.83ms        ? ?/sec
arrow_reader_clickbench/async/Q22    1.02    317.8±9.85ms        ? ?/sec    1.00   310.7±35.18ms        ? ?/sec
arrow_reader_clickbench/async/Q23    1.00    402.9±3.18ms        ? ?/sec    1.02    412.2±2.91ms        ? ?/sec
arrow_reader_clickbench/async/Q24    1.03     34.7±0.65ms        ? ?/sec    1.00     33.7±0.34ms        ? ?/sec
arrow_reader_clickbench/async/Q27    1.00     98.0±0.64ms        ? ?/sec    1.03    100.9±0.60ms        ? ?/sec
arrow_reader_clickbench/async/Q28    1.00     98.4±0.54ms        ? ?/sec    1.02    100.0±1.22ms        ? ?/sec
arrow_reader_clickbench/async/Q30    1.00     30.7±0.35ms        ? ?/sec    1.00     30.6±0.87ms        ? ?/sec
arrow_reader_clickbench/async/Q36    1.00    108.0±1.62ms        ? ?/sec    1.02    109.7±0.79ms        ? ?/sec
arrow_reader_clickbench/async/Q37    1.00     84.3±0.75ms        ? ?/sec    1.02     86.1±0.61ms        ? ?/sec
arrow_reader_clickbench/async/Q38    1.00     32.7±0.28ms        ? ?/sec    1.01     33.1±0.42ms        ? ?/sec
arrow_reader_clickbench/async/Q39    1.00     46.3±0.48ms        ? ?/sec    1.00     46.4±0.49ms        ? ?/sec
arrow_reader_clickbench/async/Q40    1.02     28.3±0.49ms        ? ?/sec    1.00     27.7±0.34ms        ? ?/sec
arrow_reader_clickbench/async/Q41    1.03     23.1±0.69ms        ? ?/sec    1.00     22.4±0.43ms        ? ?/sec
arrow_reader_clickbench/async/Q42    1.02     11.3±0.10ms        ? ?/sec    1.00     11.1±0.11ms        ? ?/sec
arrow_reader_clickbench/sync/Q1      1.00      2.1±0.01ms        ? ?/sec    1.00      2.1±0.01ms        ? ?/sec
arrow_reader_clickbench/sync/Q10     1.00      9.8±0.18ms        ? ?/sec    1.00      9.8±0.08ms        ? ?/sec
arrow_reader_clickbench/sync/Q11     1.04     11.7±0.39ms        ? ?/sec    1.00     11.3±0.21ms        ? ?/sec
arrow_reader_clickbench/sync/Q12     1.11     36.2±1.97ms        ? ?/sec    1.00     32.5±0.39ms        ? ?/sec
arrow_reader_clickbench/sync/Q13     1.00     40.0±0.55ms        ? ?/sec    1.15     45.9±0.73ms        ? ?/sec
arrow_reader_clickbench/sync/Q14     1.00     37.0±0.59ms        ? ?/sec    1.15     42.5±0.81ms        ? ?/sec
arrow_reader_clickbench/sync/Q19     1.03      4.4±0.05ms        ? ?/sec    1.00      4.3±0.03ms        ? ?/sec
arrow_reader_clickbench/sync/Q20     1.00    172.6±1.37ms        ? ?/sec    1.02    176.7±1.89ms        ? ?/sec
arrow_reader_clickbench/sync/Q21     1.00    230.7±2.59ms        ? ?/sec    1.01    233.5±3.11ms        ? ?/sec
arrow_reader_clickbench/sync/Q22     1.00    474.0±5.22ms        ? ?/sec    1.01    478.3±3.39ms        ? ?/sec
arrow_reader_clickbench/sync/Q23     1.00   440.5±13.95ms        ? ?/sec    1.00   442.2±17.34ms        ? ?/sec
arrow_reader_clickbench/sync/Q24     1.05     46.7±1.35ms        ? ?/sec    1.00     44.5±0.50ms        ? ?/sec
arrow_reader_clickbench/sync/Q27     1.00    153.1±1.65ms        ? ?/sec    1.01    154.2±0.97ms        ? ?/sec
arrow_reader_clickbench/sync/Q28     1.01    151.2±2.35ms        ? ?/sec    1.00    149.2±1.16ms        ? ?/sec
arrow_reader_clickbench/sync/Q30     1.09     32.8±0.69ms        ? ?/sec    1.00     30.0±0.60ms        ? ?/sec
arrow_reader_clickbench/sync/Q36     1.00    149.7±1.66ms        ? ?/sec    1.03    154.7±1.74ms        ? ?/sec
arrow_reader_clickbench/sync/Q37     1.00     86.4±0.84ms        ? ?/sec    1.03     89.0±1.67ms        ? ?/sec
arrow_reader_clickbench/sync/Q38     1.00     28.3±0.46ms        ? ?/sec    1.03     29.2±0.28ms        ? ?/sec
arrow_reader_clickbench/sync/Q39     1.00     32.7±0.34ms        ? ?/sec    1.04     33.9±0.26ms        ? ?/sec
arrow_reader_clickbench/sync/Q40     1.00     26.3±0.28ms        ? ?/sec    1.00     26.3±0.25ms        ? ?/sec
arrow_reader_clickbench/sync/Q41     1.01     29.1±0.32ms        ? ?/sec    1.00     28.7±0.29ms        ? ?/sec
arrow_reader_clickbench/sync/Q42     1.00     12.6±0.24ms        ? ?/sec    1.01     12.7±0.43ms        ? ?/sec

@alamb-ghbot

🤖 ./gh_compare_arrow.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing bitmask-skip-page (6919196) to 964daec diff
BENCH_NAME=arrow_reader_row_filter
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench arrow_reader_row_filter
BENCH_FILTER=
BENCH_BRANCH_NAME=bitmask-skip-page
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

group                                                                                bitmask-skip-page                       main
-----                                                                                -----------------                       ----
arrow_reader_row_filter/float64 <= 99.0/all_columns/async                            1.00  1737.0±24.43µs        ? ?/sec     1.00  1740.6±12.25µs        ? ?/sec
arrow_reader_row_filter/float64 <= 99.0/all_columns/sync                             1.00  1841.2±16.88µs        ? ?/sec     1.00  1850.3±15.17µs        ? ?/sec
arrow_reader_row_filter/float64 <= 99.0/exclude_filter_column/async                  1.00  1595.2±23.94µs        ? ?/sec     1.00  1589.8±11.48µs        ? ?/sec
arrow_reader_row_filter/float64 <= 99.0/exclude_filter_column/sync                   1.00  1554.2±18.65µs        ? ?/sec     1.02  1579.9±11.57µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/all_columns/async              1.00  1550.7±10.67µs        ? ?/sec     1.01  1570.4±22.51µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/all_columns/sync               1.00  1710.2±13.28µs        ? ?/sec     1.02  1748.2±19.70µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/exclude_filter_column/async    1.00  1378.1±20.68µs        ? ?/sec     1.01  1388.9±17.42µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/exclude_filter_column/sync     1.00  1377.8±10.71µs        ? ?/sec     1.00  1382.7±12.99µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/all_columns/async                             1.01  1740.5±14.97µs        ? ?/sec     1.00  1725.8±18.37µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/all_columns/sync                              1.00  1842.9±13.91µs        ? ?/sec     1.00  1837.5±16.92µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/exclude_filter_column/async                   1.00  1596.0±16.08µs        ? ?/sec     1.00  1593.0±20.88µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/exclude_filter_column/sync                    1.00  1557.3±14.67µs        ? ?/sec     1.01  1575.3±26.16µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/all_columns/async                              1.01    922.3±8.84µs        ? ?/sec     1.00    911.4±9.30µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/all_columns/sync                               1.01   864.8±12.31µs        ? ?/sec     1.00   852.6±12.39µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/exclude_filter_column/async                    1.02   851.8±17.05µs        ? ?/sec     1.00    835.0±7.01µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/exclude_filter_column/sync                     1.01   855.7±10.16µs        ? ?/sec     1.00   848.3±26.47µs        ? ?/sec
arrow_reader_row_filter/int64 > 90/all_columns/async                                 1.00      2.7±0.02ms        ? ?/sec     1.42      3.9±0.04ms        ? ?/sec
arrow_reader_row_filter/int64 > 90/all_columns/sync                                  1.00      3.6±0.08ms        ? ?/sec     1.06      3.8±0.17ms        ? ?/sec
arrow_reader_row_filter/int64 > 90/exclude_filter_column/async                       1.00      2.6±0.04ms        ? ?/sec     1.01      2.7±0.11ms        ? ?/sec
arrow_reader_row_filter/int64 > 90/exclude_filter_column/sync                        1.00      2.3±0.03ms        ? ?/sec     1.05      2.4±0.06ms        ? ?/sec
arrow_reader_row_filter/ts < 9000/all_columns/async                                  1.00  1967.4±107.56µs        ? ?/sec    1.02      2.0±0.03ms        ? ?/sec
arrow_reader_row_filter/ts < 9000/all_columns/sync                                   1.00      2.0±0.01ms        ? ?/sec     1.02      2.1±0.06ms        ? ?/sec
arrow_reader_row_filter/ts < 9000/exclude_filter_column/async                        1.00  1772.0±11.49µs        ? ?/sec     1.02  1804.6±13.86µs        ? ?/sec
arrow_reader_row_filter/ts < 9000/exclude_filter_column/sync                         1.00  1783.5±17.77µs        ? ?/sec     1.01  1808.2±11.79µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/all_columns/async                                 1.00  1257.4±13.21µs        ? ?/sec     1.02   1277.2±8.80µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/all_columns/sync                                  1.00  1275.4±11.35µs        ? ?/sec     1.00  1270.5±14.14µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/exclude_filter_column/async                       1.00  1162.3±13.90µs        ? ?/sec     1.00  1157.2±23.30µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/exclude_filter_column/sync                        1.02  1172.1±10.33µs        ? ?/sec     1.00  1152.2±21.33µs        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/all_columns/async                             1.00      3.3±0.03ms        ? ?/sec     1.00      3.3±0.04ms        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/all_columns/sync                              1.00      3.6±0.05ms        ? ?/sec     1.01      3.6±0.02ms        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/exclude_filter_column/async                   1.00      2.8±0.02ms        ? ?/sec     1.01      2.8±0.08ms        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/exclude_filter_column/sync                    1.00      2.5±0.01ms        ? ?/sec     1.01      2.6±0.03ms        ? ?/sec

@sdf-jkl
Contributor Author

sdf-jkl commented Jan 13, 2026

@alamb @Dandandan ClickBench Q12, Q24, and Q30 show some degradation, but everything else looks like an overall improvement.


let reader = ParquetRecordBatchReader::new(array_reader, plan);
let reader =
    ParquetRecordBatchReader::new(array_reader, plan, page_offsets.cloned());
Contributor

cloned may cause extra expense here; can we use Arc<[PageLocation]> to avoid that?
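As a rough sketch of that Arc<[PageLocation]> idea (PageLocation is the thrift-generated type from parquet::format; the offsets and sizes below are made up):

use parquet::format::PageLocation;
use std::sync::Arc;

fn main() {
    let locations: Arc<[PageLocation]> = vec![
        PageLocation { offset: 4, compressed_page_size: 100, first_row_index: 0 },
        PageLocation { offset: 104, compressed_page_size: 100, first_row_index: 4 },
    ]
    .into();
    // Handing this to each reader clones a pointer, not the underlying Vec.
    let for_reader = Arc::clone(&locations);
    assert_eq!(for_reader.len(), 2);
}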

Contributor Author

It's a big API change to make PageLocation or OffsetIndexMetaData an Arc inside ParquetMetaData.

If we want to make that change, I can open an issue and work up a PR.

Contributor

I agree with @hhhizzz that copying the offsets here is not good

I thought about it some more, and I think the reason the copy is currently needed is that the decision of whether the page should be skipped is postponed until the next MaskChunk is needed

One potential idea I had to avoid this is to use the page index in the ReadPlanBuilder when building, rather than passing in the page index to every call to next_batch.

So maybe that would look something like extending MaskCursor from

/// Cursor for iterating a mask-backed [`RowSelection`]
///
/// This is best for dense selections where there are many small skips
/// or selections. For example, selecting every other row.
#[derive(Debug)]
pub struct MaskCursor {
    mask: BooleanBuffer,
    /// Current absolute offset into the selection
    position: usize,
}

To also track what ranges should be skipped entirely. Maybe something like

#[derive(Debug)]
pub struct MaskCursor {
    mask: BooleanBuffer,
    /// Current absolute offset into the selection
    position: usize,
    /// Which row ranges should be skipped entirely?
    skip_ranges: Vec<Range<usize>>,
}

That I think would simplify the logic for next_mask_chunk significantly and it would avoid the need to copy the entire page index
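To make the skip_ranges idea concrete, here is a hedged sketch of how next_mask_chunk might consult them; all names and logic are illustrative, not a proposed implementation:

use std::ops::Range;

/// Advance past any skipped range starting at `position`, then cap the chunk
/// so it stops before the next skipped range begins.
fn clamp_chunk(mut position: usize, batch_size: usize, skip_ranges: &[Range<usize>]) -> (usize, usize) {
    // Jump over ranges that begin exactly where the cursor sits.
    while let Some(r) = skip_ranges.iter().find(|r| r.start == position) {
        position = r.end;
    }
    // The chunk may not cross into the next skipped range.
    let limit = skip_ranges
        .iter()
        .map(|r| r.start)
        .filter(|&start| start > position)
        .min()
        .unwrap_or(usize::MAX);
    (position, batch_size.min(limit - position))
}

fn main() {
    // Rows 4..8 are skipped entirely: a batch starting at row 2 stops at row 4.
    assert_eq!(clamp_chunk(2, 10, &[4..8]), (2, 2));
    // A cursor at the start of a skipped range jumps straight to row 8.
    assert_eq!(clamp_chunk(4, 10, &[4..8]), (8, 10));
}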

@hhhizzz
Contributor

hhhizzz commented Jan 14, 2026

Thank you @sdf-jkl, the code looks great; just wondering if we could add more unit tests.

@hhhizzz
Contributor

hhhizzz commented Jan 14, 2026

Here's the existing test:

async fn test_row_filter_full_page_skip_is_handled_async() {

I think we can just add one more unit test that tests skipping pages with RowSelectionPolicy set to Mask instead of Auto

Contributor

@alamb alamb left a comment

Thank you @sdf-jkl and @hhhizzz -- I took a look at this PR and it is looking like it is heading in the right direction

I had some structural suggestions and I also have an idea for some additional coverage (related to predicates).

Please let me know if you are willing to work on this, otherwise I am happy to take over this PR as well (given we are hitting the problem at work, and it is blocking our upgrade)

)
}) {
    self.row_group_offset_index(row_group_idx)
        .and_then(|columns| columns.first())
Contributor

I think this is a bug -- it reads the page offsets from the first column rather than the column being read

Maybe something like

self.row_group_offset_index(row_group_idx).and_then(|columns| {
    columns
        .iter()
        .enumerate()
        .find(|(leaf_idx, _)| self.projection.leaf_included(*leaf_idx))
        .map(|(_, column)| column.page_locations())
})

Contributor Author

@sdf-jkl sdf-jkl Jan 22, 2026

Wouldn't the page offsets be the same for every column? It is, thanks!

Contributor Author

I think even this should not work, because we actually need to keep page offsets for all projected columns and use them in ReadPlanBuilder (once we move it from ParquetRecordBatchReader)

Contributor Author

So I guess we go back to using the whole &[OffsetIndexMetaData]

Contributor

@alamb alamb Jan 23, 2026

I plan to find some time this afternoon to work on this PR -- maybe I will come up with something

Contributor Author

@sdf-jkl sdf-jkl Jan 23, 2026

Another issue with the current implementation is that ParquetRecordBatchReader does its page-aware reading using page offsets from a single column.

However, the read happens for all columns at once, using the same boolean mask (which is column-chunk specific).
https://github.com/apache/arrow-rs/pull/9118/changes#diff-850b3a44587149637b8545f66603a2b1252959fd36f7ddc55f37d6b5357816c6L1403

It seems that supporting different page offsets for each column would require us to push page awareness further down into the arrow readers.


while !mask_cursor.is_empty() {
    let Some(mask_chunk) = mask_cursor.next_mask_chunk(batch_size) else {
    let Some(mask_chunk) = mask_cursor.next_mask_chunk(batch_size, page_locations)
Contributor

I expect that this API needs to be extended -- it needs to be able to represent "skip the next N rows without trying to decode them"

As written here I think the first page that doesn't have any rows selected will return None (which will trigger the reader to think it is at the end of the file, even if there is data left)
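One hypothetical shape for such an extension; the real MaskChunk fields may differ, and leading_skip is invented purely for illustration:

struct MaskChunk {
    /// Rows to skip outright (e.g. fully-deselected pages) before decoding
    leading_skip: usize,
    /// Rows to decode and then filter with the mask
    rows_to_read: usize,
}

fn main() {
    // "Skip a fully-deselected 5-row page, then decode the next 5 rows."
    let chunk = MaskChunk { leading_skip: 5, rows_to_read: 5 };
    assert_eq!(chunk.leading_skip + chunk.rows_to_read, 10);
}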

Contributor Author

The reader only thinks it's the end of the file when no further rows remain in mask_cursor. An empty page is handled by the initial skip in next_mask_chunk.



@sdf-jkl
Contributor Author

sdf-jkl commented Jan 22, 2026

Definitely willing to work on this, thanks for the review and your input!

@alamb
Contributor

alamb commented Jan 22, 2026

Awesome -- thanks @sdf-jkl -- I will switch focus for the rest of today and check back in tomorrow.

Comment on lines +2507 to +2510
let props = WriterProperties::builder()
    .set_write_batch_size(2)
    .set_data_page_row_count_limit(2)
    .build();
Contributor Author

I think the reason why the tests pass is that the page offsets are the same for every column.

We limit pages by row count, not by size.

Contributor Author

It's actually the same in the new test too...

Contributor

I think in order to get different page offsets we will need to use a data page byte limit and then different page sizes.
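For example, something along these lines should produce column chunks whose pages break at different rows; the exact limits below are arbitrary:

use parquet::file::properties::WriterProperties;

fn main() {
    let props = WriterProperties::builder()
        .set_write_batch_size(8)
        // A byte-based limit splits wide and narrow columns at different rows,
        // unlike set_data_page_row_count_limit, which aligns all columns.
        .set_data_page_size_limit(64)
        .build();
    let _ = props;
}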

Contributor Author

I wonder how dictionary/RLE encodings can affect this. Would having the first page be a dictionary page, or being able to compress multiple pages into one with RLE, make a difference in offsets?

@sdf-jkl
Contributor Author

sdf-jkl commented Jan 23, 2026

@alamb It seems like I'm on to something with codex. The test passes, but I want to give it a read and a little polish first before sending it your way.

@sdf-jkl
Contributor Author

sdf-jkl commented Jan 23, 2026

It seems like the issue was caused by different sized pages after all. Bigger types would have more, smaller, "finer" pages, and smaller types would have fewer, bigger, "coarser" pages.

If the column with coarse pages was used to enable page awareness, we would use its page offsets.

                             ┏━━━━┓ ┌────────┐            ┌────────┐
- '1' means selected         ┃ 1  ┃ │ Row 0  │            │ Row 0  │
- '0' means filtered         ┃ 1  ┃ │ Row 1  │            │ Row 1  │
                             ┃ 0  ┃ │ Row 2  │  A Page 0  └────────┘
                             ┃ 0  ┃ │ Row 3  │            ┌────────┐
                             ┃ 0  ┃ │ Row 4  │            │ Row 2  │
                             ┃    ┃ └────────┘            │ Row 3  │  B Page 1 (skipped)
                             ┃    ┃ ┌────────┐            └────────┘
                             ┃ 0  ┃ │ Row 5  │            ┌────────┐
                             ┃ 0  ┃ │ Row 6  │  A Page 1  │ Row 4  │
                             ┃ 0  ┃ │ Row 7  │            │ Row 5  │  B Page 2 (skipped)
                             ┃ 1  ┃ │ Row 8  │            └────────┘
                             ┃ 1  ┃ │ Row 9  │            ┌────────┐
                             ┗━━━━┛ └────────┘            │ Row 6  │
                                                          │ Row 7  │  B Page 3 (skipped)
                                                          └────────┘
                                                          ┌────────┐
                                                          │ Row 8  │
                                                          │ Row 9  │  B Page 4 (fetched)
                                                          └────────┘

Mask chunking uses A's coarse boundary:
- Chunk 1 tries to read rows 0–4 (A Page 0)

But Column B has fine pages:
- rows 2–5 are in B Pages 1–2 (skipped)

→ The chunk crosses into unfetched B pages → invalid offset

In the example above, column A's "coarse" page overlaps with "finer" pages in column B that were skipped during the data fetch. This led to the invalid offsets issue.

Comment on lines 183 to 184
/// Add offset index metadata for each column in a row group to this `ReadPlanBuilder`
pub fn with_offset_index_metadata(
Contributor Author

Using the offsets of the column with the smallest number of rows per page should prevent the invalid offset issue from happening.

Contributor Author

I came up with a counterexample where taking offsets from the column with the finest pages doesn't work.

                             ┏━━━━┓ ┌────────┐            ┌────────┐
- '1' means selected         ┃ 0  ┃ │ Row 0  │            │ Row 0  │
- '0' means filtered         ┃ 0  ┃ │ Row 1  │  A Page 0  │ Row 1  │
                             ┃ 0  ┃ │ Row 2  │ (skipped)  │ Row 2  │
                             ┃    ┃ └────────┘            │ Row 3  │  B Page 0
                             ┃ 1  ┃ ┌────────┐            └────────┘
                             ┃ 1  ┃ │ Row 3  │  A Page 1  ┌────────┐
                             ┃ 1  ┃ │ Row 4  │ (fetched)  │ Row 4  │
                             ┃ 0  ┃ │ Row 5  │            │ Row 5  │
                             ┃    ┃ └────────┘            │ Row 6  │  B Page 1 (skipped)
                             ┃ 0  ┃ ┌────────┐            │ Row 7  │
                             ┃ 0  ┃ │ Row 6  │  A Page 2  └────────┘
                             ┃ 0  ┃ │ Row 7  │            ┌────────┐
                             ┃ 0  ┃ │ Row 8  │            │ Row 8  │  B Page 2 (skipped)
                             ┃    ┃ └────────┘            │ Row 9  │
                             ┗━━━━┛                       └────────┘

Mask chunking uses A's finest boundary:
- At mask_start = row 3, next A page boundary = row 6
- Chunk reads rows 3–5

But Column B has 4-row pages:
- rows 0–3 in B Page 0 (fetched)
- rows 4–7 in B Page 1 (skipped)

→ rows 4–5 are in a skipped B page → invalid offset

We could go back to creating a vec of all page offsets and looking up the closest page end for a mask chunk.
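A sketch of that vec-of-all-page-offsets idea: merge the first_row_index of every projected column's pages (as carried on parquet's PageLocation) into one sorted, deduplicated boundary list, so a chunk can never cross any column's page boundary. The merge logic itself is illustrative, not the PR's code:

fn merged_page_boundaries(per_column_first_rows: &[Vec<i64>]) -> Vec<usize> {
    let mut boundaries: Vec<usize> = per_column_first_rows
        .iter()
        .flatten()
        .map(|&row| row as usize)
        .collect();
    boundaries.sort_unstable();
    boundaries.dedup();
    boundaries
}

fn main() {
    // Column A pages start at rows 0, 3, 6; column B pages start at rows 0, 4, 8.
    let merged = merged_page_boundaries(&[vec![0, 3, 6], vec![0, 4, 8]]);
    assert_eq!(merged, vec![0, 3, 4, 6, 8]);
}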

@alamb
Contributor

alamb commented Jan 26, 2026

I plan to work on this PR carefully today

@alamb
Contributor

alamb commented Jan 26, 2026

Checking it out again

/// This test creates a parquet file with multiple small pages and verifies that
/// when using Mask policy, pages that are skipped entirely are handled correctly.
#[test]
fn test_bitmask_page_aware_selection() {
Contributor

I found these tests really hard to read as they have so much boilerplate; what they are doing is obscured by all the repetitious mechanics.

I will push a suggestion to reduce the duplication

Contributor Author

I agree, I thought about replacing the tests with a sync version of #9243

Contributor

I think as a follow-on we should maybe split up the tests as well, as described here


/// Row ranges to be selected from the data source
row_selection_cursor: RowSelectionCursor,
/// Precomputed page boundary row indices for mask chunking
page_boundaries: Option<Arc<[usize]>>,
Contributor

I have been digging around -- and I think the page_start_boundaries which is already calculated might be what we need. I will see if I can find some way to reuse it

Contributor

@alamb alamb left a comment

Thank you (again) for this work @sdf-jkl -- this is a very non-trivial piece of work.

Some thoughts after staring at this for several hours:

I can't quite convince myself this PR is correct for all cases, especially when there are multiple different distributions of pages across columns. The tests in this PR are using two int columns with the same page limit, so it isn't clear to me that the tests cover the case where data page offsets differ between columns.

I wonder if you would be willing to help write some other tests for this case? Maybe take the regression test that @erratic-pattern made and evaluate predicates on the different columns or something.

I am also thinking maybe we can move the reader tests to parquet/tests/arrow_reader/row_filter.rs to try and reduce the size of new code we are adding to parquet/src/arrow/arrow_reader/mod.rs

I have gotten a good start on using page_start_boundaries though I am not quite done. If I can get that to work out I was thinking I would try and pull page_start_boundaries into a struct (so it can be better documented / easier to verify it is correct -- e.g. are the column indexes before or after projection?)

So TLDR:

  1. I think we need some more tests for predicates with multi-column chunks
  2. I think we can use page_start_boundaries with some more finagling

I am working on 2 - if I can get that going, I'll then move on to 1 if no one beats me to it.

I will refrain from pushing more commits directly to this PR -- instead I'll make PRs into it

// or until the mask is exhausted. This mirrors the behaviour of the legacy
// `RowSelector` queue-based iteration.
while cursor < mask.len() && selected_rows < batch_size {
    let max_chunk_rows = page_boundaries
Contributor

since the boundaries are all sorted we should be able to avoid this sort/partition point...
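A hedged sketch of that suggestion: since the cursor only ever moves forward, an index into the sorted boundary list can replace the per-chunk partition_point search. The names here are illustrative:

struct BoundaryCursor<'a> {
    boundaries: &'a [usize],
    next_idx: usize,
}

impl<'a> BoundaryCursor<'a> {
    /// Return the first boundary strictly after `row`, advancing monotonically
    /// so earlier boundaries are never re-scanned.
    fn next_boundary_after(&mut self, row: usize) -> Option<usize> {
        while self.next_idx < self.boundaries.len() && self.boundaries[self.next_idx] <= row {
            self.next_idx += 1;
        }
        self.boundaries.get(self.next_idx).copied()
    }
}

fn main() {
    let mut cursor = BoundaryCursor { boundaries: &[0, 4, 8], next_idx: 0 };
    assert_eq!(cursor.next_boundary_after(5), Some(8));
    assert_eq!(cursor.next_boundary_after(9), None);
}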

@sdf-jkl
Contributor Author

sdf-jkl commented Jan 26, 2026

Thanks, I understand, this issue got me thinking about it in my sleep.

I agree that the tests with two int cols do not cover different distributions of pages. Originally, they were meant to cover the case where the mask was not page-aware at all. The new #9243 test covers that scenario and also checks different page distributions, which seemingly makes the old tests redundant.

The #9243 test covers different page distributions despite also using a page limit because one column is utf8. When building the row groups, the arrow writer is smart and will use dictionary encoding on that column. This adds a dictionary page at the beginning of the column chunk and creates an offset between pages.

@alamb
Contributor

alamb commented Jan 26, 2026

Thanks, I understand, this issue got me thinking about it in my sleep.

This is the mark of a great software engineer, in my opinion. Just don't lose too many 💤

@alamb alamb mentioned this pull request Jan 27, 2026
