Conversation

@friendlymatthew
Contributor

Which issue does this PR close?

Rationale for this change

This PR adds a fast path that detects single row group reads at construction time and stores the constant index value directly, avoiding the `HashMap` allocation and iterator overhead.

For multiple row groups, the existing iterator-based approach is used unchanged.
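
To make the idea concrete, here is a minimal, self-contained sketch of the two paths. It is not the code in this PR or in the parquet crate: `SelectedRowGroup`, `IndexSource`, and `indices` are hypothetical names chosen for illustration, and the general path is only an approximation of an iterator-plus-`HashMap` design.

```rust
use std::collections::HashMap;

/// Hypothetical description of one row group selected for reading.
#[derive(Clone, Copy)]
struct SelectedRowGroup {
    ordinal: i16,    // key used to look the group up (assumed)
    index: i64,      // value emitted for every row of this group
    num_rows: usize, // number of rows the group contributes
}

/// Hypothetical source of per-row row group indices.
enum IndexSource {
    /// Fast path: one row group, so every row gets the same value.
    Single(i64),
    /// General path: a lookup table plus (ordinal, row count) runs.
    Many {
        ordinal_to_index: HashMap<i16, i64>,
        runs: Vec<(i16, usize)>,
    },
}

impl IndexSource {
    /// Pick the path once, at construction time.
    fn new(groups: &[SelectedRowGroup]) -> Self {
        match groups {
            // Exactly one row group: store the constant, allocate nothing else.
            [only] => IndexSource::Single(only.index),
            // Zero or several row groups: build the lookup table and runs.
            _ => IndexSource::Many {
                ordinal_to_index: groups.iter().map(|g| (g.ordinal, g.index)).collect(),
                runs: groups.iter().map(|g| (g.ordinal, g.num_rows)).collect(),
            },
        }
    }

    /// Produce the row group index for up to `n` rows.
    fn indices(&self, n: usize) -> Vec<i64> {
        match self {
            // Fast path: fill the output with the constant.
            IndexSource::Single(value) => vec![*value; n],
            // General path: repeat each group's index once per row.
            IndexSource::Many { ordinal_to_index, runs } => runs
                .iter()
                .flat_map(|(ordinal, rows)| {
                    let index = ordinal_to_index[ordinal];
                    std::iter::repeat(index).take(*rows)
                })
                .take(n)
                .collect(),
        }
    }
}
```

In this sketch the single row group case never builds a `HashMap` or walks a chained iterator, which is the overhead the fast path is meant to avoid. Whether that shows up in practice is what the benchmarks requested in the comments on this PR would need to establish.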

@github-actions github-actions bot added the parquet Changes to the parquet crate label Jan 20, 2026
@Dandandan
Contributor

@friendlymatthew do you have some performance results?

@alamb
Contributor

alamb commented Jan 26, 2026

@friendlymatthew are you able to answer @Dandandan's question about performance benchmarks?

@friendlymatthew
Contributor Author

> @friendlymatthew are you able to answer @Dandandan's question about performance benchmarks?

Hi, yes. I'm AFK for the next two weeks, so I'll be slow to respond, but I'll try to get it done before then!

@alamb
Contributor

alamb commented Jan 27, 2026

Sounds good -- thank you!

Labels

parquet Changes to the parquet crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

perf: optimize RowGroupIndexReader for single row group reads

3 participants