Skip to content

Conversation

@Fengzdadi
Copy link
Contributor

Summary

This PR completes the ReservoirItemsUnion implementation to align with Java's functionality.

Changes

Core Logic Fixes

  • Fixed twoWayMergeInternal to correctly handle 4 merge cases

  • Added DownsampledCopy, Copy, GetImplicitSampleWeight methods

  • Added internal helpers: getValueAtPosition, insertValueAtPosition, forceIncrementItemsSeen

  • Serialization: ToSlice()and NewReservoirItemsUnionFromSlice()

  • UpdateFromRaw: Create sketch from raw components (n, k, items)

  • String(): Human-readable summary

Tests

  • 17 test cases covering merge logic, serialization, and error handling

Serialization Format

Compatible with Java V2:

  • Byte 0: preLongs (1)
  • Byte 1: serVer (2)
  • Byte 2: familyID (12)
  • Byte 3: flags
  • Bytes 4-7: maxK (uint32 LE)
  • Bytes 8+: gadget data

PS: Only V2 serialization is supported (V1 is legacy format from pre-2017).

Ref #103

…ternal logic

- Add helper methods to ReservoirItemsSketch: GetImplicitSampleWeight, Copy,
  DownsampledCopy, getValueAtPosition, insertValueAtPosition, forceIncrementItemsSeen
- Rewrite ReservoirItemsUnion.UpdateSketch with correct merge logic:
  - Downsample input if K > maxK
  - createNewGadget preserves source K in sampling mode
  - twoWayMergeInternal with 4 cases (exact/weighted, with swap)
  - Correct probability formula P = (K * w) / targetTotal
- Create dedicated reservoir_items_union_test.go with Java-aligned tests
- Move union tests from reservoir_items_sketch_test.go
Copy link
Member

@proost proost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add test cases?

  1. Compatibility test with java.
  2. sketches generation test.

@Fengzdadi
Copy link
Contributor Author

Added cross-language coverage for reservoir unions (long/double/string): tests now load 27 Java-generated files (empty, exact 1/10/32/100/128, sampling maxK 32/64/128) and we also generate matching Go fixtures; go test ./... passes.

Copy link
Member

@proost proost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@proost proost merged commit ad7739d into apache:main Jan 21, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants