@rroelke
Member

@rroelke rroelke commented Sep 22, 2025

Resolves CORE-321.

Today's fragment metadata contains the minimum bounding rectangle of each tile. This is very useful for determining whether tiles can satisfy spatial queries, but much less useful for determining whether tiles from different fragments interleave (knowledge of which can be used to implement several possible optimizations). This is because the bounding rectangle can underestimate the true lower bound of the global order value in a tile (and likewise overestimate the true upper bound), potentially by a lot.
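To make the gap concrete, here is a minimal sketch, not code from this PR. It assumes a 2D array with row-major cell order and a single space tile, so that global order reduces to lexicographic comparison of coordinates. The names `Coord`, `mbr_lower_corner`, and `global_order_lower_bound` are hypothetical helpers introduced only for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <vector>

// Hypothetical 2D coordinate. std::array compares lexicographically, which
// matches row-major global order under the single-space-tile assumption.
using Coord = std::array<int, 2>;

// Lower corner of the minimum bounding rectangle: the per-dimension minimum.
// `cells` must be non-empty.
Coord mbr_lower_corner(const std::vector<Coord>& cells) {
  Coord lo = cells.front();
  for (const Coord& c : cells) {
    lo[0] = std::min(lo[0], c[0]);
    lo[1] = std::min(lo[1], c[1]);
  }
  return lo;
}

// True global order lower bound: the smallest coordinate actually present.
// `cells` must be non-empty.
Coord global_order_lower_bound(const std::vector<Coord>& cells) {
  return *std::min_element(cells.begin(), cells.end());
}
```

For a tile holding the cells (1, 5) and (2, 1), the bounding rectangle's lower corner is (1, 1), while the first cell in global order is (1, 5). Any ordering decision based on (1, 1) would treat the tile as starting four cells earlier than it really does.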

We would like for our queries to start to leverage the interleaving of tiles from different fragments in the ways described in that linked document. First we must extend the fragment metadata to include the tile global order bounds. This pull request implements that:

  • we update the fragment metadata format version and specification to include a section with the global order lower and upper bounds for each tile in a fragment.
  • we add C and C++ APIs to query those bounds.
  • we update the writers to populate the correct values in fragment metadata and test this for the unordered writer, global order writer, and fragment consolidator.
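As a rough illustration of what the new bounds enable, the sketch below checks whether two tiles can interleave in global order. It is an assumption-laden example rather than the PR's implementation: `Coord` and `TileBounds` are hypothetical types, and global order is again taken to be lexicographic (row-major, single space tile).

```cpp
#include <array>
#include <cassert>

// Hypothetical 2D coordinate compared lexicographically (row-major order).
using Coord = std::array<int, 2>;

// Hypothetical holder for the per-tile global order bounds.
struct TileBounds {
  Coord lower;  // first coordinate of the tile in global order
  Coord upper;  // last coordinate of the tile in global order
};

// Two tiles cannot interleave when one ends before the other begins.
bool may_interleave(const TileBounds& a, const TileBounds& b) {
  return !(a.upper < b.lower || b.upper < a.lower);
}
```

A tile spanning (1, 5) to (2, 1) and a tile spanning (1, 1) to (1, 4) have overlapping bounding rectangles, yet `may_interleave` returns false because the second tile ends before the first begins. That is the distinction the bounding rectangle alone cannot express.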

TYPE: C_API | CPP_API | FORMAT
DESC: Add per-tile global order bounds to fragment metadata

rroelke added 30 commits July 21, 2025 13:55
…to load default profile in ordinary execution
@rroelke rroelke marked this pull request as ready for review January 27, 2026 01:53
@teo-tsirpanis
Member

chore: Add global order min/max per tile to the fragment metadata

I would call a change to the storage format a feat: and not a chore:.

@ypatia ypatia self-requested a review January 27, 2026 07:21
@rroelke rroelke changed the title chore: Add global order min/max per tile to the fragment metadata feat: Add global order min/max per tile to the fragment metadata Jan 27, 2026
@rroelke
Member Author

rroelke commented Jan 27, 2026

chore: Add global order min/max per tile to the fragment metadata

I would call a change to the storage format a feat: and not a chore:.

Fair enough. I imagine I wrote "chore" because there isn't really any value added yet; just the enablement of certain enhancements.

Edited the title.

@rroelke rroelke requested a review from teo-tsirpanis January 27, 2026 14:57
Member

@ypatia ypatia left a comment

You'll need to commit the autogenerated tiledb-rest.capnp.{c++,h} files now that we've reverted #5734

* or if the fragment was written in a format version which does not contain
* the bounding rectangle global order bounds.
*/
std::vector<std::vector<uint8_t>> global_order_lower_bound(
Member

I just saw the vector of vectors, which is not good for performance. It would be better to return the raw values (in a std::pair<size_t, void*>), which both matches other fragment info C++ APIs and does not require knowing the array schema.

Member Author

The C API tiledb_fragment_info_get_global_order_lower_bound copies data into the user's out-argument. The inner vector is necessary as a buffer to copy that data into.

Alternatives would include:

  • user supplying their own allocated buffer as an out-argument. Very C-like and requires either knowing the schema to size the buffers correctly, or making two calls to get the size first and then the data (as is done here already)
  • returning something like std::pair<std::vector<uint8_t>, std::vector<uint64_t>> containing the concatenated data and offset into it for each dimension.

If performance is paramount then the C API is available. I think the usability is worth the tradeoff.
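For reference, the second alternative (concatenated data plus per-dimension offsets) could look roughly like the sketch below. This is only an illustration of the layout under discussion, not an API in this PR; `PackedBound`, `pack`, and `unpack_dim` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Concatenated bytes of all dimension values, plus the start offset of each.
using PackedBound = std::pair<std::vector<uint8_t>, std::vector<uint64_t>>;

// Flatten the per-dimension buffers into a single allocation.
PackedBound pack(const std::vector<std::vector<uint8_t>>& per_dim) {
  PackedBound out;
  for (const auto& dim : per_dim) {
    out.second.push_back(out.first.size());
    out.first.insert(out.first.end(), dim.begin(), dim.end());
  }
  return out;
}

// Recover the bytes of dimension `d`; the last dimension runs to the end.
// `d` must be a valid index into the offsets.
std::vector<uint8_t> unpack_dim(const PackedBound& packed, std::size_t d) {
  const std::vector<uint8_t>& data = packed.first;
  const std::vector<uint64_t>& offsets = packed.second;
  const auto begin = static_cast<std::ptrdiff_t>(offsets[d]);
  const auto end = static_cast<std::ptrdiff_t>(
      d + 1 < offsets.size() ? offsets[d + 1] : data.size());
  return std::vector<uint8_t>(data.begin() + begin, data.begin() + end);
}
```

This trades the one-allocation-per-dimension cost of the nested vector for offset arithmetic on the caller's side, which is the usability tradeoff mentioned above.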

@rroelke rroelke requested a review from ypatia January 30, 2026 15:39