Skip to content

Conversation

@NJrslv
Copy link

@NJrslv NJrslv commented Jan 21, 2026

Add yagp_hooks_collector extension to Cloudberry, imported from open-gpdb/yagp_hooks_collector with full commit history. (Don't worry about # of commits and lines added, it has only 15-20 short-mid commits described below).

Structure

  1. Original commits from open-gpdb/yagp_hooks_collector
  2. Subtree import (8fb4a8e)
  3. Adaptation commits (listed below)

Most changes are ports from open-gpdb/gpdb/gpcontrib/yagp_hooks_collector, adapted for Cloudberry (see 7c6f24d for details).

DONT SQUASH OR REBASE this pr, it's only for the sake of getting feedback, then, I am going to merge yagp-hooks-collector branch, so the history stays.


  1. Fixes that are not present in open-gpdb/gpdb/gpcontrib/yagp_hooks_collector:
2680062 [yagp_hooks_collector] Fix warnings and error them

Cloudberry builds treat compiler warnings as errors. For consistency,
this behavior has been enabled in yagp_hooks_collector.

This commit also fixes the warnings in yagp_hooks_collector.

5f0eb82 [yagp_hooks_collector] Make gen of norm plan/query noexcept

The extension generates normalized query text and plan using jumbling functions.
Those functions may fail when translating to wide character if the current locale
cannot handle the character set.

Fix changes functions that generate normalized query text/plan to noexcept versions
so we can check if error occured and continute execution.

The test checks that even when those functions fail, the plan is still executed.
This test is partially taken from src/test/regress/gp_locale.sql.


  1. Port fixes and enhancements from open-gpdb/gpdb/gpcontrib/yagp_hooks_collector:
90803d1 [yagp_hooks_collector] Refactor

Similarly to [1].

[1] open-gpdb/gpdb@bdfabca#diff-99e92ab48c7310c149ea2bd3414a20a1edf31f26823253906aae1981e088775aR39-R79

38dd155 [yagp_hooks_collector] Correct add of bytes sent

Copy of [1] - send() may return -1 in case of an error, do not add -1 to total_bytes sent.

[1] open-gpdb/gpdb@e1f6c08

00dff37 [yagp_hooks_collector] Add test for UDS sending

Copy of [1] with additinal changed needed for Clouberry are described below:

The testing C functions have changed to set-returning ones if comparing with
[1] because we need a control over the place where function is executed -
either on master or segments, and in Cloudberry these functions must return
set of values so they were changed to return SETOF.

[1] open-gpdb/gpdb@989ca06

7ab24ed [yagp_hooks_collector] Add submit & done hooks

Similarly to [1] add missing executor query info hooks.

[1] open-gpdb/gpdb@87fc05d

f3015f7 [yagp_hooks_collector] Add consistent filtering

Full copy of [1] for yagp_hooks_collector.

[1] open-gpdb/gpdb@845278f#diff-fa2654417413bbb37d47ecf1644dc5af90c76c77f2a90e05c27107967b5f6fd8


  1. Correction of some typos:
7cff09c [yagp_hooks_collector] Correct tokens from gram.y

Correct defines for token ids copied from gram.y.

b0b2e7b [yagp_hooks_collector] Add comments for func args

  1. Port, Add test gpcontrib-yagp-hooks-collector to CI, change Makefile to compile the extension with cloudberry along with configure option to enable it --with-yagp_hooks_collector:
9be5f14 [yagp_hooks_collector] Change greenplum_path.sh to cloudberry-env.sh

Correct CI for yagp_hooks_collector to use correct env script.

8f0bf5f [yagp_hooks_collector] Add CI test and with option

Change makefile, test and add it to CI of yagp_hooks_collector.
Add option --with-yagp-hooks-collector. Similarly to [1].

[1] open-gpdb/gpdb@7be8893


  1. Commits necessary for compilation yagp_hooks_collector in Cloudberry since some things have changed:
2afca84 [yagp_hooks_collector] Change test out for part tbl

In gpdb create table was executed for each partition. Now one single create table is executed. Thus only one create table query goes through executor. Change the test accordingly.

b7f9d2c [yagp_hooks_collector] Change test functions to SRF

In yagp_hooks_collector we need control over place where function is executed, and Cloudberry supports only set-returning functions to execute on COORDINATOR so change the type of the functions. We can see the error below without this change: ERROR: EXECUTE ON COORDINATOR is only supported for set-returning functions.

7b0654e [yagp_hooks_collector] Del redundant funcs

Remove unnecessary copies of the core jumbling functions from yagp_hooks_collector/stat_statements_parser.
In commit [1] query jumbling moved to core, thus there is no need to keep a copy of jumbling functions in yagp_hooks_collector.

[1] 5fd9dfa

7c6f24d [yagp_hooks_collector] Use updated names and func's interfaces

Update usage in yagp_hooks_collector of

  • heap_create_with_catalog()
  • standard_ExecutorRun()
  • standard_ProcessUtility()
  • InstrAlloc()
  • CreateTemplateTupleDesc()
  • ExplainInitState() -> NewExplainState()
  • gpmon_gettmid() -> gp_gettmid()
  • Gp_session_role -> Gp_role
  • strerror(errno) -> "%m"
  • Include utils/varlena.h for SplitIdentifierString() in gpdbwrappers.cpp.
af1092e [yagp_hooks_collector] Port YagpQueryState from gpdb

Copy of [1] from gpdb to create a global QueryState for unique hashing for yagp-hooks-collector.

[1] open-gpdb/gpdb@476b540

b362823 [yagp_hooks_collector] Port workfile stats from gpdb

Copy of [1] from gpdb to collect workfile stats in yagp-hooks-collector.

[1] open-gpdb/gpdb@8813a55


  1. Original copy of yagp_hooks_collector:
8fb4a8e Add 'gpcontrib/yagp_hooks_collector/' from commit 'ca620e9f75f5bc717620d529ef6753c7666c2d99'

git-subtree-dir: gpcontrib/yagp_hooks_collector
git-subtree-mainline: e7eae0d
git-subtree-split: ca620e9

Just create the overal project structure with a Makefile to
generate protobufs, compile it into a shared library extension
and install it
as a fundation to build on
- borrow GlowByte code to generate plan text and SessionInfo
- borrow code from our in-house pg_stat_statements to generate
  query id and plan id
- refactor code to follow common name conventions and identations
- do some minor refactoring to follow common naming convention
- add additional message right after ExecutorStart hook
1) Query instrumentation
2) /proc/self/* stats
It allows finer granularity than executor hooks. Also removed some code
duplication and data duplicaton
1. Initialize query instrumentation to NULL so that it can be properly
checked later (temporary solution, need to find a proper fix)
2. Don't collect spillinfo on query end. Reason: a) it will always be
zero and b) it could crash if we failed to enlarge a spillfile. Seems
like we need some cummulative statistics for spillinfo. Need to check
what explain analyze use.
1. Sync with protobuf changes to collect segment info
2. Remove noisy logging
3. Fix some missing node types in pg_stat_statements
Reason: when query info hook is called with status 'DONE' planstate
is already deallocated by ExecutorEnd
1) Give higher gRPC timeouts to query dispatcher as losing messages
there is more critical
2) If we've failed to send a message via gRPC we notify a background
thread about it and refuse sending any new message until this thread
re-establishes the lost connection
Don't collect system queries with empty query text and ccnt == 0
Rethrowing them might break other extensions and even query execution
pipeline itself
usernamedt and others added 27 commits December 24, 2025 14:49
We inherited this issue from PostgreSQL.

PostgreSQL uses glibc to sort strings. In version glibc=2.28, collations
broke down badly (in general, there are no guarantees when updating glibc).
Changing collations breaks indexes. Similarly, a cluster with different
collations also behaves unpredictably.

What and when something has changed in glibc can be found
on https://github.com/ardentperf/glibc-unicode-sorting
Also there is special postgresql-wiki https://wiki.postgresql.org/wiki/Locale_data_changes
And you tube video https://www.youtube.com/watch?v=0E6O-V8Jato

In short, the issue can be seen through the use of bash:

( echo "1-1"; echo "11" ) | LC_COLLATE=en_US.UTF-8 sort

gives the different results in ubunru 18.04 and 22.04.

There is no way to solve the problem other than by not changing the symbol order.
We freeze symbol order and use it instead of glibc.

Here the solution https://github.com/postgredients/mdb-locales.

In this PR I have added PostgreSQL patch that replaces all glibc
locale-related calls with a calls to an external libary. It activates
using new configure parameter --with-mdblocales, which is off by
default.

Using custom locales needs libmdblocales1 package and mdb-locales
package with symbol table.

Build needs libmdblocales-dev package with headers.
* MDB admin patch & tests

This patch introcudes new pseudo-pre-defined role "mdb_admin".

Introduces 2 new function:
extern bool mdb_admin_allow_bypass_owner_checks(Oid userId,  Oid ownerId);
extern void check_mdb_admin_is_member_of_role(Oid member, Oid role);

To check mdb admin belongship and role-to-role ownership transfer
correctness.

Our mdb_admin ACL model is the following:

* Any roles user or/and roles can be granted with mdb_admin
* mdb_admin memeber can tranfser ownershup of relations,
namespaces and functions to other roles, if target role in neither:
superuser, pg_read_server_files, pg_write_server_files nor
pg_execute_server_program.

This patch allows mdb admin to tranfers ownership on non-superuser objects

* f
This commit introduces new mdb internal role mdb_superuser.

Role is capaple of:

GRANT/REVOKE any set of priviledges to/from any object in database.
Has power of pg_database_owner in any database, including:
DROP any object in database (except system catalog and stuff)

Role is NOT capaple of:

Create database, role, extension or alter other roles with such
priviledges.

Transfer ownership to /pass has_priv of roles:

PG_READ_ALL_DATA
PG_WRITE_ALL_DATA
PG_EXECUTE_SERVER_PROGRAM
PG_READ_SERVER_FILES
PG_WRITE_SERVER_FILES
PG_DATABASE_OWNER

Fix configure.ac USE_MDBLOCALES option handling

Apply autoreconf stuff

Set missing ok parameter ito true while acquiring mdb_superuser oid

In regress tests, nobody creates mdb_superuser role, so missing ok is
fine

Allow mdb_superuser to have power of pg_database_owner

Allow mdb_superuser to alter objects and grant ACl to
objects, owner by pg_database_owner. Also, when acl check,
allow mdb_supersuer use pg_database_owner role power to pass check
…20d529ef6753c7666c2d99'

git-subtree-dir: gpcontrib/yagp_hooks_collector
git-subtree-mainline: e7eae0d
git-subtree-split: ca620e9
Copy of [1] from gpdb to collect workfile stats in yagp-hooks-collector.

[1] open-gpdb/gpdb@8813a55
Copy of [1] from gpdb to create a global QueryState for unique hashing for yagp-hooks-collector.

[1] open-gpdb/gpdb@476b540
Update usage in yagp_hooks_collector of
- heap_create_with_catalog()
- standard_ExecutorRun()
- standard_ProcessUtility()
- InstrAlloc()
- CreateTemplateTupleDesc()
- ExplainInitState() -> NewExplainState()
- gpmon_gettmid() ->  gp_gettmid()
- Gp_session_role -> Gp_role
- strerror(errno) -> "%m"
- Include utils/varlena.h for SplitIdentifierString() in gpdbwrappers.cpp.
Remove unnecessary copies of the core jumbling functions from yagp_hooks_collector/stat_statements_parser.
In commit [1] query jumbling moved to core, thus there is no need to keep a copy of jumbling functions in yagp_hooks_collector.

[1] 5fd9dfa
In yagp_hooks_collector we need control over place where function is executed, and Cloudberry supports only set-returning functions to execute on COORDINATOR so change the type of the functions. We can see the error below without this change: ERROR:  EXECUTE ON COORDINATOR is only supported for set-returning functions.
In gpdb create table was executed for each partition. Now one single create table is executed. Thus only one create table query goes through executor. Change the test accordingly.
Change makefile, test and add it to CI of yagp_hooks_collector.
Add option --with-yagp-hooks-collector. Similarly to [1].

[1] open-gpdb/gpdb@7be8893
Correct CI for yagp_hooks_collector to use correct env script.
Correct defines for token ids copied from gram.y.
Similarly to [1] add missing executor query info hooks.

[1] open-gpdb/gpdb@87fc05d
Copy of [1] with additinal changed needed for Clouberry are described below:

The testing C functions have changed to set-returning ones if comparing with
[1] because we need a control over the place where function is executed -
either on master or segments, and in Cloudberry these functions must return
set of values so they were changed to return SETOF.

[1] open-gpdb/gpdb@989ca06
Copy of [1] - send() may return -1 in case of an error, do not add -1 to total_bytes sent.

[1] open-gpdb/gpdb@e1f6c08
The extension generates normalized query text and plan using jumbling functions.
Those functions may fail when translating to wide character if the current locale
cannot handle the character set.

Fix changes functions that generate normalized query text/plan to noexcept versions
so we can check if error occured and continute execution.

The test checks that even when those functions fail, the plan is still executed.
This test is partially taken from src/test/regress/gp_locale.sql.
Cloudberry builds treat compiler warnings as errors. For consistency,
this behavior has been enabled in yagp_hooks_collector.

This commit also fixes the warnings in yagp_hooks_collector.
Copy link

@leborchuk leborchuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@NJrslv NJrslv merged commit cc7ebac into main Jan 23, 2026
41 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants