feat: Add GenAI/LLM telemetry support for ClickHouse #259
base: main
Conversation
- Add --materialize-genai-fields flag to DDL tool for creating materialized columns (GenAIOperationName, GenAIProviderName, token counts, etc.)
- Add --partition-by-service-name flag for multi-tenant optimization
- Add genai_redaction_processor.py for PII redaction in LLM messages
- Update documentation with usage examples
rjenkins
left a comment
Thanks @brightsparc, did a quick skim and this looks pretty good! I'm going to dig into the Python code tomorrow and should be able to finish review.
Question on the DDL: do you know if there have been any changes on the otel collector side regarding the schema for the CH exporter with the GenAI changes? I checked here https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/exporter/clickhouseexporter/internal/sqltemplates, but it doesn't look like anyone has done any work there yet either. Either way, excited to get this in.
Can you merge from main to include this PR: #256? That should allow us to schedule the status checks.
I'm not aware of any specific new fields at the table level; most of these changes are under the span attributes AFAIK.
rjenkins
left a comment
@brightsparc, I left a comment on the processor regarding overlap with the existing redaction processor and would like to hear your thoughts on that. Additionally, I'm curious how you are using/querying the data with these changes today. Are you using ClickStack, or are you directly querying ClickHouse in some other manner?
The other big change here is the schema changes to CH. Nice job on making them "opt-in" via args on clickhouse-ddl. I'm reaching out to our friends at ClickHouse to get their thoughts on these, so give me a day or two to get back to you.
At a high level, there are essentially two things we should consider.
- What is the performance impact on insert for the materialized columns and partition/order by changes. As mentioned it's opt-in, so not necessarily a blocker but there is likely additional load on the CH server and theoretically a drop in insert performance. Ideally we'd quantify that and make a note in the README.md about the perf impact. On the flip side the query performance will be better for aligned queries.
We can likely help with some of the perf analysis if you can share some code to generate synthetic spans with the related genai fields or perhaps even more useful we could modify our load generator https://github.com/streamfold/otel-loadgen to include these new fields and we can run some benchmarks.
- Does this have an impact on ClickStack? I suspect ClickStack will ignore these additional columns, so it might not be an issue, but we should test and verify we don't break anything with their UI.
If the ClickHouse folks find these additional columns compelling, we may want to consider more closely how we generate them (materialized, or transformed natively in Rotel's CH exporter) and whether we should add them to the default OTel traces table or whether they're better suited to a separate GenAI traces table.
@brightsparc - Thanks for this processor. It seems like the bulk of the improvements here are some DX around handling redaction as it relates to items with the tool_call prefix within "gen_ai.input.messages" and "gen_ai.output.messages"? This seems useful to me, and we definitely want to add more out-of-the-box processors, so we might land this as-is; however, I'm curious if you looked at the generic redaction processor here: https://github.com/streamfold/rotel/blob/main/rotel_python_processor_sdk/processors/redaction_processor.py.
It seems like there is a fair amount of overlap. The redaction_processor and its configuration arguments are based loosely on the similar Go processor. Would some of these changes slot into the existing redaction processor as additional configuration options? WDYT about the overlap, and what was missing from the redaction processor that makes this genai_redaction_processor easier to use?
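To make the distinction concrete, here is a minimal sketch of structure-aware redaction for gen_ai.input.messages. A generic key-pattern redactor treats the whole attribute as one opaque string; this instead walks the parsed message list and scrubs only content and tool_call argument values, preserving the message/tool-call structure. Function names, the message shape, and the single email pattern are illustrative assumptions, not the actual genai_redaction_processor API.

```python
import json
import re

# Illustrative PII pattern; a real processor would support a configurable set.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def redact_messages(raw: str) -> str:
    """Redact PII inside a serialized gen_ai.*.messages attribute.

    Assumes the (hypothetical) shape: a list of messages, each with "parts"
    that are either text parts or tool_call parts.
    """
    messages = json.loads(raw)
    for msg in messages:
        for part in msg.get("parts", []):
            if part.get("type") == "text":
                part["content"] = EMAIL_RE.sub("[REDACTED]", part.get("content", ""))
            elif part.get("type") == "tool_call":
                # Keep the tool_call structure (name, argument keys) intact;
                # scrub only string argument values.
                args = part.get("arguments", {})
                for k, v in args.items():
                    if isinstance(v, str):
                        args[k] = EMAIL_RE.sub("[REDACTED]", v)
    return json.dumps(messages)
```

The generic redaction processor could likely cover the text parts via key/value patterns; the tool_call-aware traversal is the piece that seems GenAI-specific.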
Yes, I wasn't sure if there was a way to handle the same sort of GenAI-specific redaction with the existing out-of-the-box solution. I implemented this as more of a POC to see if I could include it as part of the broader GenAI materialization of properties. Also, I could see value in having an option to use Presidio for more sophisticated scrubbing.
I'm happy to drop this from the PR if it's not adding value.
Presidio looks neat. We don't need to drop this, and agreed it's probably a good idea to keep it separate for now (like a POC, as you say) so we can iterate on it without worrying about impact or accidental breakage to the generic redaction processor. I'm just wondering if longer term the capabilities will overlap and we can consolidate. With that said, we want to grow the processor library, so overlap is not a major concern at the moment, and if we later find users are confused as to "which processor to use" we can consolidate.
```rust
        "ORDER_BY",
        &get_order_by("(ServiceName, SpanName, toDateTime(Timestamp))", engine),
    ),
    ("PARTITION_BY", &get_partition_by(partition_expr, engine)),
```
Note: We need to review performance impact on insert
The goal here was to be able to support multi-tenant querying of otel records. I was going to do this by service.name, but I'm open to other/better ways of adding partitioning/indexing.
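The clause swap behind --partition-by-service-name could be sketched as below. This mirrors the ORDER_BY/PARTITION_BY substitution visible in the quoted ddl_traces.rs excerpt; the default branch and the function name are assumptions for illustration, as the real logic is in Rust.

```python
def traces_table_clauses(partition_by_service_name: bool) -> dict:
    """Sketch of DDL clause selection for the traces table.

    With the flag on, ServiceName leads both the partition key and the
    sorting key, so per-tenant queries prune whole partitions and seek
    within the primary index.
    """
    if partition_by_service_name:
        return {
            "PARTITION_BY": "(ServiceName, toDate(Timestamp))",
            "ORDER_BY": "(ServiceName, SpanName, toDateTime(Timestamp))",
        }
    # Assumed default: time-based partitioning only.
    return {
        "PARTITION_BY": "toDate(Timestamp)",
        "ORDER_BY": "(ServiceName, SpanName, toDateTime(Timestamp))",
    }
```

The trade-off discussed above applies: more partitions (one per service per day) means more parts to merge on insert, in exchange for cheaper per-tenant reads.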
```rust
// Uses SpanAttributes['key'] syntax and casts for numeric values
const GENAI_MATERIALIZED_MAP_SQL: &str = r#"
-- GenAI MATERIALIZED columns (extracted from SpanAttributes Map)
GenAIConversationId String MATERIALIZED SpanAttributes['gen_ai.conversation.id'],
```
Need to review performance impact on insert.
I wasn't aware of this project; I think it could be good to simulate a combination of various multi-turn, tool-calling scenarios, etc. The idea of adding the indexes was to be able to quickly get to the last message in a multi-turn conversation instead of having to query the whole large JSON object.
I initially started this PR by looking to transform the objects with a GenAI-specific exporter, but decided to go this route instead, which put the heavy lifting back onto ClickHouse. That felt like a more optimal approach, but it would be interesting to benchmark.
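The "last message" materialization described above can be mirrored in a few lines of Python. In the real schema the extraction happens once at insert time inside the MATERIALIZED column expression in ClickHouse SQL; this sketch only shows the intent (the function name is illustrative).

```python
import json

def last_message(raw_messages: str) -> str:
    """Extract the final turn from a serialized gen_ai.*.messages array.

    This is what a GenAILastInputMessage / GenAILastOutputMessage column
    precomputes, so queries avoid parsing the full conversation JSON.
    """
    try:
        msgs = json.loads(raw_messages)
    except ValueError:
        return ""
    return json.dumps(msgs[-1]) if isinstance(msgs, list) and msgs else ""
```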
Thanks for the context. I will follow up with more thoughts on the schema changes in the coming days. RE: the otel-loadgen, if you want to open a PR to simulate a combination of various multi-turn, tool-calling scenarios, that would be 👍
Hey @brightsparc, we're +1 on landing this but would like to get some benchmarks on our side first. We have a harness to run the benches, but would you be willing to open a PR for https://github.com/streamfold/otel-loadgen that generates synthetic data for the materialized columns? We can then run benchmarks on our end to see what the performance looks like, update the README.md with any relevant data, and get this merged. Also, it might be nice to do a blog post about it if you're interested. We should be able to configure ClickStack to view the new materialized columns, and we could discuss some use cases if you're interested in collaborating on a post.
It might be worth using a public open dataset that has a number of multi-turn inputs/outputs. A good example is the Salesforce APIGen dataset, which has 5k examples for multi-turn. They also have an older dataset of 60k examples that were typically single-turn, although with multiple tool calls. You could use this as a source to create a decent column of data that has input and output messages.
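A loadgen adapter for such a dataset might look like the sketch below: one multi-turn example becomes the GenAI span attributes this PR materializes. The "conversation" field name is an assumption about the dataset layout; the attribute keys follow the GenAI semantic conventions used in this PR.

```python
import json

def example_to_span_attrs(example: dict) -> dict:
    """Map one multi-turn dataset example to GenAI span attributes.

    Treats all but the last turn as input and the final turn as output,
    which is a simplifying assumption for load generation.
    """
    turns = example["conversation"]  # assumed dataset field name
    return {
        "gen_ai.operation.name": "chat",
        "gen_ai.input.messages": json.dumps(turns[:-1]),
        "gen_ai.output.messages": json.dumps(turns[-1:]),
    }
```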
This is now merged, going to load test this branch tomorrow |
rjenkins
left a comment
Hi @brightsparc.
I did some end-to-end and some light perf testing on this Friday with the https://huggingface.co/datasets/Salesforce/APIGen-MT-5k data and our load testing tool. Everything worked so we can get this merged, but I'd like us to make a few changes before we do.
Could you implement the following change requests?
- Support the --enable-json option. When I tested the DDL, --materialize-genai-fields is incompatible with the JSON column type. Ideally these should work together, as the JSON column type is a big performance boost. It looks like you need to change items like the following:
  SpanAttributes['gen_ai.operation.name'] -> **SpanAttributes.gen_ai.operation.name**
- Write a unit test for the processor. In our last PR I added commits that demonstrate how to write unit tests for processors from the Rust side. Could you take a crack at writing one for the new processor you have here? Here's an example from the metadata PR: 474c907#diff-19c63172d2682534b1050188b0486b25c5fca22d5af028e630b8f7367e76b9a2R2866
- Add a note to the README.md for clickhouse-ddl marking the --materialize-genai-fields and --partition-by-service-name flags as experimental. In particular, note that the feature is "currently experimental" and we haven't tested it extensively yet for performance or production usage. However, we are interested in hearing from users who want better support for these GenAI use cases and want to discuss performance and requirements.
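The Map-vs-JSON syntax switch in the first request above could be handled by a small expression helper. This is a hypothetical sketch (the real change belongs in ddl_traces.rs, in Rust): with the default Map column type, attributes are read as SpanAttributes['key'], while the JSON column type uses dotted-path syntax.

```python
def attr_expr(key: str, enable_json: bool) -> str:
    """Build the ClickHouse expression to read a span attribute.

    Map column type  -> SpanAttributes['gen_ai.operation.name']
    JSON column type -> SpanAttributes.gen_ai.operation.name
    """
    if enable_json:
        return f"SpanAttributes.{key}"
    return f"SpanAttributes['{key}']"
```

The materialized-column templates would then call this helper instead of hard-coding the Map subscript syntax.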
Finally, we'd love to hear more about your use cases for LLM input/output and observability. If you have some time in the coming weeks to hop on a zoom for a quick chat we'd enjoy discussing.
Finally, some perf testing notes...
One thing that made it slightly difficult for us to test at scale on Friday was that our load testing pipeline includes Kafka, and we ran into some oversized message issues. Essentially our test harness looks like this:
loadgen->edge rotel->Kafka->gateway rotel->ClickHouse
Due to the large size of the Spans (with the LLM input/output) we needed to tune down the batch sizes in Rotel. After some fiddling and an increase in the Kafka max.message.size we could push the data through. Essentially there is batching in Rotel and also batching in the rdkafka library at the Kafka exporter, and with really large spans it's possible Rotel's batches will exceed the Kafka message size in bytes; neither rdkafka nor Rotel's Kafka exporter currently has the ability to "re-split" these messages after batching.
The issue is not caused by this PR; however, the large Spans induce the scenario, so it's something we're looking into, and testing this was helpful. One potential approach is that we might disable Rotel's batching when using the Kafka exporter and allow rdkafka to just handle batching. This could also likely improve performance.
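The "re-split after batching" idea could be sketched as below: if an encoded batch would exceed the broker's message size limit, split it into sub-batches whose encoded sizes fit. Sizing each span independently is a simplification (a real exporter would size the actual serialized protobuf payload, including framing overhead), so treat this as illustrative only.

```python
def split_batch(spans, encode, max_bytes):
    """Split spans into sub-batches whose total encoded size fits max_bytes.

    A single span larger than max_bytes is still emitted alone, since it
    cannot be split further at this layer.
    """
    batches, current, current_size = [], [], 0
    for span in spans:
        size = len(encode(span))
        if current and current_size + size > max_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(span)
        current_size += size
    if current:
        batches.append(current)
    return batches
```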
Glad the load test was helpful. I'm wondering whether there is also a configuration that doesn't use Kafka that could be optional? I can look at the other changes you mentioned, but it might take a week. So if you want to push the changes into the branch like the other PR, I'm also happy to review.
re: Kafka, there's no requirement to use Kafka; it's just the way we've set up our load testing framework to ensure reliable delivery. re: this week, no worries, there is no rush. I'm pretty slammed this week too, so I likely won't get a chance to push any changes. Whoever gets to it first, just drop a comment here and we'll get it landed in the coming weeks.
This PR adds support for GenAI/LLM observability workloads by extending the existing ClickHouse traces exporter with materialized fields optimized for LLM telemetry queries.
Changes
ClickHouse DDL Tool (clickhouse-ddl)
Added two new flags for traces table creation:
- --materialize-genai-fields: Creates MATERIALIZED columns that extract GenAI semantic convention attributes from SpanAttributes, enabling efficient querying without parsing JSON on every query.
- --partition-by-service-name: Optimizes multi-tenant deployments by partitioning data by (ServiceName, toDate(Timestamp)) and ordering by (ServiceName, Timestamp).
Materialized GenAI Fields
When --materialize-genai-fields is enabled, the following columns are automatically extracted: GenAIOperationName, GenAIProviderName, GenAIRequestModel, GenAIResponseModel, GenAIInputTokens, GenAIOutputTokens, GenAISystemFingerprint, GenAILastInputMessage, GenAILastOutputMessage.
Python Processors
Added genai_redaction_processor.py for redacting PII in the GenAI message attributes gen_ai.input.messages and gen_ai.output.messages. Matched values are replaced with [REDACTED].
Usage
Creating Tables
```shell
# Create traces table with GenAI materialized fields and multi-tenant partitioning
clickhouse-ddl create \
  --endpoint http://localhost:8123 \
  --database otel \
  --traces \
  --materialize-genai-fields \
  --partition-by-service-name
```
Running with PII Redaction
Querying GenAI Data
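An illustrative query sketch (hedged): with --materialize-genai-fields enabled, token usage can be aggregated straight from the materialized columns instead of parsing SpanAttributes at query time. The SQL is shown as a Python string for use with any ClickHouse client; the table name otel.otel_traces is an assumption — substitute your actual traces table.

```python
# Aggregate token usage per model from the materialized GenAI columns.
TOKEN_USAGE_SQL = """
SELECT
    GenAIRequestModel,
    sum(GenAIInputTokens)  AS input_tokens,
    sum(GenAIOutputTokens) AS output_tokens
FROM otel.otel_traces
WHERE GenAIOperationName = 'chat'
GROUP BY GenAIRequestModel
ORDER BY input_tokens DESC
"""
```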
Files Changed
- .gitignore - Added Python cache patterns
- src/bin/clickhouse-ddl/ddl_traces.rs - Added GenAI materialized columns generation
- src/bin/clickhouse-ddl/main.rs - Added CLI flags
- src/bin/clickhouse-ddl/README.md - Updated documentation
- src/init/args.rs - Minor updates
- src/exporters/clickhouse/mod.rs - Minor updates
- rotel_python_processor_sdk/processors/genai_redaction_processor.py - New PII redaction processor
- rotel_python_processor_sdk/python_tests/genai_redaction_processor_test.py - Tests for redaction processor