
Conversation

@jahough (Contributor) commented May 31, 2025

Context

This PR is intended to resolve issue #54.

Applications that run lengthy batch operations can produce noisy span creation, with one span per query in the batch.
This can bloat the overall trace and potentially even cause traces to be dropped at ingestion time due to their size. The latter is the extreme case and depends on the user's infrastructure configuration.

Change

Add a new option that allows users to specify a distinct TracerProvider that will be used specifically for these per-query span creations.

This offers users the flexibility to sample these spans however they wish using the OpenTelemetry SDK's Sampler type.
For example, users can choose to NeverSample, AlwaysSample, or even sample a percentage of these spans. It is entirely up to them.
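For instance, a sketch of percentage-based sampling using the SDK's TraceIDRatioBased sampler together with the new option (here trace is go.opentelemetry.io/otel/sdk/trace, as in the example further down):

otelpgx.NewTracer(
    otelpgx.WithBatchQueryTracerProvider(
        // samples the per-query spans for roughly 10% of traces
        trace.NewTracerProvider(trace.WithSampler(trace.TraceIDRatioBased(0.1))),
    ),
)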

If users do not provide this new option, backwards compatibility with WithTracerProvider is maintained.
The default is to use the TracerProvider assigned there, or otherwise the OpenTelemetry SDK's global TracerProvider.

Option not provided
otelpgx.NewTracer()

[Screenshot: resulting trace, including individual query spans]

Option provided with a TracerProvider configured to never sample
otelpgx.NewTracer(
    otelpgx.WithBatchQueryTracerProvider(trace.NewTracerProvider(trace.WithSampler(trace.NeverSample()))),
)

[Screenshot: resulting trace with no individual query spans]

Note that there are no longer any query INSERT spans, yet the batch size attribute remains at 10.

@jahough force-pushed the batch-span-custom-tracerprovider branch from 0092558 to 4a56273 on May 31, 2025 03:18
@obitech (Member) left a comment


LGTM!

@obitech requested a review from costela on June 2, 2025 15:25
@obitech (Member) commented Jun 2, 2025

@jahough could you rebase on main so it runs the pipeline?

@jahough force-pushed the batch-span-custom-tracerprovider branch from 4a56273 to 8122891 on June 2, 2025 15:29
@costela (Member) commented Jun 2, 2025

Thanks for the contribution! 🚀

Just to be sure we don't try to kill a fly with a bazooka: how about something like #59 to solve the specific issue of batches vs individual queries? I'm a bit wary of too much flexibility, since that can also make using the API a bit more difficult.

@costela (Member) commented Jun 2, 2025

To keep some of the flexibility, we could extend #59 and export a helper function to determine if we're in a batch context, so the consumer can use the existing WithTracerProvider plus a custom Sampler and decide on a sample depending on whether ParentContext is a batch or not.

That would keep this somewhat advanced use-case behind a bit more code, but an existing API.

WDYT?
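
A rough sketch of how that could look on the consumer side (IsBatchQuery is a hypothetical stand-in for whatever helper #59 would export; everything else is the existing OpenTelemetry SDK Sampler interface):

import (
	"github.com/exaring/otelpgx" // module path assumed; IsBatchQuery below is hypothetical
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
	"go.opentelemetry.io/otel/trace"
)

// skipBatchSampler drops spans whose parent context comes from a batch
// and defers to a fallback sampler for everything else.
type skipBatchSampler struct {
	next sdktrace.Sampler
}

func (s skipBatchSampler) ShouldSample(p sdktrace.SamplingParameters) sdktrace.SamplingResult {
	if otelpgx.IsBatchQuery(p.ParentContext) { // hypothetical helper, not in the current API
		return sdktrace.SamplingResult{
			Decision:   sdktrace.Drop,
			Tracestate: trace.SpanContextFromContext(p.ParentContext).TraceState(),
		}
	}
	return s.next.ShouldSample(p)
}

func (s skipBatchSampler) Description() string { return "skipBatchSampler" }

// Wired through the existing WithTracerProvider option:
// otelpgx.NewTracer(otelpgx.WithTracerProvider(sdktrace.NewTracerProvider(
//     sdktrace.WithSampler(skipBatchSampler{next: sdktrace.AlwaysSample()}),
// )))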

@jahough (Contributor, Author) commented Jun 3, 2025

> Thanks for the contribution! 🚀
>
> Just to be sure we don't try to kill a fly with a bazooka: how about something like #59 to solve the specific issue of batches vs individual queries? I'm a bit wary of too much flexibility, since that can also make using the API a bit more difficult.

Apologies for the delay - got tied up with things yesterday.
Anyhow... Totally understand. I was also on the fence about where we should draw the line for this.

I'm personally fine if we want to go with a more binary on/off style approach.
In terms of viewing the traces themselves, perhaps in some ways this is "clearer"?
For instance, with a batch size of n=10, seeing only some of the individual query spans could suggest to the untrained eye that n sub-query spans should really be present... however, with sampling, some of those individual queries will have "disappeared", leaving gaps in the overall timeline visualization - to the point where one might start to question what is occupying the latter half of the batch processing.
Whereas an on/off approach leaves no doubt - the "batch start" span would simply last X amount of time before moving on to the next span in the trace.

Conversely, to play the other side... if a user wants to see specific attribute values used within some of the queries, and/or some of the individual per-query durations, then having a means to sample would be helpful.

All that to say, if we can offer both options (through what you're suggesting below) with the same implementation, then I think all of the above is best left for the user to decide what they want to do and which caveats they want to accept.

> To keep some of the flexibility, we could extend #59 and export a helper function to determine if we're in a batch context, so the consumer can use the existing WithTracerProvider plus a custom Sampler and decide on a sample depending on whether ParentContext is a batch or not.
>
> That would keep this somewhat advanced use-case behind a bit more code, but an existing API.
>
> WDYT?

I'd be open to this. I think I like the sound of keeping a single TracerProvider throughout better.
If we were to extend #59 with this ability, would we still need the explicit skipBatchQueries configuration?
The same effect could be achieved by providing a Sampler configured not to sample anything.
