feat: Expose HTTP and gRPC Request Metadata to Python Processors #262
Conversation
rjenkins left a comment
@brightsparc - Thanks for the PR; it's good work, and we've been keen on adding request metadata for a while! Good job on weaving these changes into the Message metadata structures, which are not well documented.

There are a few changes I'd like to make, in particular around how we store this request context metadata and how it's passed into the processors. I could describe the changes, or we could spec them out and go back and forth, but I think it's faster if I check out this PR and push a commit so we can discuss. I've got most of the changes done and will finish tomorrow. We should then be able to merge this shortly after.
Brief summary of what I'm changing...
- Going to move grpc/http request metadata out into a separate Option field.
- RequestContext will be an enum with grpc and http variants to begin with; in the future we plan to add more types here, such as kafka, for the Kafka receiver message headers and other Kafka message metadata like the timestamp.
- On the processor API side we'll mirror the RequestContext type.
The primary goal is to keep the metadata field in the Message for internal use and message acknowledgement. We've also had requests to add message acknowledgement for the grpc/http OTLP receivers, such that Rotel only returns a 200 after it either spools to disk locally or durably exports, for example to Kafka. Splitting the request metadata into a separate field will allow us to do both.
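A minimal sketch of the proposed shape, assuming the names from the discussion above (the exact definitions in the pushed commit may differ):

```rust
use std::collections::HashMap;

// Hypothetical shape: request context moves out of the internal
// `metadata` field into its own Option, keyed by transport.
#[derive(Debug, Clone)]
pub enum RequestContext {
    // HTTP request headers captured by the OTLP HTTP receiver
    Http(HashMap<String, String>),
    // gRPC request metadata captured by the OTLP gRPC receiver
    Grpc(HashMap<String, String>),
    // future: kafka receiver headers, timestamp, etc.
}

pub struct Message<T> {
    pub payload: T,
    // stays reserved for internal use and message acknowledgement
    pub metadata: Option<HashMap<String, String>>,
    // request metadata split into its own field
    pub request_context: Option<RequestContext>,
}
```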
Will follow up tomorrow after I push a commit.
```rust
        self.schema_url = schema_url;
        Ok(())
    }

    #[getter]
```
The pyo3 interplay between objects and collections gets pretty complex. Not sure if this was intentional, but this will essentially make the data immutable from the Python side, because the underlying object is cloned. For the sake of clarity, as an example: if you call this getter and then try to modify the dict on the Python side with metadata['new_key'] = "some value" or del metadata['key'], the changes will not be reflected back in Rust after the processor completes. For the most part the SDK is implemented so that you can work idiomatically with objects and collections in Python and have the changes reflected back on the Rust side.

That said, for this use case it isn't necessarily a problem if we decide this request metadata should be immutable. However, we do have the setter below, which allows a total overwrite of the map.
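As a toy illustration of the clone behavior described above, a plain dict stands in for the Rust-backed map (the class and property names here are hypothetical):

```python
# Toy illustration only: a plain dict simulates the pyo3 getter cloning
# the underlying Rust map before handing it to Python.
class ResourceSpans:
    def __init__(self):
        self._metadata = {"key": "value"}  # lives on the "Rust" side

    @property
    def message_metadata(self):
        # the getter returns a clone of the underlying map
        return dict(self._metadata)

rs = ResourceSpans()
metadata = rs.message_metadata
metadata["new_key"] = "some value"  # mutates only the Python-side copy
del metadata["key"]
print(rs.message_metadata)  # {'key': 'value'} - the underlying map is unchanged
```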
I think we probably do want this to be immutable in this case. My thinking is that the metadata properties could be added as resource or span attributes in the processor but probably shouldn't themselves be changed. Happy to take your lead on this, though.
@brightsparc I've pushed the changes to move the request metadata into a separate struct member on the Message payload. I've also added some unit tests for the context_processor. You can run the processor SDK tests from the rotel_python_processor_sdk directory:

```shell
cd rotel_python_processor_sdk
cargo nextest run
```
For a follow-up, I think we should add some configuration options to the context processor so you can specify the key name to use, as well as actions such as upsert, insert, or convert, similar to how the attributes processor behaves here: https://github.com/streamfold/rotel/blob/main/rotel_python_processor_sdk/processors/attributes_processor.py and here: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/attributesprocessor/README.md

We can likely merge the two features so the configuration from the attributes processor can use from_context, but we can address that in the future.
Take a look and let me know if this works well for your use case and if so I'll take a final 👀 and we can get this merged.
brightsparc left a comment
LGTM!
Sounds good. Let me know if you want me to address the merge conflict; otherwise I'm happy for you to land this. Thanks!
@brightsparc Thank you. Yes, if you can fix the merge conflict in agent.rs, I'll approve and we can merge this PR.
Will there be a new release and Docker image that includes this request context mapping soon?
Overview
This PR exposes HTTP request headers and gRPC request metadata to Python processors, enabling them to access request context and add it as attributes to telemetry data. This feature provides the same functionality as the OpenTelemetry Collector's `include_metadata` capability.

Equivalent to OpenTelemetry Collector configuration:
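The configuration block that followed here was lost in extraction; a representative snippet for the Collector's `include_metadata` setting looks like this (endpoints and other receiver options omitted):

```yaml
receivers:
  otlp:
    protocols:
      http:
        include_metadata: true
      grpc:
        include_metadata: true
```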
When `ROTEL_OTLP_HTTP_INCLUDE_METADATA=true` or `ROTEL_OTLP_GRPC_INCLUDE_METADATA=true` is set, the specified headers/metadata are extracted by the receiver and made available to Python processors via the `message_metadata` property on `ResourceSpans`, `ResourceMetrics`, and `ResourceLogs`.

Changes
- Adds a `message_metadata: Optional[Dict[str, str]]` field to the Python SDK structs (`ResourceSpans`, `ResourceMetrics`, `ResourceLogs`)
- Extends the `PythonProcessable` trait to accept `headers: Option<HashMap<String, String>>`
- Receivers extract `HttpMetadata` and `GrpcMetadata` and pass them to Python processors
- Adds a `GrpcMetadata` variant to the `MessageMetadataInner` enum, sharing the same `HashMap<String, String>` structure as HTTP
- Adds a `context_processor.py` example demonstrating header/metadata extraction and attribute addition
- Adds `Dockerfile.context-processor` for easy deployment with HTTP/gRPC metadata support

Configuration
HTTP Metadata Extraction
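The example block for this section did not survive extraction; based on the variables named elsewhere in this PR, enabling HTTP metadata extraction looks roughly like this (the header names are examples):

```shell
export ROTEL_OTLP_HTTP_INCLUDE_METADATA=true
# comma-separated list of header names to capture
export ROTEL_OTLP_HTTP_HEADERS_TO_INCLUDE="my-custom-header,another-header"
```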
gRPC Metadata Extraction
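The gRPC example block was likewise lost; the documented toggle for gRPC metadata extraction is the variable below (any companion variable for selecting specific metadata keys is not shown in this PR, so it is omitted here):

```shell
export ROTEL_OTLP_GRPC_INCLUDE_METADATA=true
```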
Docker Deployment
The `Dockerfile.context-processor` builds the Rust binary inside Docker and includes the context processor with HTTP/gRPC metadata support. This Dockerfile is designed to work seamlessly with SigNoz, a popular OpenTelemetry-native observability platform.

Prerequisites: Ensure SigNoz is installed and running. For installation instructions, see the SigNoz Docker installation guide. The Docker image connects to SigNoz services via the `signoz-net` Docker network.

The Docker image uses a flexible ENTRYPOINT (`/rotel`) that allows you to pass any command arguments. Configuration can be provided via:

- `ROTEL_*` environment variables (e.g., `ROTEL_OTLP_HTTP_ENDPOINT`, `ROTEL_CLICKHOUSE_EXPORTER_ENDPOINT`)
- arguments to the `start` command

Recommended configuration:

- `ROTEL_OTLP_HTTP_INCLUDE_METADATA=true`
- `ROTEL_OTLP_HTTP_HEADERS_TO_INCLUDE` (comma-separated, e.g., `my-custom-header,another-header`)
- a processor at `/processors/context_processor.py` that extracts headers from context and adds them as span attributes

Integration with SigNoz: When running alongside SigNoz (see SigNoz Docker installation), the container connects to the `signoz-net` network and forwards processed telemetry to SigNoz's OTel collector. Processed traces with header attributes will appear in the SigNoz UI at http://localhost:8080.

Docker Compose example:
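The compose file itself was lost in extraction; the sketch below is a minimal reconstruction using only values named in this PR (the service layout and port mapping are assumptions):

```yaml
# Sketch only: service name, port, and build layout are assumptions.
services:
  rotel:
    build:
      context: .
      dockerfile: Dockerfile.context-processor
    environment:
      ROTEL_OTLP_HTTP_INCLUDE_METADATA: "true"
      ROTEL_OTLP_HTTP_HEADERS_TO_INCLUDE: "my-custom-header,another-header"
    command: ["start"]
    ports:
      - "4318:4318"
    networks:
      - signoz-net

networks:
  signoz-net:
    external: true
```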
Note: Python 3.14+ requires the `PYO3_USE_ABI3_FORWARD_COMPATIBILITY=1` environment variable (PyO3 0.24.2 supports up to Python 3.13). The Dockerfile uses Python 3.13 by default.

Usage
Accessing Metadata in Python Processors
Python processors can now access HTTP headers and gRPC metadata via the `message_metadata` property, which is a simple dictionary. The same interface works for both HTTP and gRPC.

Example: Adding Headers to Resource Attributes
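The example block here did not survive the page extraction. Below is a sketch of the shape such a processor can take; `message_metadata` is the property added by this PR, while the helper names and the final attribute-append call are assumptions:

```python
def header_attribute_key(header_name: str) -> str:
    """Map a header name to the http.request.header.<header-name> convention."""
    return "http.request.header." + header_name.lower()

def metadata_to_attributes(message_metadata):
    """Convert the request-metadata dict into attribute key/value pairs."""
    if not message_metadata:  # None/empty when include_metadata is disabled
        return {}
    return {header_attribute_key(k): v for k, v in message_metadata.items()}

# In a processor, resource_spans.message_metadata is a plain dict[str, str];
# each pair returned above would then be appended to
# resource_spans.resource.attributes using the SDK's attribute types.
```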
Alignment with OpenTelemetry Collector
This feature provides functionality equivalent to the OpenTelemetry Collector's `include_metadata` configuration:

OpenTelemetry Collector:
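A sketch of the corresponding Collector receiver configuration (other receiver settings omitted):

```yaml
receivers:
  otlp:
    protocols:
      http:
        include_metadata: true
      grpc:
        include_metadata: true
```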
Rotel (equivalent):
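The Rotel equivalent uses the environment variables introduced in this PR:

```shell
export ROTEL_OTLP_HTTP_INCLUDE_METADATA=true
export ROTEL_OTLP_GRPC_INCLUDE_METADATA=true
```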
Behavior alignment:
- Extracted metadata is exposed to Python processors (via the `message_metadata` property)
- Metadata is delivered as a `dict[str, str]` to Python processors

Implementation
Headers/metadata flow from receiver → pipeline → Python processor:
- The HTTP receiver extracts headers into `HttpMetadata`; the gRPC receiver extracts metadata into `GrpcMetadata`, sharing the same `HashMap<String, String>` structure
- The pipeline converts the metadata to a `HashMap<String, String>` and passes it to processors
- Processors read it via the `message_metadata` property (`dict[str, str]`)
None. This is a purely additive feature. Existing processors continue to work without modification.
Testing
- Verified headers are added as `http.request.header.*` span attributes

Testing with generate-otlp
The built-in `generate-otlp` tool can be used to generate test telemetry data with custom headers for testing metadata extraction. The `--include-headers` flag adds example test headers (`my-custom-header`, `another-header`) that can be extracted by the context processor.

Querying Results in ClickHouse
After processing, headers are added as span attributes following the `http.request.header.*` naming convention, using the key format `http.request.header.<header-name>` (e.g., `http.request.header.my-custom-header`). You can query these attributes in ClickHouse.

Known Limitations
- Python 3.14+ requires the `PYO3_USE_ABI3_FORWARD_COMPATIBILITY=1` environment variable (PyO3 0.24.2 supports up to Python 3.13). The Dockerfile uses Python 3.13 by default.