Skip to content

Conversation

@AgraVator
Copy link
Contributor

No description provided.

@markdroth markdroth changed the title A110: Child Channel Plugins A110: Child Channel Options Dec 24, 2025
Copy link
Member

@markdroth markdroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overall design looks good, but I think the doc needs work.

My main concern is that we need a much better description of the overall design in a language-agnostic way, rather than just saying that we have a different design for each language. It's true that each language has its own APIs for how per-channel options are set, but the overall goal here is still to have a way to pass a set of options to be used in child channels. The semantics for that should be the same in all languages.

Please let me know if you have any questions. Thanks!

@@ -0,0 +1,215 @@
# **A110: Child Channel Configuration Plugins**

* **Author(s)**: [Abhishek Agrawal](mailto:agrawalabhi@google.com)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a couple of fields missing here, in particular the "approver" and "mailing list link" fields.

Please make sure you are following the gRFC template: https://github.com/grpc/proposal/blob/master/GRFC-TEMPLATE.md


## **Abstract**

This proposal introduces a mechanism to configure "child channels"—channels created internally by gRPC components (such as xDS control plane channel). Currently, these internal channels cannot easily inherit configuration (like metric sinks and interceptors) from the user application without relying on global state. This design proposes a language-specific approach to injecting configuration: using `functional interfaces` in Java, `DialOptions` in Go, and `ChannelArgs` in Core.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to specify a different approach in each language. I think what we want here is to describe an approach in a language-agnostic way, which we will then use in all languages.

To be clear, I'm fine with specifying details of how things will work in individual languages if that is helpful to clarify the design. But the main goal of an "A"-series gRFC is to specify a feature that will work essentially the same way in all languages, and we need to approach it from that perspective.

You may want to look at several of the recent "A"-series gRFCs for examples.

Complex gRPC ecosystems often require the creation of auxiliary channels that are not directly instantiated by the user application. Two primary examples are:

1. **xDS (Extensible Discovery Service)**: When a user creates a channel with an xDS target, the gRPC library internally creates a separate channel to communicate with the xDS control plane. Currently, users have limited ability to configure this internal control plane channel.
2. **Advanced Load Balancing (RLS, GrpcLB):** Policies like RLS (Route Lookup Service) and GrpcLB, as well as other high-level libraries built on top of gRPC, frequently create internal channels to communicate with look-aside load balancers or backends.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RLS isn't a public API and shouldn't be mentioned here.

I think better examples to cite here are ext_authz (see WIP gRFC A92 in #481) and ext_proc (see WIP gRFC A93 in #484). Note that those cases are not LB policies, so we should consider whether the plumbing needs to be different to support those cases.

@@ -0,0 +1,215 @@
# **A110: Child Channel Configuration Plugins**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be called "Child Channel Options". The mechanism we're describing here isn't really a plugin thing.


There is currently no standardized way to configure behavior for these child channels.

* **Metrics**: Users need to configure metric sinks so that telemetry from internal channels can be read and exported.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear to me what this list is trying to communicate, because each of the three items here is something completely different:

  • Metrics are one of the main features supported by stats plugins, but if that's what you're talking about, then I would expect you to also mention tracing.
  • I don't think we need to talk about interceptors specifically here, since the mechanism we're describing isn't really specifically about interceptors; it's about setting options for child channels in general.
  • The "no global state" thing isn't even talking about state that may be propagated to child channels; it's talking about why we can't use global state as a mechanism for how to do this propagation. I think that should just be a separate paragraph.

I think we should instead start this section by talking about the problem that actually motivated this design, which is the need to set StatsPlugins on a per-channel basis and also propagate them to child channels. We can then discuss why it's not sufficient to just use the global registry.

This discussion should reference gRFCs A66 and A72, which describe StatsPlugins.

// The child_args_struct is wrapped as a pointer argument.
// We use a VTable to ensure the nested struct is safely copied or
// destroyed during the parent channel's lifetime.
grpc_arg parent_arg = grpc_channel_arg_pointer_create(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is using a C-core API, which is not what we'd want applications to use.

Instead, we would probably just add a method to grpc::ChannelArguments that looks something like this:

void SetChildChannelArgs(grpc::ChannelArguments child_args);

auto channel = grpc::CreateCustomChannel(
"xds:///my-service",
grpc::InsecureChannelCredentials(),
grpc::ChannelArguments::FromC(parent_args)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no FromC() method on grpc::ChannelArguments, although that method does exist on some of our internal channel args types. This looks an awful lot like an AI hallucination. ;P


### **Why Language-Specific Approaches?**

While the concept is cross-language, the mechanisms for channel creation differs significantly:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These differences are really just implementation details. I think the overall approach is the same across languages, and we should describe it as such.


### **Java**

In Java, the configuration will be achieved by accepting functions (callbacks). The API allows users to pass a `Consumer<ManagedChannelBuilder<?>>` (or a similar functional interface). When an internal library (e.g., xDS, RLS, gRPCLB) creates a child channel, it applies this user-provided function to the builder before building the channel.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The API proposed for Java and Go seems to be only for channels, not for servers. However, the xDS, ext_authz, and ext_proc cases will also happen on the gRPC server side. Do we need to solve this problem also on the server side?

(The C-core approach will work on both client and server sides.)


Complex gRPC ecosystems often require the creation of auxiliary channels that are not directly instantiated by the user application. Two primary examples are:

1. **xDS (Extensible Discovery Service)**: When a user creates a channel with an xDS target, the gRPC library internally creates a separate channel to communicate with the xDS control plane. Currently, users have limited ability to configure this internal control plane channel.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the xDS case, the same xDS channel will be used in multiple channels for the same target. We need to document what the behavior will be if the application creates multiple channels to the same target with different child channel args. (It will just use whatever child channel args are passed to the first channel to be created.)

@AgraVator AgraVator requested a review from markdroth December 29, 2025 15:56

* StatsPlugins & Tracing: Users need to configure metric sinks (as described in gRFC [A66](https://github.com/grpc/proposal/blob/master/A66-otel-stats.md) and [A72](https://github.com/grpc/proposal/blob/master/A72-open-telemetry-tracing.md)) so that telemetry from internal channels is correctly tagged and exported.
* Interceptors: Users may need to apply specific interceptors (e.g., for authentication, logging, or tracing) to internal traffic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About interceptors for authentication - since the same configuration will be applied to all child channels, but something like auth tokens is going to be different per server like xDs or the DNS servers used by gRPC LB, i.e the same auth tokens is likely not going to work with both.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is true for most options. I don't see any reason that the options for the xDS management server and a service like ExtProc/ExtAuthz must share the same set of interceptors, stats configuration, and other channel options (I'm also interested in setting client keepalive settings for the management server, hopefully this design address this as well -- I opened grpc/grpc#39187 a while back, happy to give you more context if needed).

The access patterns of those auxiliary channels have reasons to be different. If we can make this design account for that, I think it would make it more useful.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we should remove authentication from this list. I think we would strongly discourage injecting interceptors into child channels for authentication. We already deal with authentication in other ways anyway; that was essential for getting any child channel working. Also, I'd say it is clearly improper to assume what the child channel is doing. Logging/tracing/metrics can be added independent of what the channel is being used for; the configuration doesn't have to assume "oh, this child channel is for the xds server, let's use configuration X."

I agree that the child channels can have different configuration from their parent.

If you want to enable keepalive for child channel X but not Y, then we'd have to build in some way to identify the child channel. The design right now is targeting "same configuration for all children of this particular channel," which would probably "work" for keepalive, although technically it may be assuming a particular usage. Bootstrap does still seem a better option for the keepalive, as you really want to say "all connections to this control plane should use this setting."


## Background

Complex gRPC ecosystems often require the creation of auxiliary channels that are not directly instantiated by the user application. Two primary examples are:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Two/Three/


* StatsPlugins & Tracing: Users need to configure metric sinks (as described in gRFC [A66](https://github.com/grpc/proposal/blob/master/A66-otel-stats.md) and [A72](https://github.com/grpc/proposal/blob/master/A72-open-telemetry-tracing.md)) so that telemetry from internal channels is correctly tagged and exported.
* Interceptors: Users may need to apply specific interceptors (e.g., for authentication, logging, or tracing) to internal traffic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we should remove authentication from this list. I think we would strongly discourage injecting interceptors into child channels for authentication. We already deal with authentication in other ways anyway; that was essential for getting any child channel working. Also, I'd say it is clearly improper to assume what the child channel is doing. Logging/tracing/metrics can be added independent of what the channel is being used for; the configuration doesn't have to assume "oh, this child channel is for the xds server, let's use configuration X."

I agree that the child channels can have different configuration from their parent.

If you want to enable keepalive for child channel X but not Y, then we'd have to build in some way to identify the child channel. The design right now is targeting "same configuration for all children of this particular channel," which would probably "work" for keepalive, although technically it may be assuming a particular usage. Bootstrap does still seem a better option for the keepalive, as you really want to say "all connections to this control plane should use this setting."

* StatsPlugins & Tracing: Users need to configure metric sinks (as described in gRFC [A66](https://github.com/grpc/proposal/blob/master/A66-otel-stats.md) and [A72](https://github.com/grpc/proposal/blob/master/A72-open-telemetry-tracing.md)) so that telemetry from internal channels is correctly tagged and exported.
* Interceptors: Users may need to apply specific interceptors (e.g., for authentication, logging, or tracing) to internal traffic.

These configurations cannot be set globally because different parts of an application may require different configurations, such as different metric backends or security credentials.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or we just mention the reason we have this problem: libraries. Libraries want to configure their channels, but can't configure globally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is there in rationale.

2. It applies `O_child` to the configuration of `C`.

### Precedence and Merging
The Child Channel `C` typically requires some internal configuration `O_internal` (e.g., specific target URIs, bootstrap credentials, or internal User-Agent strings).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot of internal configuration. In the Java design, some would be able to be changed by O_child, but other parts not. I really don't know what this specifically is trying to call out. This set of examples here doesn't help me much, as Java wouldn't let you change the target URI or credentials, and we have no case where we're setting an internal User-Agent string.


#### Java

In Java, the configuration will be achieved by accepting functions (callbacks). The API allows users to pass a `Consumer<ManagedChannelBuilder<?>>` (or a similar functional interface). When an internal library (e.g., xDS, gRPCLB) creates a child channel, it applies this user-provided function to the builder before building the channel.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/before building/before further configuring/ to adhere to the defined O_internal behavior, right?

Use the standard `java.util.function.Consumer` and define a new public API interface, `ChildChannelConfigurer`, to encapsulate the configuration logic for auxiliary channels.

```java
import java.util.function.Consumer;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no point in having the interface if we're going to have it extend Consumer. We would just use Consumer directly. However, because it was added in Android API level 24, we can't actually use Consumer at all.

* ##### API Changes

* ManagedChannelBuilder: Add `ManagedChannelBuilder#childChannelConfigurer(ChildChannelConfigurer childChannelConfigurer)` to allow users to register this configurer.
* ServerBuilder: Add `ServerBuilder#childChannelConfigurer(ChildChannelConfigurer configurer)` to allow users to provide configuration for any internal channels created by the server (e.g., connections to external authorization or processing services).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would be only for XdsServerBuilder. Child channels are not available for regular servers.


We do not propose applying child channel configurations to Out-of-Band (OOB) channels at this time. To maintain architectural flexibility and avoid breaking changes in the future, we will modify the implementation to use a `noOp()` MetricSink for OOB channels rather than inheriting the parent channel's sink.

Furthermore, we acknowledge that certain configurations will not function out-of-the-box for `resolvingOobChannel` due to its specific initialization requirements.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What were the problems here? We should call it out more; "certain configurations" could be very common, in which case that would be a problem.


* ##### Out-of-Band (OOB) Channels

We do not propose applying child channel configurations to Out-of-Band (OOB) channels at this time. To maintain architectural flexibility and avoid breaking changes in the future, we will modify the implementation to use a `noOp()` MetricSink for OOB channels rather than inheriting the parent channel's sink.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove this section, since grpc/grpc-java#12589 is merged.

// TODO(AgraVator): add example for configuring child channel observability
```

## Rationale
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to say somewhere that we are not trying to provide any identity information about which channel is being created. The configuration provided by this gRFC is required to be uniform across child channels of a particular channel.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants