-
Notifications
You must be signed in to change notification settings - Fork 250
A110: Child Channel Options #529
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
markdroth
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The overall design looks good, but I think the doc needs work.
My main concern is that we need a much better description of the overall design in a language-agnostic way, rather than just saying that we have a different design for each language. It's true that each language has its own APIs for how per-channel options are set, but the overall goal here is still to have a way to pass a set of options to be used in child channels. The semantics for that should be the same in all languages.
Please let me know if you have any questions. Thanks!
A110-child-channel-plugins.md
Outdated
| @@ -0,0 +1,215 @@ | |||
| # **A110: Child Channel Configuration Plugins** | |||
|
|
|||
| * **Author(s)**: [Abhishek Agrawal](mailto:agrawalabhi@google.com) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a couple of fields missing here, in particular the "approver" and "mailing list link" fields.
Please make sure you are following the gRFC template: https://github.com/grpc/proposal/blob/master/GRFC-TEMPLATE.md
A110-child-channel-plugins.md
Outdated
|
|
||
| ## **Abstract** | ||
|
|
||
| This proposal introduces a mechanism to configure "child channels"—channels created internally by gRPC components (such as xDS control plane channel). Currently, these internal channels cannot easily inherit configuration (like metric sinks and interceptors) from the user application without relying on global state. This design proposes a language-specific approach to injecting configuration: using `functional interfaces` in Java, `DialOptions` in Go, and `ChannelArgs` in Core. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we want to specify a different approach in each language. I think what we want here is to describe an approach in a language-agnostic way, which we will then use in all languages.
To be clear, I'm fine with specifying details of how things will work in individual languages if that is helpful to clarify the design. But the main goal of an "A"-series gRFC is to specify a feature that will work essentially the same way in all languages, and we need to approach it from that perspective.
You may want to look at several of the recent "A"-series gRFCs for examples.
A110-child-channel-plugins.md
Outdated
| Complex gRPC ecosystems often require the creation of auxiliary channels that are not directly instantiated by the user application. Two primary examples are: | ||
|
|
||
| 1. **xDS (Extensible Discovery Service)**: When a user creates a channel with an xDS target, the gRPC library internally creates a separate channel to communicate with the xDS control plane. Currently, users have limited ability to configure this internal control plane channel. | ||
| 2. **Advanced Load Balancing (RLS, GrpcLB):** Policies like RLS (Route Lookup Service) and GrpcLB, as well as other high-level libraries built on top of gRPC, frequently create internal channels to communicate with look-aside load balancers or backends. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RLS isn't a public API and shouldn't be mentioned here.
I think better examples to cite here are ext_authz (see WIP gRFC A92 in #481) and ext_proc (see WIP gRFC A93 in #484). Note that those cases are not LB policies, so we should consider whether the plumbing needs to be different to support those cases.
A110-child-channel-plugins.md
Outdated
| @@ -0,0 +1,215 @@ | |||
| # **A110: Child Channel Configuration Plugins** | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be called "Child Channel Options". The mechanism we're describing here isn't really a plugin thing.
A110-child-channel-plugins.md
Outdated
|
|
||
| There is currently no standardized way to configure behavior for these child channels. | ||
|
|
||
| * **Metrics**: Users need to configure metric sinks so that telemetry from internal channels can be read and exported. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear to me what this list is trying to communicate, because each of the three items here is something completely different:
- Metrics are one of the main features supported by stats plugins, but if that's what you're talking about, then I would expect you to also mention tracing.
- I don't think we need to talk about interceptors specifically here, since the mechanism we're describing isn't really specifically about interceptors; it's about setting options for child channels in general.
- The "no global state" thing isn't even talking about state that may be propagated to child channels; it's talking about why we can't use global state as a mechanism for how to do this propagation. I think that should just be a separate paragraph.
I think we should instead start this section by talking about the problem that actually motivated this design, which is the need to set StatsPlugins on a per-channel basis and also propagate them to child channels. We can then discuss why it's not sufficient to just use the global registry.
This discussion should reference gRFCs A66 and A72, which describe StatsPlugins.
A110-child-channel-plugins.md
Outdated
| // The child_args_struct is wrapped as a pointer argument. | ||
| // We use a VTable to ensure the nested struct is safely copied or | ||
| // destroyed during the parent channel's lifetime. | ||
| grpc_arg parent_arg = grpc_channel_arg_pointer_create( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is using a C-core API, which is not what we'd want applications to use.
Instead, we would probably just add a method to grpc::ChannelArguments that looks something like this:
void SetChildChannelArgs(grpc::ChannelArguments child_args);
A110-child-channel-plugins.md
Outdated
| auto channel = grpc::CreateCustomChannel( | ||
| "xds:///my-service", | ||
| grpc::InsecureChannelCredentials(), | ||
| grpc::ChannelArguments::FromC(parent_args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no FromC() method on grpc::ChannelArguments, although that method does exist on some of our internal channel args types. This looks an awful lot like an AI hallucination. ;P
A110-child-channel-plugins.md
Outdated
|
|
||
| ### **Why Language-Specific Approaches?** | ||
|
|
||
| While the concept is cross-language, the mechanisms for channel creation differs significantly: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These differences are really just implementation details. I think the overall approach is the same across languages, and we should describe it as such.
A110-child-channel-plugins.md
Outdated
|
|
||
| ### **Java** | ||
|
|
||
| In Java, the configuration will be achieved by accepting functions (callbacks). The API allows users to pass a `Consumer<ManagedChannelBuilder<?>>` (or a similar functional interface). When an internal library (e.g., xDS, RLS, gRPCLB) creates a child channel, it applies this user-provided function to the builder before building the channel. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The API proposed for Java and Go seems to be only for channels, not for servers. However, the xDS, ext_authz, and ext_proc cases will also happen on the gRPC server side. Do we need to solve this problem also on the server side?
(The C-core approach will work on both client and server sides.)
A110-child-channel-plugins.md
Outdated
|
|
||
| Complex gRPC ecosystems often require the creation of auxiliary channels that are not directly instantiated by the user application. Two primary examples are: | ||
|
|
||
| 1. **xDS (Extensible Discovery Service)**: When a user creates a channel with an xDS target, the gRPC library internally creates a separate channel to communicate with the xDS control plane. Currently, users have limited ability to configure this internal control plane channel. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the xDS case, the same xDS channel will be used in multiple channels for the same target. We need to document what the behavior will be if the application creates multiple channels to the same target with different child channel args. (It will just use whatever child channel args are passed to the first channel to be created.)
|
|
||
| * StatsPlugins & Tracing: Users need to configure metric sinks (as described in gRFC [A66](https://github.com/grpc/proposal/blob/master/A66-otel-stats.md) and [A72](https://github.com/grpc/proposal/blob/master/A72-open-telemetry-tracing.md)) so that telemetry from internal channels is correctly tagged and exported. | ||
| * Interceptors: Users may need to apply specific interceptors (e.g., for authentication, logging, or tracing) to internal traffic. | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About interceptors for authentication - since the same configuration will be applied to all child channels, but something like auth tokens is going to be different per server like xDs or the DNS servers used by gRPC LB, i.e the same auth tokens is likely not going to work with both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is true for most options. I don't see any reason that the options for the xDS management server and a service like ExtProc/ExtAuthz must share the same set of interceptors, stats configuration, and other channel options (I'm also interested in setting client keepalive settings for the management server, hopefully this design address this as well -- I opened grpc/grpc#39187 a while back, happy to give you more context if needed).
The access patterns of those auxiliary channels have reasons to be different. If we can make this design account for that, I think it would make it more useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we should remove authentication from this list. I think we would strongly discourage injecting interceptors into child channels for authentication. We already deal with authentication in other ways anyway; that was essential for getting any child channel working. Also, I'd say it is clearly improper to assume what the child channel is doing. Logging/tracing/metrics can be added independent of what the channel is being used for; the configuration doesn't have to assume "oh, this child channel is for the xds server, let's use configuration X."
I agree that the child channels can have different configuration from their parent.
If you want to enable keepalive for child channel X but not Y, then we'd have to build in some way to identify the child channel. The design right now is targeting "same configuration for all children of this particular channel," which would probably "work" for keepalive, although technically it may be assuming a particular usage. Bootstrap does still seem a better option for the keepalive, as you really want to say "all connections to this control plane should use this setting."
A110-child-channel-plugins.md
Outdated
|
|
||
| ## Background | ||
|
|
||
| Complex gRPC ecosystems often require the creation of auxiliary channels that are not directly instantiated by the user application. Two primary examples are: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/Two/Three/
|
|
||
| * StatsPlugins & Tracing: Users need to configure metric sinks (as described in gRFC [A66](https://github.com/grpc/proposal/blob/master/A66-otel-stats.md) and [A72](https://github.com/grpc/proposal/blob/master/A72-open-telemetry-tracing.md)) so that telemetry from internal channels is correctly tagged and exported. | ||
| * Interceptors: Users may need to apply specific interceptors (e.g., for authentication, logging, or tracing) to internal traffic. | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we should remove authentication from this list. I think we would strongly discourage injecting interceptors into child channels for authentication. We already deal with authentication in other ways anyway; that was essential for getting any child channel working. Also, I'd say it is clearly improper to assume what the child channel is doing. Logging/tracing/metrics can be added independent of what the channel is being used for; the configuration doesn't have to assume "oh, this child channel is for the xds server, let's use configuration X."
I agree that the child channels can have different configuration from their parent.
If you want to enable keepalive for child channel X but not Y, then we'd have to build in some way to identify the child channel. The design right now is targeting "same configuration for all children of this particular channel," which would probably "work" for keepalive, although technically it may be assuming a particular usage. Bootstrap does still seem a better option for the keepalive, as you really want to say "all connections to this control plane should use this setting."
| * StatsPlugins & Tracing: Users need to configure metric sinks (as described in gRFC [A66](https://github.com/grpc/proposal/blob/master/A66-otel-stats.md) and [A72](https://github.com/grpc/proposal/blob/master/A72-open-telemetry-tracing.md)) so that telemetry from internal channels is correctly tagged and exported. | ||
| * Interceptors: Users may need to apply specific interceptors (e.g., for authentication, logging, or tracing) to internal traffic. | ||
|
|
||
| These configurations cannot be set globally because different parts of an application may require different configurations, such as different metric backends or security credentials. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or we just mention the reason we have this problem: libraries. Libraries want to configure their channels, but can't configure globally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is there in rationale.
| 2. It applies `O_child` to the configuration of `C`. | ||
|
|
||
| ### Precedence and Merging | ||
| The Child Channel `C` typically requires some internal configuration `O_internal` (e.g., specific target URIs, bootstrap credentials, or internal User-Agent strings). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a lot of internal configuration. In the Java design, some would be able to be changed by O_child, but other parts not. I really don't know what this specifically is trying to call out. This set of examples here doesn't help me much, as Java wouldn't let you change the target URI or credentials, and we have no case where we're setting an internal User-Agent string.
A110-child-channel-plugins.md
Outdated
|
|
||
| #### Java | ||
|
|
||
| In Java, the configuration will be achieved by accepting functions (callbacks). The API allows users to pass a `Consumer<ManagedChannelBuilder<?>>` (or a similar functional interface). When an internal library (e.g., xDS, gRPCLB) creates a child channel, it applies this user-provided function to the builder before building the channel. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/before building/before further configuring/ to adhere to the defined O_internal behavior, right?
| Use the standard `java.util.function.Consumer` and define a new public API interface, `ChildChannelConfigurer`, to encapsulate the configuration logic for auxiliary channels. | ||
|
|
||
| ```java | ||
| import java.util.function.Consumer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no point in having the interface if we're going to have it extend Consumer. We would just use Consumer directly. However, because it was added in Android API level 24, we can't actually use Consumer at all.
A110-child-channel-plugins.md
Outdated
| * ##### API Changes | ||
|
|
||
| * ManagedChannelBuilder: Add `ManagedChannelBuilder#childChannelConfigurer(ChildChannelConfigurer childChannelConfigurer)` to allow users to register this configurer. | ||
| * ServerBuilder: Add `ServerBuilder#childChannelConfigurer(ChildChannelConfigurer configurer)` to allow users to provide configuration for any internal channels created by the server (e.g., connections to external authorization or processing services). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this would be only for XdsServerBuilder. Child channels are not available for regular servers.
A110-child-channel-plugins.md
Outdated
|
|
||
| We do not propose applying child channel configurations to Out-of-Band (OOB) channels at this time. To maintain architectural flexibility and avoid breaking changes in the future, we will modify the implementation to use a `noOp()` MetricSink for OOB channels rather than inheriting the parent channel's sink. | ||
|
|
||
| Furthermore, we acknowledge that certain configurations will not function out-of-the-box for `resolvingOobChannel` due to its specific initialization requirements. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What were the problems here? We should call it out more; "certain configurations" could be very common, in which case that would be a problem.
A110-child-channel-plugins.md
Outdated
|
|
||
| * ##### Out-of-Band (OOB) Channels | ||
|
|
||
| We do not propose applying child channel configurations to Out-of-Band (OOB) channels at this time. To maintain architectural flexibility and avoid breaking changes in the future, we will modify the implementation to use a `noOp()` MetricSink for OOB channels rather than inheriting the parent channel's sink. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can remove this section, since grpc/grpc-java#12589 is merged.
| // TODO(AgraVator): add example for configuring child channel observability | ||
| ``` | ||
|
|
||
| ## Rationale |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to say somewhere that we are not trying to provide any identity information about which channel is being created. The configuration provided by this gRFC is required to be uniform across child channels of a particular channel.
No description provided.