impl(pubsub): message ordering in Publisher #4124
Conversation
Codecov Report: ✅ All modified and coverable lines are covered by tests.

```
@@ Coverage Diff @@
##             main   #4124      +/-   ##
=========================================
- Coverage   95.25%  95.24%   -0.02%
=========================================
  Files         178     178
  Lines        6811    6808       -3
=========================================
- Hits         6488    6484       -4
- Misses        323     324       +1
```
|
suzmue left a comment
As far as ordering, this looks pretty good. However, I think this removed support for non-ordered messages where we can have multiple inflight batches.
Please let me know if I'm mistaken.
src/pubsub/src/publisher/worker.rs (Outdated)

```rust
let (tx, rx) = oneshot::channel();
batch_worker
    .send(ToBatchWorker::Flush(tx))
    .expect("Batch worker should not close the channel");
```
nit: Consider using a helper message or helper function to avoid the duplication of this error message all over.
Moved to a helper message for now. It's likely to change when we implement the batch worker cleanup PR.
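A minimal sketch of what such a helper could look like. The names `CHANNEL_CLOSED_MSG` and `send_to_batch_worker` are hypothetical, and `std::sync::mpsc` stands in for the tokio channels used in the PR; the point is only that the expect message lives in one place.

```rust
use std::sync::mpsc;

// Hypothetical shared message; centralizes the duplicated expect text.
const CHANNEL_CLOSED_MSG: &str = "Batch worker should not close the channel";

// Simplified stand-in for the PR's ToBatchWorker enum.
enum ToBatchWorker {
    Flush(mpsc::Sender<()>),
}

// Hypothetical helper so call sites do not repeat the error message.
fn send_to_batch_worker(tx: &mpsc::Sender<ToBatchWorker>, msg: ToBatchWorker) {
    tx.send(msg).expect(CHANNEL_CLOSED_MSG);
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let (ack_tx, ack_rx) = mpsc::channel();
    send_to_batch_worker(&tx, ToBatchWorker::Flush(ack_tx));
    // Simulate the batch worker acknowledging the flush.
    if let ToBatchWorker::Flush(done) = rx.recv().expect(CHANNEL_CLOSED_MSG) {
        done.send(()).unwrap();
    }
    assert!(ack_rx.recv().is_ok());
    println!("flush acknowledged");
}
```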
```rust
/// `Worker` drops the Sender.
pub(crate) async fn run(mut self) {
    // While it is possible to use Some(JoinHandle) here as there is at max
    // a single inflight task at any given time, the use of JoinSet
```
This isn't true when the ordering key is "".
I added a TODO to handle the empty ordering key special case.
```rust
    break;
}
self.move_to_batch();
self.pending_batch.flush(self.client.clone(), self.topic.clone(), &mut inflight);
```
Using flush for the timer seems slightly inefficient: if the max batch size is 1000, one batch is already out, and 1001 messages are in the queue, we will send that single extra message as a batch of 1 even if more come in. That is a possibility with the timeout anyway, so maybe it's not a real problem and doesn't necessarily need to change.
This also makes me wonder: while there is an outstanding batch, would it be better to just leave all messages in the channel, to avoid copying things around as much? I'm just curious about this. Something to maybe try out later.
Noted. Let's discuss during our meeting tomorrow. It largely depends on how strictly we want to respect the user-configured batch size.
Note: in the worst case, we still need to split into multiple batches if the configured batch size is greater than the backend limit.
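A hedged sketch of that worst case: whatever batch size the user configures, the effective size is capped by an assumed backend limit, so a large request still splits into multiple batches. `split_into_batches` and both limits are illustrative, not the PR's actual API.

```rust
// Hypothetical helper: cap the user-configured batch size by an assumed
// backend limit, then chunk the queued messages accordingly.
fn split_into_batches(msgs: &[u32], user_max: usize, backend_max: usize) -> Vec<&[u32]> {
    let size = user_max.min(backend_max).max(1);
    msgs.chunks(size).collect()
}

fn main() {
    let msgs: Vec<u32> = (0..5).collect();
    // User asks for batches of 1000, but an assumed backend limit of 2
    // still forces three batches for five messages.
    let batches = split_into_batches(&msgs, 1000, 2);
    assert_eq!(batches.len(), 3);
    assert_eq!(batches[0], &[0, 1]);
    println!("split into {} batches", batches.len());
}
```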
Co-authored-by: Suzy Mueller <suzmue@google.com>
@suzmue, thanks for the review.
```rust
// While it is possible to use Some(JoinHandle) here as there is at max
// a single inflight task at any given time, the use of JoinSet
// simplifies managing the inflight JoinHandle.
// TODO(#4012): There is no inflight restriction when the ordering key is "".
```
I am a little concerned about delaying this to a future PR since it is a performance regression, but since we are still in preview I think it's alright to submit this if it improves dev velocity.
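The inflight rule under discussion can be sketched as follows. This is a simplified, hypothetical model (`send_batches` is not the PR's API, and `std::thread` stands in for spawned async tasks): for a non-empty ordering key the worker joins the previous inflight batch before sending the next, while the empty key carries no inflight restriction, which is what the TODO defers.

```rust
use std::thread;

// Sketch: at most one inflight batch per non-empty ordering key; the empty
// key ("") may have many batches inflight concurrently.
fn send_batches(ordering_key: &str, batches: Vec<Vec<u8>>) -> usize {
    let mut inflight: Vec<thread::JoinHandle<()>> = Vec::new();
    let mut sent = 0;
    for batch in batches {
        if !ordering_key.is_empty() {
            // Ordered: wait for the previous batch before sending the next.
            for h in inflight.drain(..) {
                h.join().unwrap();
            }
        }
        sent += batch.len();
        inflight.push(thread::spawn(move || {
            // Simulate the RPC that sends `batch`.
            drop(batch);
        }));
    }
    for h in inflight {
        h.join().unwrap();
    }
    sent
}

fn main() {
    assert_eq!(send_batches("key", vec![vec![1u8, 2], vec![3]]), 3);
    assert_eq!(send_batches("", vec![vec![1u8], vec![2]]), 2);
    println!("ok");
}
```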
Support message ordering in Publisher by creating a new BatchWorker which handles the batching and message ordering logic.
When the Worker receives a Publish operation, it demultiplexes it to the corresponding BatchWorker, which ensures that there is at most one inflight batch at any given time.
When the Worker receives a Flush operation, it flushes all BatchWorker pending messages and waits until the operation is complete.
As of now, the set of BatchWorkers grows indefinitely. This will be addressed in the next PR.
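The demultiplexing described above could be sketched roughly as below. This is a synchronous toy model, not the PR's implementation: the real Worker is async and talks to per-key BatchWorker tasks over channels, but the shape of the map from ordering key to BatchWorker, including the unbounded growth called out as a limitation, is the same.

```rust
use std::collections::HashMap;

// Toy BatchWorker: just accumulates pending messages.
#[derive(Default)]
struct BatchWorker {
    pending: Vec<String>,
}

impl BatchWorker {
    fn publish(&mut self, msg: String) {
        self.pending.push(msg);
    }
    // Returns how many messages were flushed.
    fn flush(&mut self) -> usize {
        let n = self.pending.len();
        self.pending.clear();
        n
    }
}

#[derive(Default)]
struct Worker {
    // One BatchWorker per ordering key. Note this map only ever grows,
    // matching the limitation in the description above.
    batch_workers: HashMap<String, BatchWorker>,
}

impl Worker {
    // Publish: demultiplex to the BatchWorker for this ordering key.
    fn publish(&mut self, ordering_key: &str, msg: String) {
        self.batch_workers
            .entry(ordering_key.to_string())
            .or_default()
            .publish(msg);
    }
    // Flush: drain every BatchWorker's pending messages.
    fn flush_all(&mut self) -> usize {
        self.batch_workers.values_mut().map(BatchWorker::flush).sum()
    }
}

fn main() {
    let mut worker = Worker::default();
    worker.publish("k1", "a".into());
    worker.publish("k1", "b".into());
    worker.publish("k2", "c".into());
    assert_eq!(worker.flush_all(), 3);
    println!("flushed all batch workers");
}
```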