-
Notifications
You must be signed in to change notification settings - Fork 4.1k
[Feature] Support Idle Callback in TaskGroup for flexibility usage #3168
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds support for idle callbacks in TaskGroup to enable flexible thread-per-core I/O management. The feature allows users to register a callback that executes custom logic (e.g., io_uring polling) when worker threads are idle, enabling efficient async I/O without cross-thread signaling.
Key Changes
- Added
SetWorkerIdleCallbackAPI to register user-defined callbacks that run when workers have no tasks - Modified
ParkingLot::waitto support optional timeout for periodic wake-ups during idle periods - Implemented idle callback invocation in
TaskGroup::wait_taskwith configurable timeout-based polling
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| test/bthread_idle_unittest.cpp | Adds unit test verifying worker thread isolation and idle callback execution with thread-local contexts |
| src/bthread/task_group.h | Declares idle callback API with function pointer typedef, static members for callback state, and timeout parameter |
| src/bthread/task_group.cpp | Implements idle callback registration and invocation logic with timeout-based polling in wait loops |
| src/bthread/parking_lot.h | Extends wait method signature to accept optional timeout parameter for conditional waiting |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
52da249 to
767005f
Compare
src/bthread/task_group.h
Outdated
| // should check the runqueue again immediately. | ||
| // |timeout_us|: The timeout for waiting if the callback returns false. | ||
| // 0 is not acceptable. | ||
| static bool SetWorkerIdleCallback(OnWorkerIdleFn fn, void* user_ctx, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The user interface of bthread should be placed in the src/bthread/bthread.h or src/bthread/unstable.h. And it should be C style, like bthread_set_worker_idle_callback(xxx).
See bthread_set_worker_startfn for reference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
refine the code by using brpc's design pattern, providing a bthread_set_xxx function for user.
src/bthread/task_group.cpp
Outdated
| return false; | ||
| } | ||
|
|
||
| std::call_once(g_worker_idle_once, [fn, user_ctx, timeout_us]() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is over-designed. These config only initialize on program startup, you don't need to use std::call_once or atomic variables. See bthread_set_worker_startfn for reference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
refine the code by using brpc's design pattern, providing a bthread_set_xxx function for user.
|
If the request is submitted to the local io_uring function in pthread1, this bthread may be scheduled to pthread2 later. In this case, pthread1 still needs to reap the corresponding CQE and then notify it. |
I am not sure if I understand your comment correctly.
I assume the I am new to brpc, so correct me if i understand incorreclty, thanks. BTW, the idle function here is only useful when there's only a few requests, to make sure we can reap the |
- Implement Idle Hook: Added TaskGroup::SetWorkerIdleCallback to allow executing custom logic (e.g., IO polling) when a worker thread is idle. - Support Timeout Wait: Modified ParkingLot::wait to support an optional timeout, preventing workers from sleeping indefinitely when an idle callback is registered. - Enable Thread-per-Core IO: Enabled thread-local IO management (like io_uring ) by invoking the hook within the worker's thread context. - Add Unit Test: Added bthread_idle_unittest to verify worker isolation and idle callback execution.
767005f to
cac054a
Compare
| return true; | ||
| } | ||
| _pl->wait(st); | ||
| // Instead of waiting for signal, we shall wake up if there's a user idle task here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code is repeated twice. Maybe wrap it into a helper function.
|
I believe you're trying to create a run-to-complete model. In a bRPC scenario, the easiest ways to implement this model seem to be:
The network uses an event-triggered approach, while the storage uses a polling approach, which seems to mismatch the models. |
I disagree with the notion that asynchronous requests in storage during iouring require a one-loop (thread)-per-core concept. Because bthreads lack scheduler pause points, we can only simulate async operations across other threads, leading to greater overhead. essentially, we are trying to make bthread worker CPU resource utilization more efficient under async io |
|
What I mean is that in the io_uring scenario, if polling mode is not applicable, eventfd can be used to register io_uring events with epoll. I think the efficiency problem of the bthread scheduling model is a common issue unrelated to io_uring. |
The reason I added an idle function here is, we want to have a chance to run some user-defined code during the idle time of the task group's pthread worker. Reaping some iouring cqe is one of its use cases, we can also use this mechanism for other purposes:
|
|
Yes, I know you want a mechanism to harvest asynchronous responses, and this PR #2560 is actually designed to support this scenario. It doesn't require modifying the bthread scheduling strategy. |
What problem does this PR solve?
The main reason that we need this is that:
lastone, then we will rely on the Idle Callback in the task group to wake it up.Issue Number: none
Problem Summary:
What is changed and the side effects?
Changed:
timeoutparam towait()function, with default NULL value, which will not break current implementation.Side effects:
Performance effects:
No
Breaking backward compatibility:
NO
Check List: