@drmingdrmer drmingdrmer commented Dec 10, 2025

Changelog

docs: update getting-started guide to use RaftNetworkV2 as primary trait

Modernize the getting-started documentation to focus on RaftNetworkV2
as the recommended network trait. Add documentation for the optional
stream_append() method for pipelined replication.

refactor: use stream_append for linearizable read confirmation

Replace append_entries with stream_append in linearizable read
confirmation so applications only need to implement stream_append.

refactor: use stream_append for heartbeat instead of append_entries

Replace append_entries with stream_append in HeartbeatWorker so
applications only need to implement stream_append for replication.

docs: gRPC example: implement gRPC bidirectional streaming for pipeline replication

Replace chunked append_entries with native gRPC bidirectional streaming
via stream_append. This provides more efficient pipelined log replication.

Changes:

  • Add StreamAppend RPC with bidirectional streaming to proto
  • Implement stream_append server handler in raft_service.rs
  • Implement stream_append client in network/mod.rs
  • Remove chunked append_entries fallback logic
  • Change RaftNetworkV2::stream_append to accept 'static stream
  • Update README to reflect streaming pipeline approach
  • Delete obsolete test_chunk.rs

feat: add pipeline mode for streaming replication

Replication to a follower has two phases:

  1. Binary search phase: The leader runs a binary search to find the exact
    matching position of log entries on the follower.

  2. Pipeline mode: After finding the match point, the leader calls the
    stream_append method on the network and continuously generates
    AppendEntries requests. The network implementation should pipeline all
    requests to the follower and yield responses. Responses need not map
    one-to-one to requests: there may be fewer responses than requests.
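
As a rough illustration of the binary search phase, the sketch below finds the highest matching index in a range. The function name `find_match_point` and the `matches` probe are hypothetical stand-ins for one probing AppendEntries round trip; they are not taken from the code in this PR, and the probe is assumed to be monotone (if an index matches, every lower index matches too).

```rust
/// Hypothetical helper: find the highest log index in `lo..=hi` at which
/// the follower's log matches the leader's.
///
/// `matches(i)` stands in for one probing round trip. It is assumed to be
/// monotone, and `lo` is assumed to be a known matching index.
fn find_match_point(mut lo: u64, mut hi: u64, matches: impl Fn(u64) -> bool) -> u64 {
    while lo < hi {
        // Round up so the loop always makes progress when hi == lo + 1.
        let mid = lo + (hi - lo + 1) / 2;
        if matches(mid) {
            // The match point is at `mid` or later.
            lo = mid;
        } else {
            // Conflict at `mid`: the match point is strictly earlier.
            hi = mid - 1;
        }
    }
    lo
}
```

Each probe halves the candidate range, so the number of round trips grows with the logarithm of the range size rather than with the number of entries.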

stream_append provides a default implementation that calls the existing
append_entries method to emulate streaming replication. A mature
implementation should run in true pipeline mode rather than in a
request-response manner. When stream_append receives a request, it is
responsible for sending the entire content of the request to the follower;
partial success is not allowed.
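
The emulation idea can be sketched in a simplified, synchronous form. Everything below is an assumption made for illustration: the `Network` trait, the `AppendRequest` and `AppendResponse` types, and `FakeFollower` are stand-ins, and the real `RaftNetworkV2::stream_append` is asynchronous and works on streams rather than iterators and `Vec`.

```rust
/// Simplified stand-ins for the real request and response types.
#[derive(Debug, Clone, PartialEq)]
struct AppendRequest {
    last_index: u64,
}

#[derive(Debug, Clone, PartialEq)]
enum AppendResponse {
    Success { matched: u64 },
    Conflict,
}

/// Hypothetical, synchronous network trait (not the openraft API).
trait Network {
    /// One request, one response.
    fn append_entries(&mut self, req: AppendRequest) -> Result<AppendResponse, String>;

    /// Default emulation: send each request in turn, wait for its reply,
    /// and stop at the first error or non-success response.
    fn stream_append<I>(&mut self, requests: I) -> Vec<Result<AppendResponse, String>>
    where
        I: IntoIterator<Item = AppendRequest>,
    {
        let mut responses = Vec::new();
        for req in requests {
            let res = self.append_entries(req);
            let stop = !matches!(res, Ok(AppendResponse::Success { .. }));
            responses.push(res);
            if stop {
                break;
            }
        }
        responses
    }
}

/// A fake follower that accepts entries up to `limit`, for demonstration.
struct FakeFollower {
    limit: u64,
    calls: usize,
}

impl Network for FakeFollower {
    fn append_entries(&mut self, req: AppendRequest) -> Result<AppendResponse, String> {
        self.calls += 1;
        if req.last_index <= self.limit {
            Ok(AppendResponse::Success { matched: req.last_index })
        } else {
            Ok(AppendResponse::Conflict)
        }
    }
}
```

This version pays one round trip per request, which is the cost a true pipelined implementation avoids by keeping several requests in flight at once.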

On the leader, the Inflight structure in Engine tracks the inflight
replication session run by ReplicationCore. An InflightId identifies
each inflight session, and the Inflight structure ignores any response that
doesn't match the current InflightId.
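
A minimal sketch of the stale-response filtering described above. The names `Inflight` and `InflightId` come from this PR, but the fields and the `start`/`ack` methods are assumptions made for illustration and do not reflect the actual implementation.

```rust
/// Identifies one replication session (field layout is an assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct InflightId(u64);

/// Simplified tracker for the current inflight session.
#[derive(Debug, Default)]
struct Inflight {
    current: Option<InflightId>,
    matched: u64,
    next_id: u64,
}

impl Inflight {
    /// Start a new session and return its id; any earlier session
    /// becomes stale from this point on.
    fn start(&mut self) -> InflightId {
        self.next_id += 1;
        let id = InflightId(self.next_id);
        self.current = Some(id);
        id
    }

    /// Apply a response. Returns `false` and changes nothing if the
    /// response belongs to a session other than the current one.
    fn ack(&mut self, id: InflightId, matched: u64) -> bool {
        if self.current != Some(id) {
            return false;
        }
        self.matched = self.matched.max(matched);
        true
    }
}
```

Tagging every response with the session id means a late reply from an abandoned session cannot move the progress of the session that replaced it.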

This change also reorganizes replication data structures for clarity:

  • Add Payload enum for log replication specifications (LogIdRange, LogsSince)
  • Add Replicate struct combining inflight_id with Payload
  • Add ReplicationProgress to track local committed and remote matched state
  • Simplify drain_events() to set fields directly instead of returning values
  • Remove obsolete request.rs, replication_state.rs, log_state.rs
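
To make the shape of these structures concrete, here is a hedged sketch. Only the names `Payload`, `LogIdRange`, `LogsSince`, `Replicate`, and `inflight_id` come from the list above; the field types, the plain `u64` indexes, and the `contains` helper are assumptions, and the real definitions may differ.

```rust
/// Hypothetical log index type; the real code likely uses richer log ids.
type LogIndex = u64;

/// What to replicate: a bounded range, or everything after a point.
#[derive(Debug, Clone, PartialEq)]
enum Payload {
    /// Entries in `(prev, last]`, with a fixed upper bound.
    LogIdRange { prev: LogIndex, last: LogIndex },
    /// Entries after `prev`, with no upper bound.
    LogsSince { prev: LogIndex },
}

/// A replication command: a payload plus the session it belongs to.
#[derive(Debug, Clone, PartialEq)]
struct Replicate {
    inflight_id: u64,
    payload: Payload,
}

impl Payload {
    /// True if the entry at `index` falls inside this payload.
    fn contains(&self, index: LogIndex) -> bool {
        match *self {
            Payload::LogIdRange { prev, last } => prev < index && index <= last,
            Payload::LogsSince { prev } => prev < index,
        }
    }
}
```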

Changes:

  • Add pipeline mode entry in ProgressEntry::next_send() when fully caught up
  • Add Inflight::LogsSince variant for unbounded log streaming
  • Add is_logs_since() method to Inflight for type checking
  • Add get_partial_success() method to AppendEntriesResponse
  • Refactor drain_events() to set next_action and inflight_id directly
  • Add unit tests for pipeline mode in progress/entry/tests.rs

feat: add streaming replication with I/O progress synchronization

Enable replication tasks to synchronize with leader I/O progress using
watch channels. The replication stream monitors io_accepted_rx and
io_submitted_rx to detect leader changes and wait for log availability.

feat: add watch channel for I/O acceptance notification

Add io_accepted_tx watch channel to notify observers before I/O operations
are submitted to storage. This enables preparation for upcoming I/O events
before they actually happen.

Changes:

  • Add io_accepted_tx watch channel to RaftCore
  • Broadcast I/O acceptance before UpdateIOProgress, AppendEntries, and SaveVote

test: add assertion for watch channel changed() behavior

Add test assertion to verify that after changed() returns ready, a
subsequent call returns pending because the value was already seen.

Changes:

  • Add assertion for changed() returning pending after value is marked as seen
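
The "ready once, then pending" behavior can be modeled with a version counter. This is a std-only toy, not the watch channel used by the project: `channel`, `Sender`, `Receiver`, and `poll_changed` are hypothetical names, and `poll_changed` returns `Some` where an async `changed()` would be ready and `None` where it would be pending.

```rust
use std::sync::{Arc, Mutex};

/// Value plus a version that is bumped on every send.
struct Shared<T> {
    value: T,
    version: u64,
}

struct Sender<T> {
    shared: Arc<Mutex<Shared<T>>>,
}

struct Receiver<T> {
    shared: Arc<Mutex<Shared<T>>>,
    /// Version this receiver has already observed.
    seen: u64,
}

/// Create a toy watch channel; the initial value counts as already seen.
fn channel<T>(initial: T) -> (Sender<T>, Receiver<T>) {
    let shared = Arc::new(Mutex::new(Shared { value: initial, version: 0 }));
    (Sender { shared: Arc::clone(&shared) }, Receiver { shared, seen: 0 })
}

impl<T> Sender<T> {
    /// Replace the value and mark it as unseen for all receivers.
    fn send(&self, value: T) {
        let mut s = self.shared.lock().unwrap();
        s.value = value;
        s.version += 1;
    }
}

impl<T: Clone> Receiver<T> {
    /// `Some(value)` if there is an unseen value (marking it as seen),
    /// `None` if the latest value was already observed.
    fn poll_changed(&mut self) -> Option<T> {
        let s = self.shared.lock().unwrap();
        if s.version == self.seen {
            return None;
        }
        self.seen = s.version;
        Some(s.value.clone())
    }
}
```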

feat: add watch channel for I/O submission progress broadcast

Add io_submitted_tx watch channel to notify replication tasks when log
entries have been submitted to storage and are safe to read. This enables
replication tasks to coordinate with I/O progress without polling.

Changes:

  • Add io_submitted_tx watch channel to RaftCore
  • Broadcast I/O submission progress after AppendEntries, SaveVote, and UpdateIOProgress
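
Waiting for log availability without polling can be sketched with a blocking stand-in. The type `SubmittedProgress` and its methods are hypothetical, and a `Mutex` plus `Condvar` replaces the async watch channel purely to keep the example free of external crates.

```rust
use std::sync::{Condvar, Mutex};

/// Std-only stand-in for a watch channel carrying the submitted log index.
struct SubmittedProgress {
    index: Mutex<u64>,
    changed: Condvar,
}

impl SubmittedProgress {
    fn new() -> Self {
        Self { index: Mutex::new(0), changed: Condvar::new() }
    }

    /// Called by the writer after entries up to `index` are submitted.
    /// Progress only moves forward; older values are ignored.
    fn publish(&self, index: u64) {
        let mut current = self.index.lock().unwrap();
        if index > *current {
            *current = index;
            self.changed.notify_all();
        }
    }

    /// Block until entries up to `wanted` are submitted, then return the
    /// submitted index that was observed.
    fn wait_for(&self, wanted: u64) -> u64 {
        let mut current = self.index.lock().unwrap();
        while *current < wanted {
            current = self.changed.wait(current).unwrap();
        }
        *current
    }
}
```

A replication task would call `wait_for` with the index it wants to read next and sleep until the writer publishes enough progress, instead of re-checking storage in a loop.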

chore: add doc

feat: add streaming replication support with LogsSince variant

Add infrastructure for open-ended streaming replication where the leader
continuously sends logs after a given point without a fixed upper bound.
This complements the existing fixed-range Logs replication mode.

chore: remove unused inflight-id

  • Improvement
  • Build/Testing/CI


@drmingdrmer drmingdrmer changed the title from "docs: update getting-started guide to use RaftNetworkV2 as primary trait" to "feat: add pipeline mode for streaming replication" on Dec 10, 2025
@drmingdrmer drmingdrmer force-pushed the 285-pipeline branch 6 times, most recently from 7feca32 to 829ee8a on December 14, 2025 09:07

Collection of small improvements including documentation, logging,
configuration tuning, and test robustness fixes.

Changes:
- Reduce default network backoff from 500ms to 200ms for faster retries
- Add doc comments for `IOProgress` fields (`accepted`, `submitted`, `flushed`)
- Add debug logging to client-http example
- Add test assertion for watch channel `changed()` pending behavior
- Fix metrics test to handle missing heartbeat entries gracefully

Verify that a LogReader obtained before writing new entries can still
read entries written after it was created. This ensures LogReader
implementations don't cache or snapshot data in a way that makes newly
written entries invisible.

When rebuilding replication streams after a membership change, reuse
existing streams instead of destroying all and recreating. This avoids
unnecessary stream teardown and maintains in-flight replication state.

Changes:
- Reuse existing replication streams when targets remain in new membership
- Only spawn new replication for newly added targets
- Properly join and cleanup removed replication streams
- Handle missing progress entries gracefully in `update_matching()`
- Add debug logging for membership change operations
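
The reuse logic can be sketched as a reconciliation between the existing streams and the new target set. The `Stream` struct, its `session` counter, and `rebuild_streams` are hypothetical; in the real code a stream is a running replication task that must be joined on removal.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Stand-in for a running replication task to one target node.
#[derive(Debug, PartialEq)]
struct Stream {
    target: u64,
    /// Distinguishes a freshly spawned stream from a reused one.
    session: u64,
}

/// Reconcile `existing` streams with the new `targets`.
///
/// Returns the streams to keep running (reused or newly spawned) and the
/// removed streams, which the caller is expected to shut down and join.
fn rebuild_streams(
    existing: BTreeMap<u64, Stream>,
    targets: &BTreeSet<u64>,
    next_session: &mut u64,
) -> (BTreeMap<u64, Stream>, Vec<Stream>) {
    let mut kept = BTreeMap::new();
    let mut removed = Vec::new();

    // Keep streams whose target is still a member; collect the rest.
    for (id, stream) in existing {
        if targets.contains(&id) {
            kept.insert(id, stream);
        } else {
            removed.push(stream);
        }
    }

    // Spawn a stream only for targets that do not have one yet.
    for &id in targets {
        kept.entry(id).or_insert_with(|| {
            *next_session += 1;
            Stream { target: id, session: *next_session }
        });
    }

    (kept, removed)
}
```

A target present in both the old and new membership keeps its original stream, so its in-flight state survives the change.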