Retry middleware improvements #682

@LucioFranco

This issue scopes out the changes we are proposing to the retry middleware to improve its ergonomics. Currently, the retry middleware is quite hard to use and requires implementing a custom Policy. Writing this policy is non-trivial and is more work than it should be.

These improvements aim to make setting up retries with tower easier and more user-friendly, and to provide good defaults that work out of the box.

List of improvements to tower and tower-http:

  • Simplify Policy (retry: Change Policy to accept &mut self #681).
    • Change trait fn to take &mut self.
    • Change Future output to ().
    • Allow Policy to be object safe.
  • Add generic backoff utilities (retry: Add generic backoff utilities #685).
    • Add a Backoff trait with an associated future type, so that others can use this utility without being tied to tokio::time.
    • Add an ExponentialBackoff type that implements Backoff, ported from linkerd2-proxy.
    • Add Rng utilities (util: Add rng utilities #686).
  • Budget improvements.
  • Add a new batteries-included standard retry policy (retry: Add StandardRetryPolicy and standard_policy mod #698).
    • New StandardRetryPolicy combining impl Backoff and impl Budget.
    • Add StandardRetryPolicyBuilder that accepts closures (?) for is_retryable(&mut Req, &mut Result<Res, E>) -> bool and clone_request(&Req) -> Option<Req>.
  • tower-http improvements.
    • Add new retry module
    • Implement ReplayBody similar to the one implemented in linkerd2-proxy.
    • Add new HttpRetry layer that accepts higher level constructs for retrying, like ClassifyResponse, and will wrap http request bodies with ReplayBody.
  • Documentation
    • Blog post on how to set up retries with tower and tower-http.
    • Examples for thick clients with retries in both tower and tower-http.
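
To make the "Simplify Policy" items above more concrete, here is a rough, std-only sketch of what a trait with `&mut self` methods and a `Future` output of `()` could look like. The method names and signatures follow the closure signatures listed above; the `Attempts` type and its `new` constructor are hypothetical and exist only for illustration. This is not the final API.

```rust
use std::future::{ready, Future, Ready};

/// Sketch of the simplified trait: both methods take `&mut self`,
/// and the future only signals "ready to retry" by resolving to `()`.
pub trait Policy<Req, Res, E> {
    type Future: Future<Output = ()>;

    /// Returns `Some(future)` if the request should be retried once the
    /// future resolves, or `None` to hand the result back to the caller.
    fn retry(&mut self, req: &mut Req, result: &mut Result<Res, E>) -> Option<Self::Future>;

    /// Returns a copy of the request to replay, or `None` if it cannot be cloned.
    fn clone_request(&mut self, req: &Req) -> Option<Req>;
}

/// Hypothetical policy: retry errors until a fixed number of attempts is used up.
pub struct Attempts {
    remaining: usize,
}

impl Attempts {
    pub fn new(remaining: usize) -> Self {
        Attempts { remaining }
    }
}

impl<Req: Clone, Res, E> Policy<Req, Res, E> for Attempts {
    // No waiting between attempts in this sketch, so the future is already ready.
    type Future = Ready<()>;

    fn retry(&mut self, _req: &mut Req, result: &mut Result<Res, E>) -> Option<Self::Future> {
        match result {
            Ok(_) => None,
            Err(_) if self.remaining > 0 => {
                // With `&mut self` the policy updates its own state in place
                // instead of returning a new policy value from the future.
                self.remaining -= 1;
                Some(ready(()))
            }
            Err(_) => None,
        }
    }

    fn clone_request(&mut self, req: &Req) -> Option<Req> {
        Some(req.clone())
    }
}
```

Because the state lives in `self`, a policy like `Attempts::new(3)` needs no extra bookkeeping types; a backoff-aware policy would return a sleeping future instead of `Ready<()>`.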

Example code

tower examples with no http:

let policy = StandardRetryPolicy::builder()
    .should_retry(|res| match res {
        Ok(_) => false,
        Err(_) => true,
    })
    .clone_request(|r| Some(r.clone()))
    .build();

let mut svc = ServiceBuilder::new()
    .retry(policy)
    .buffer(10)
    .timeout(Duration::from_secs(10))
    .service(svc);
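
For readers unfamiliar with how the two closures interact, the following is a simplified, synchronous sketch of the control flow a retry layer might follow. The `call_with_retry` function and its `max_retries` parameter are hypothetical and not part of tower; the real middleware is asynchronous and would also consult a backoff and a budget.

```rust
/// Hypothetical synchronous stand-in for the retry middleware.
pub fn call_with_retry<Req, Res, E>(
    mut call: impl FnMut(Req) -> Result<Res, E>,
    should_retry: impl Fn(&Result<Res, E>) -> bool,
    clone_request: impl Fn(&Req) -> Option<Req>,
    mut req: Req,
    max_retries: usize,
) -> Result<Res, E> {
    let mut retries_left = max_retries;
    loop {
        // Save a copy before the request is consumed by the call.
        // `None` means the request cannot be replayed, so no retry is possible.
        let saved = clone_request(&req);
        let result = call(req);
        match saved {
            Some(next) if retries_left > 0 && should_retry(&result) => {
                retries_left -= 1;
                req = next;
            }
            // Not retryable, not cloneable, or out of retries: return as-is.
            _ => return result,
        }
    }
}
```

Note that the request is cloned before every attempt, which is why `clone_request` returning `None` disables retries for that request entirely.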

tower-http examples:

let make_classifier = ServerErrorsAsFailures::make_classifier();

let mut svc = ServiceBuilder::new()
    .set_request_id("Request-Id".try_into().unwrap(), MakeRequestUuid)
    .retry(StandardHttpPolicy::new(
        make_classifier,
        ExponentialBackoff::default(),
        Budget::default(),
        |e| match e {
            ServerErrorsFailureClass::StatusCode(_) => true,
            ServerErrorsFailureClass::Error(_) => false,
        },
    ))
    .timeout(Duration::from_secs(5))
    .service(client);
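
The example above passes `ExponentialBackoff::default()` without showing what it computes. As a rough illustration of the idea only, here is a std-only sketch that doubles the delay on every attempt and caps it at a maximum. The `DoublingBackoff` type and its methods are hypothetical; the linkerd2-proxy implementation mentioned in the list also applies jitter, which is omitted here.

```rust
use std::time::Duration;

/// Hypothetical backoff state: min * 2^attempt, capped at `max`.
pub struct DoublingBackoff {
    min: Duration,
    max: Duration,
    attempt: u32,
}

impl DoublingBackoff {
    pub fn new(min: Duration, max: Duration) -> Self {
        DoublingBackoff { min, max, attempt: 0 }
    }

    /// Returns the delay to wait before the next retry and advances the state.
    pub fn next_delay(&mut self) -> Duration {
        // Saturate rather than overflow for large attempt counts.
        let factor = 2u32.checked_pow(self.attempt).unwrap_or(u32::MAX);
        let delay = self
            .min
            .checked_mul(factor)
            .unwrap_or(self.max)
            .min(self.max);
        self.attempt = self.attempt.saturating_add(1);
        delay
    }
}
```

A `Backoff` implementation along the lines proposed in this issue would wrap a delay like this in a future (for example a timer), so that the retry policy can await it without depending on a specific runtime.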

cc @rcoh @hawkw @jonhoo @seanmonstar @davidpdrsn

Labels

  • A-retry (Area: The tower "retry" middleware)
  • C-enhancement (Category: A PR with an enhancement or a proposed one in an issue.)
  • E-help-wanted (Call for participation: Help is requested to fix this issue.)
  • E-medium (Call for participation: Experience needed to fix: Medium / intermediate)
  • P-high (High priority)
