Add built-in structured retry + rate-limit handling for onchain actions (RPC + CDP) #891

@shadowdames

Description

Language Implementation

  • Python
  • TypeScript

Feature Type

  • Action Provider Template
  • Wallet Provider Template
  • Framework Extension
  • Core Requirements
  • Other

🚀 The feature, motivation and pitch

Overview

AgentKit agents often perform onchain actions (reads, writes, balance checks, swaps) through RPC endpoints and/or CDP-backed services. In real deployments, these calls frequently hit transient failures (timeouts, 5xx), rate limits (429), or provider-specific throttling. Today, each agent implementation typically re-invents retry logic, backoff, and error categorization.

Expected Behavior

  • Transient errors are retried automatically with safe defaults.
  • 429 responses respect Retry-After (when provided) and apply backoff.
  • Non-retryable errors fail fast with clear error types.
  • Retry behavior is configurable globally and per action provider.
  • Developers can opt out completely (e.g., retries: 0) if desired.
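The delay behavior described above could look roughly like the following. This is a minimal sketch, not an existing AgentKit API: `RetryPolicy`, `parseRetryAfterMs`, and `computeDelayMs` are hypothetical names, and capping a `Retry-After` hint at `maxDelayMs` is an assumption rather than something this issue specifies.

```typescript
// Hypothetical policy shape; field names are illustrative only.
interface RetryPolicy {
  retries: number;            // maximum retry attempts; 0 disables retries
  baseDelayMs: number;        // backoff ceiling for the first retry
  maxDelayMs: number;         // upper bound on any single delay
  respectRetryAfter: boolean; // honor Retry-After when the server sends it
}

// Parse a Retry-After value, which is either delta-seconds or an HTTP-date.
// Returns the wait in milliseconds, or undefined if the value is unusable.
function parseRetryAfterMs(
  value: string | undefined,
  nowMs: number = Date.now()
): number | undefined {
  if (value === undefined) return undefined;
  const trimmed = value.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return undefined;
  return Math.max(0, dateMs - nowMs);
}

// Exponential backoff with "full jitter": choose uniformly in
// [0, min(maxDelayMs, baseDelayMs * 2^attempt)). `attempt` is 0-based.
function computeDelayMs(
  attempt: number,
  policy: RetryPolicy,
  retryAfter?: string,
  random: () => number = Math.random,
  nowMs: number = Date.now()
): number {
  if (policy.respectRetryAfter) {
    const hinted = parseRetryAfterMs(retryAfter, nowMs);
    // Assumption: the server hint is still capped by maxDelayMs.
    if (hinted !== undefined) return Math.min(hinted, policy.maxDelayMs);
  }
  const ceiling = Math.min(
    policy.maxDelayMs,
    policy.baseDelayMs * Math.pow(2, attempt)
  );
  return Math.floor(random() * ceiling);
}
```

Injecting `random` and `nowMs` keeps the function deterministic in tests. Whether a `Retry-After` hint should be capped or allowed to exceed `maxDelayMs` is a design question the maintainers would need to settle.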

Steps to reproduce

  1. Build an agent that repeatedly queries balances and submits a transaction.
  2. Run it against an RPC endpoint with throttling (or under heavy load).
  3. Observe that intermittent timeouts or 429 responses cause the agent to fail and abandon the task.
  4. Add custom retries in userland and notice duplication across actions.

Alternatives

Proposed Solution

  • Introduce a shared resilience layer used by all AgentKit network-bound actions (RPC, CDP, and future providers).
  • Implement configurable retry policies:
    • retry count
    • backoff strategy (exponential with jitter)
    • maximum delay
    • whether to honor Retry-After headers (optional)
  • Classify errors into retryable vs non-retryable (e.g., timeouts/5xx/429 vs validation/config errors).
  • Expose typed errors such as RateLimitedError, TransientNetworkError, and NonRetryableError.
  • Allow configuration at two levels:
    • global defaults during AgentKit initialization
    • per-provider overrides where needed
  • Add optional instrumentation hooks (e.g., onRetry) for logging and telemetry.
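To make the proposal concrete, here is one way the pieces could fit together. It is a sketch under stated assumptions, not a definitive implementation. The error class names come from this issue, but `RetryOptions`, `classifyStatus`, `resolveOptions`, `withRetry`, and the injectable `sleep` are hypothetical names chosen for illustration.

```typescript
// Typed errors named in the proposal; the class bodies are illustrative.
class NonRetryableError extends Error {}
class TransientNetworkError extends Error {}
class RateLimitedError extends Error {
  retryAfterMs?: number;
  constructor(message: string, retryAfterMs?: number) {
    super(message);
    this.retryAfterMs = retryAfterMs;
  }
}

// Hypothetical options shape covering the knobs listed above.
interface RetryOptions {
  retries: number;
  baseDelayMs: number;
  maxDelayMs: number;
  respectRetryAfter: boolean;
  onRetry?: (info: { attempt: number; delayMs: number; error: Error }) => void;
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

// Map an HTTP status to a typed error: 429 and 5xx are retryable, the rest are not.
function classifyStatus(status: number, retryAfterMs?: number): Error {
  if (status === 429) return new RateLimitedError("rate limited", retryAfterMs);
  if (status >= 500 && status <= 599) {
    return new TransientNetworkError("server error " + status);
  }
  return new NonRetryableError("request failed with status " + status);
}

// Per-provider overrides are layered on top of the global defaults.
function resolveOptions(
  globalDefaults: RetryOptions,
  override: Partial<RetryOptions> = {}
): RetryOptions {
  return { ...globalDefaults, ...override };
}

async function withRetry<T>(
  action: () => Promise<T>,
  options: RetryOptions
): Promise<T> {
  const sleep =
    options.sleep ??
    ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 0; ; attempt++) {
    try {
      return await action();
    } catch (err) {
      const error = err instanceof Error ? err : new Error(String(err));
      const retryable =
        error instanceof RateLimitedError ||
        error instanceof TransientNetworkError;
      // Unclassified and non-retryable errors fail fast, as does an exhausted budget.
      if (!retryable || attempt >= options.retries) throw error;
      const ceiling = Math.min(
        options.maxDelayMs,
        options.baseDelayMs * Math.pow(2, attempt)
      );
      let delayMs = Math.floor(Math.random() * ceiling); // full jitter
      if (
        options.respectRetryAfter &&
        error instanceof RateLimitedError &&
        error.retryAfterMs !== undefined
      ) {
        delayMs = Math.min(error.retryAfterMs, options.maxDelayMs);
      }
      options.onRetry?.({ attempt: attempt + 1, delayMs, error });
      await sleep(delayMs);
    }
  }
}
```

Under these assumptions, a provider that must never retry (for example, one submitting non-idempotent transactions) could call `resolveOptions(globalDefaults, { retries: 0 })`, while read-only providers keep the global defaults.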

Additional Information

  • Reduces duplicated retry logic across agent implementations.
  • Improves reliability for long-running or autonomous agents in production.
  • Aligns AgentKit behavior with established practices for resilient API and RPC clients.
  • Provides a foundation for future observability and metrics integrations.
