Description
Language Implementation
- Python
- TypeScript
Feature Type
- Action Provider Template
- Wallet Provider Template
- Framework Extension
- Core Requirements
- Other
🚀 The feature, motivation and pitch
Overview
AgentKit agents often perform onchain actions (reads, writes, balance checks, swaps) through RPC endpoints and/or CDP-backed services. In real deployments, these calls frequently hit transient failures (timeouts, 5xx), rate limits (429), or provider-specific throttling. Today, each agent implementation typically re-invents retry logic, backoff, and error categorization.
Expected Behavior
- Transient errors are retried automatically with safe defaults.
- 429 responses respect `Retry-After` (when provided) and apply backoff.
- Non-retryable errors fail fast with clear error types.
- Retry behavior is configurable globally and per action provider.
- Developers can opt out completely (e.g., `retries: 0`) if desired.
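The backoff and `Retry-After` behavior described above could be computed roughly as follows. This is a minimal sketch, not an existing AgentKit API: `BackoffPolicy`, `parseRetryAfterMs`, and `computeDelayMs` are hypothetical names, and capping the server hint at `maxDelayMs` is an assumption that the final design may handle differently.

```typescript
// Hypothetical policy shape; field names are illustrative, not an AgentKit API.
interface BackoffPolicy {
  baseDelayMs: number; // ceiling for the delay before the first retry
  maxDelayMs: number;  // upper bound on any single delay
}

// Parse a Retry-After value (delta-seconds or HTTP-date) into milliseconds.
// Returns undefined when the header is absent or unparseable.
function parseRetryAfterMs(
  value: string | undefined,
  nowMs: number = Date.now(),
): number | undefined {
  if (value === undefined) return undefined;
  const trimmed = value.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return undefined;
  return Math.max(0, dateMs - nowMs);
}

// Exponential backoff with full jitter: a random delay in
// [0, min(maxDelayMs, baseDelayMs * 2^attempt)). A usable Retry-After hint
// takes precedence, capped at maxDelayMs (an assumption of this sketch).
function computeDelayMs(
  attempt: number, // 0 for the first retry
  policy: BackoffPolicy,
  retryAfter?: string,
  random: () => number = Math.random, // injectable for deterministic tests
): number {
  const hinted = parseRetryAfterMs(retryAfter);
  if (hinted !== undefined) return Math.min(hinted, policy.maxDelayMs);
  const ceiling = Math.min(policy.maxDelayMs, policy.baseDelayMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}
```

Full jitter is used here because it spreads retries from many agents hitting the same throttled endpoint; other jitter strategies would fit the same signature.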
Steps to reproduce
- Build an agent that repeatedly queries balances and submits a transaction.
- Run it against an RPC endpoint with throttling (or under heavy load).
- Observe that intermittent timeouts or 429s cause the agent to fail and abandon the task.
- Add custom retries in userland and notice duplication across actions.
Alternatives
Proposed Solution
- Introduce a shared resilience layer used by all AgentKit network-bound actions (RPC, CDP, and future providers).
- Implement configurable retry policies:
  - retry count
  - backoff strategy (exponential with jitter)
  - maximum delay
  - optional respect for `Retry-After` headers
- Classify errors into retryable vs non-retryable (e.g., timeouts/5xx/429 vs validation/config errors).
- Expose typed errors such as `RateLimitedError`, `TransientNetworkError`, and `NonRetryableError`.
- Allow configuration at two levels:
  - global defaults during `AgentKit` initialization
  - per-provider overrides where needed
- Add optional instrumentation hooks (e.g., `onRetry`) for logging and telemetry.
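To make the proposal concrete, the pieces above might fit together as in the sketch below. The error class names and `onRetry` come from this issue; everything else (`RetryPolicy`, `DEFAULT_POLICY`, `resolvePolicy`, `classifyStatus`, `withRetry`, and the default values) is hypothetical and only illustrates one possible shape, not an existing AgentKit interface.

```typescript
// Typed errors named in the proposal; constructor shapes are assumptions.
class RateLimitedError extends Error {
  retryAfterMs?: number;
  constructor(message: string, retryAfterMs?: number) {
    super(message);
    Object.setPrototypeOf(this, new.target.prototype); // keep instanceof working on old targets
    this.name = "RateLimitedError";
    this.retryAfterMs = retryAfterMs;
  }
}
class TransientNetworkError extends Error {
  constructor(message: string) {
    super(message);
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = "TransientNetworkError";
  }
}
class NonRetryableError extends Error {
  constructor(message: string) {
    super(message);
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = "NonRetryableError";
  }
}

// Hypothetical policy shape and defaults (values are placeholders).
interface RetryPolicy {
  retries: number; // 0 disables retries entirely
  baseDelayMs: number;
  maxDelayMs: number;
  respectRetryAfter: boolean;
  onRetry?: (info: { attempt: number; delayMs: number; error: Error }) => void;
}
const DEFAULT_POLICY: RetryPolicy = {
  retries: 3, baseDelayMs: 200, maxDelayMs: 5000, respectRetryAfter: true,
};

// Two-level configuration: provider overrides win over global defaults.
function resolvePolicy(
  globalCfg: Partial<RetryPolicy> = {},
  providerCfg: Partial<RetryPolicy> = {},
): RetryPolicy {
  return { ...DEFAULT_POLICY, ...globalCfg, ...providerCfg };
}

// Map an HTTP status to a typed error: 429 and 5xx are retryable, the rest are not.
function classifyStatus(status: number, retryAfterMs?: number): Error {
  if (status === 429) return new RateLimitedError("rate limited", retryAfterMs);
  if (status >= 500 && status <= 599) return new TransientNetworkError(`server error ${status}`);
  return new NonRetryableError(`request failed with status ${status}`);
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Run fn, retrying only retryable errors, up to policy.retries extra attempts.
async function withRetry<T>(fn: () => Promise<T>, policy: RetryPolicy): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const error = err instanceof Error ? err : new Error(String(err));
      const retryable =
        error instanceof RateLimitedError || error instanceof TransientNetworkError;
      if (!retryable || attempt >= policy.retries) throw error; // fail fast
      // Exponential backoff with full jitter, capped at maxDelayMs.
      const ceiling = Math.min(policy.maxDelayMs, policy.baseDelayMs * 2 ** attempt);
      let delayMs = Math.floor(Math.random() * ceiling);
      if (policy.respectRetryAfter && error instanceof RateLimitedError &&
          error.retryAfterMs !== undefined) {
        delayMs = Math.min(error.retryAfterMs, policy.maxDelayMs);
      }
      policy.onRetry?.({ attempt: attempt + 1, delayMs, error });
      await sleep(delayMs);
    }
  }
}
```

A provider could then wrap a network call as `withRetry(() => fetchBalance(), resolvePolicy(globalCfg, { retries: 0 }))`, where `fetchBalance` and `globalCfg` are placeholders. Note that blindly retrying transaction submissions can be unsafe unless the call is idempotent, so write actions may need stricter per-provider policies.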
Additional Information
- This change reduces duplicated retry logic across agent implementations.
- Improves reliability for long-running or autonomous agents in production.
- Aligns AgentKit behavior with best practices for resilient API and RPC clients.
- Provides a foundation for future observability and metrics integrations.