Fix stale connection errors (NoHttpResponseException) #45

gbuisson · 2025-12-10T00:17:13Z

Fix stale connections and add connection pool configuration

Problem: Production NoHttpResponseException errors caused by stale pooled
connections that the server had closed but the client was still trying to reuse.

Solution: Add validate-after-inactivity option (default: 5000ms) that checks idle
connections before reuse, preventing stale connection errors.
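The idea behind validate-after-inactivity can be sketched in plain Clojure. This is an illustration only, not ductile's or the underlying HTTP client's actual code: `needs-validation?`, `checkout`, the `:last-used-ms` key, and the `alive?` predicate are all hypothetical names introduced here.

```clojure
;; Illustrative sketch only; every name below is hypothetical.

(defn needs-validation?
  "True when a pooled connection has been idle for at least
  `validate-after-ms` and should be checked before reuse."
  [last-used-ms now-ms validate-after-ms]
  (>= (- now-ms last-used-ms) validate-after-ms))

(defn checkout
  "Returns `conn` if it can be reused, or nil when it has been idle long
  enough to require a check and `alive?` reports it as dead. A nil result
  means the caller should open a fresh connection."
  [{:keys [last-used-ms] :as conn} now-ms validate-after-ms alive?]
  (if (and (needs-validation? last-used-ms now-ms validate-after-ms)
           (not (alive? conn)))
    nil
    conn))
```

With the 5000ms default, a connection reused within five seconds of its last use skips the check, while one idle for longer is validated first, so a connection the server has already closed is discarded instead of being reused.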

Additional improvements:

- Connection pool tuning: :threads, :default-per-route, :insecure? options
- Request timeouts: :connection-timeout (default: 10s), :socket-timeout (no default, for long-running operations)
- BREAKING: Renamed :timeout → :connection-ttl for clarity (in seconds, default: 60s)

Migration:
;; Before

  (conn/connect {:host "localhost" :port 9200})

;; After (same behavior, just explicit)

  (conn/connect {:host "localhost"
                 :port 9200
                 :connection-ttl 60              ; seconds
                 :validate-after-inactivity 5000 ; ms
                 :connection-timeout 10000})     ; ms
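Callers that still pass the old :timeout key could adapt their options map with a small helper along these lines. This is a sketch under assumptions: `migrate-conn-opts` is a hypothetical name, not part of ductile, and it assumes the value keeps the same unit after the rename.

```clojure
(require '[clojure.set :as set])

;; Hypothetical helper, not part of ductile.
;; Assumes the old :timeout value uses the same unit as :connection-ttl.
(defn migrate-conn-opts
  "Renames the legacy :timeout key to :connection-ttl.
  Maps without :timeout are returned unchanged."
  [opts]
  (set/rename-keys opts {:timeout :connection-ttl}))

;; Example:
;; (migrate-conn-opts {:host "localhost" :port 9200 :timeout 60})
```

If a map carries both keys, `rename-keys` lets the renamed :timeout value replace the existing :connection-ttl, so it may be worth cleaning up such maps by hand.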

BREAKING CHANGE: Renamed :timeout to :connection-ttl for clarity.

Connection pool options:
- :connection-ttl (default: 30000ms) - how long connections live in the pool
- :validate-after-inactivity (default: 5000ms) - checks idle connections before reuse, preventing NoHttpResponseException from stale connections
- :threads (default: 100) - max total connections in pool
- :default-per-route (default: 100) - max connections per route
- :insecure? (default: false) - allow self-signed SSL certificates

Request timeout options (applied to every request):
- :connection-timeout (default: 10000ms) - time to establish TCP connection
- :socket-timeout (default: none) - time to wait for response data

Also removes deprecated PoolingClientConnectionManager from schema.

Addresses socket timeout errors occurring at exactly 10 seconds for long-running ElasticSearch queries (e.g., queries with 1000+ sub-requests that take several minutes).

Root cause: After ductile PR #45 was merged, the new connection management defaults include a 10-second connection-timeout that is being reused as socket-timeout when not explicitly set. This causes intermittent failures for requests that take longer than 10 seconds.

Solution: Explicitly set timeout parameters when creating ES connections:
- socket-timeout: 600000ms (10 minutes) - allows long-running queries
- connection-timeout: 10000ms (10 seconds) - reasonable for establishing connection
- validate-after-inactivity: 5000ms (5 seconds) - prevents NoHttpResponseException

This is a temporary workaround until ctia's properties schema is updated to support these new ductile parameters (socket-timeout, connection-timeout, validate-after-inactivity) as configurable properties.

Related:
- ductile PR #45: threatgrid/ductile#45
- Symptom: Requests failing at exactly 10s with socket timeout errors
- Evidence: Some requests succeed at 16s, 24s, 28s while others fail at 10s
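The workaround described above amounts to passing the three timeout values explicitly when building the connection options. A minimal sketch, assuming the option keys from ductile PR #45; `explicit-timeouts` and `with-explicit-timeouts` are hypothetical names, not taken from ctia or ductile:

```clojure
;; Values taken from the workaround description; all in milliseconds.
(def explicit-timeouts
  {:socket-timeout            600000  ; 10 minutes, for long-running queries
   :connection-timeout        10000   ; 10 seconds to establish the connection
   :validate-after-inactivity 5000})  ; check connections idle for 5 seconds

;; Hypothetical helper: fills in the explicit timeouts while letting any
;; value the caller already supplied take precedence.
(defn with-explicit-timeouts
  [conn-opts]
  (merge explicit-timeouts conn-opts))

;; Intended use, with `conn` aliased as in the PR description:
;; (conn/connect (with-explicit-timeouts {:host "localhost" :port 9200}))
```

Because :socket-timeout is always present in the resulting map, the request no longer depends on whatever fallback applies when the option is left unset.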

gbuisson requested review from DeLaGuardo, ereteog, frenchy64, marioaquino, msprunck and yogsototh as code owners December 10, 2025 00:17

gbuisson self-assigned this Dec 10, 2025

gbuisson added the review label Dec 10, 2025

gbuisson force-pushed the fix-stale-connections branch from 8f23af1 to 15a34ed Compare December 10, 2025 00:48

DeLaGuardo approved these changes Dec 10, 2025

ereteog approved these changes Dec 10, 2025

gbuisson merged commit f65762b into master Dec 10, 2025
2 checks passed

sayerada mentioned this pull request Dec 15, 2025

Fix: Add explicit timeouts for ElasticSearch connections threatgrid/ctia#1505

Open

3 participants
