
High Latency for SCD DSS Requests #1311

@wing-utm-sharing-airspace

Description

Describe the bug
We've noticed timeouts on various DSS SCD API calls (exceeding 10s) in several of our DSS pools. Looking at the CRDB console for one instance of this issue, I'm seeing a lot of transaction restarts and very high tail latency for SCD Subscription queries in particular:

[Images: three screenshots from the CRDB console]

This looks somewhat similar to #1241.
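For context on why restarts can inflate tail latency: under serializable isolation, CockroachDB aborts contended transactions with a retryable error (SQLSTATE 40001), and every retry adds a full extra attempt plus any backoff to the request's total time. Below is a minimal, self-contained sketch of that retry pattern. It is not the DSS's actual implementation; `SerializationFailure`, `run_with_retries`, and `make_contended_txn` are hypothetical names used only for illustration, and the database is simulated rather than queried.

```python
import random
import time


class SerializationFailure(Exception):
    """Hypothetical stand-in for a database error carrying SQLSTATE 40001."""
    sqlstate = "40001"


def run_with_retries(txn, max_attempts=5, base_delay_s=0.01, sleep=time.sleep):
    """Run txn(); on a retryable error, back off and try again.

    Returns (result, attempts). Re-raises once max_attempts is exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return txn(), attempt
        except SerializationFailure:
            if attempt == max_attempts:
                raise
            # Exponential backoff with full jitter so that contending
            # clients do not all retry at the same instant.
            sleep(random.uniform(0, base_delay_s * 2 ** (attempt - 1)))


def make_contended_txn(failures_before_success):
    """Simulate a transaction that is restarted N times before committing."""
    state = {"calls": 0}

    def txn():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise SerializationFailure("restart transaction")
        return "ok"

    return txn
```

With two simulated restarts, `run_with_retries(make_contended_txn(2))` commits on the third attempt. Under spiky load, a request that hits several restarts in a row can plausibly accumulate enough attempts to exceed a 10s client timeout.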

To Reproduce
We're unsure of the best way to reproduce this, but here is some context about the deployment and traffic levels where we've seen the issue:

The screenshot below shows QPS overlaid with the latency spike:
Image

This is a pooled deployment where 3 providers each have 3 CRDB nodes.
We're running DSS v0.20.0 and CRDB v24.1.3.

Expected behavior
This instance is receiving a fairly spiky, high load compared with what we need to support our current operations, but we would appreciate an investigation into the restarts and high latency to determine system limitations and available performance improvements.

Additional context
Sample of logs from around the time of the latency spike:
downloaded-logs-20251205-124443.json
