@simonrosenberg simonrosenberg commented Jan 8, 2026

Summary

This PR fixes the secondary issue from #1633: websocket disconnect handling that could cause server instability.

Problem

Previously, the websocket handlers re-raised ConnectionError together with RuntimeError. A client disconnect could therefore propagate out of the handler as an unhandled exception and potentially cause server instability.

Solution

Modified websocket handlers to:

  • Handle ConnectionError gracefully (connection-related errors are safe to suppress)
  • Keep re-raising RuntimeError to surface actual bugs (as suggested in code review)
  • Use an explicit except ConnectionError clause instead of an isinstance check (cleaner)
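
The handling described above can be sketched roughly as follows. This is a minimal illustration, not the code in sockets.py: `handle_socket`, the `receive` callable, and the returned status strings are hypothetical names introduced only for the example. It relies on the fact that built-in exceptions such as ConnectionResetError and BrokenPipeError are subclasses of ConnectionError, so a single explicit clause covers them.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def handle_socket(receive):
    """Hypothetical handler loop; `receive` stands in for the websocket read call."""
    try:
        while True:
            message = await receive()
            if message is None:  # sentinel used in this sketch to end the loop
                return "closed"
    except ConnectionError as exc:
        # Connection-related failure (also matches subclasses such as
        # ConnectionResetError and BrokenPipeError): log it and return.
        logger.debug("client disconnected: %r", exc)
        return "disconnected"
    # RuntimeError is deliberately not caught here, so it reaches the caller.


async def dropped():
    raise ConnectionResetError("peer reset")


async def buggy():
    raise RuntimeError("unexpected state")


def run_buggy():
    """Report whether a RuntimeError escapes the handler."""
    try:
        asyncio.run(handle_socket(buggy))
    except RuntimeError:
        return "re-raised"
    return "suppressed"


result_dropped = asyncio.run(handle_socket(dropped))
result_buggy = run_buggy()
```

With an explicit clause, the set of suppressed exceptions is visible in the `except` line itself rather than buried in an isinstance branch inside a broader handler.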

Changes

  • sockets.py: Modified events_socket and bash_events_socket to handle ConnectionError gracefully while still re-raising RuntimeError
  • test_event_router_websocket.py: Updated test to verify RuntimeError is still re-raised
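
A test along these lines could check both behaviours. This is only a sketch under assumptions: `socket_handler` is a stand-in written for the example (it is not the real events_socket), and the `accept` and `receive_json` method names on the mocked websocket are assumed for illustration. The actual test in test_event_router_websocket.py may be structured differently.

```python
import asyncio
from unittest.mock import AsyncMock


async def socket_handler(websocket):
    """Stand-in handler for this sketch; not the real events_socket."""
    await websocket.accept()
    try:
        while True:
            await websocket.receive_json()
    except ConnectionError:
        pass  # disconnects are suppressed


def test_runtime_error_is_re_raised():
    websocket = AsyncMock()
    websocket.receive_json.side_effect = RuntimeError("bug")
    try:
        asyncio.run(socket_handler(websocket))
    except RuntimeError as exc:
        assert str(exc) == "bug"
    else:
        raise AssertionError("RuntimeError was swallowed")


def test_connection_error_is_suppressed():
    websocket = AsyncMock()
    websocket.receive_json.side_effect = ConnectionError("gone")
    asyncio.run(socket_handler(websocket))  # must not raise
    websocket.accept.assert_awaited_once()
```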

Checklist

  • If the PR is changing/adding functionality, are there tests to reflect this?
  • If there is an example, have you run the example to make sure that it works?
  • If there are instructions on how to run the code, have you followed the instructions and made sure that it works?
  • If the feature is significant enough to require documentation, is there a PR open on the OpenHands/docs repository with the same branch name?
  • Is the GitHub CI passing?

Fixes #1633 (partial - websocket disconnect fix)


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

| Variant | Architectures | Base Image | Docs / Tags |
|---|---|---|---|
| java | amd64, arm64 | eclipse-temurin:17-jdk | Link |
| python | amd64, arm64 | nikolaik/python-nodejs:python3.12-nodejs22 | Link |
| golang | amd64, arm64 | golang:1.21-bookworm | Link |

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:f2091f2-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-f2091f2-python \
  ghcr.io/openhands/agent-server:f2091f2-python

All tags pushed for this build

ghcr.io/openhands/agent-server:f2091f2-golang-amd64
ghcr.io/openhands/agent-server:f2091f2-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:f2091f2-golang-arm64
ghcr.io/openhands/agent-server:f2091f2-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:f2091f2-java-amd64
ghcr.io/openhands/agent-server:f2091f2-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:f2091f2-java-arm64
ghcr.io/openhands/agent-server:f2091f2-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:f2091f2-python-amd64
ghcr.io/openhands/agent-server:f2091f2-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:f2091f2-python-arm64
ghcr.io/openhands/agent-server:f2091f2-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:f2091f2-golang
ghcr.io/openhands/agent-server:f2091f2-java
ghcr.io/openhands/agent-server:f2091f2-python

About Multi-Architecture Support

  • Each variant tag (e.g., f2091f2-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., f2091f2-python-amd64) are also available if needed


github-actions bot commented Jan 8, 2026

Coverage

Coverage Report

| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| openhands-agent-server/openhands/agent_server | | | | |
|    sockets.py | 105 | 55 | 47% | 50–51, 57–59, 68–70, 76–78, 84, 87, 91–93, 95–96, 110–111, 113–114, 116–118, 121, 123–126, 128–129, 131–135, 138–140, 143, 147–149, 151–152, 155, 162–163, 177–181, 191 |
| TOTAL | 14788 | 7098 | 52% | |

Modified websocket handlers to:
- Handle ConnectionError gracefully (connection-related, safe to suppress)
- Keep re-raising RuntimeError to surface actual bugs
- Use explicit except clause for ConnectionError (cleaner than isinstance)

Changes:
- events_socket: Handle ConnectionError gracefully, re-raise RuntimeError
- bash_events_socket: Same handling
- Cleanup (unsubscription) still happens in the finally block
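
To illustrate why the finally block matters, here is a small sketch showing that unsubscription runs on both paths: when ConnectionError is suppressed and when RuntimeError propagates. `FakePubSub`, `serve`, and the subscriber name are hypothetical and exist only for this example; the real subscription API in the agent server is not shown here.

```python
import asyncio


class FakePubSub:
    """Hypothetical subscription registry used only for this sketch."""

    def __init__(self):
        self.subscribers = set()

    def subscribe(self, name):
        self.subscribers.add(name)
        return name

    def unsubscribe(self, name):
        self.subscribers.discard(name)


async def serve(pubsub, receive):
    subscriber_id = pubsub.subscribe("client-1")
    try:
        while True:
            await receive()
    except ConnectionError:
        pass  # suppressed: the client went away
    finally:
        # Runs whether the loop ended via a suppressed ConnectionError
        # or via a RuntimeError that is still propagating.
        pubsub.unsubscribe(subscriber_id)


def make_receive(exc):
    async def receive():
        raise exc
    return receive


def leftover_after(exc):
    """Return the subscribers still registered after the handler exits."""
    pubsub = FakePubSub()
    try:
        asyncio.run(serve(pubsub, make_receive(exc)))
    except RuntimeError:
        pass  # expected for the RuntimeError case
    return pubsub.subscribers
```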

Fixes #1633 (partial - websocket disconnect fix)

Co-authored-by: openhands <openhands@all-hands.dev>
@simonrosenberg simonrosenberg force-pushed the openhands/fix-websocket-disconnect-handling branch from 49c73bd to 050ceca Compare January 8, 2026 11:58
@enyst enyst requested a review from tofarr January 8, 2026 12:05
@all-hands-bot

[Automatic Post]: This PR seems to be currently waiting for review. @tofarr, could you please take a look when you have a chance?

Development

Successfully merging this pull request may close these issues.

Bug: Bash command polling stops after 2 attempts, causing agent loop to hang
