Bug Report: RuntimeError in EventStream.dispatch() #830

@lantert246

Story

RuntimeError: dictionary changed size during iteration is raised in EventStream.dispatch() when events are processed concurrently. The method iterates over cls.local_hooks.items() without taking a snapshot, and it awaits each handler inside that loop. While a handler is awaited, another async task on the same event loop can call subscribe() and add an entry to the dictionary, which invalidates the iterator.
The same file already uses the safe pattern in unsubscribe() (line 31), which iterates over tuple(hooks.items()), but dispatch() (line 214) does not.
Affected Code Location:
File: ayon_server/events/eventstream.py
Lines: 214, 224
Problematic Code:
```python
# Line 214
for topic, handlers in cls.local_hooks.items():  # Missing tuple()
    # ... event matching logic ...

    # Line 224
    for handler in handlers.values():  # Missing tuple()
        try:
            await handler(event)  # Async call yields control
        except Exception:
            log_traceback(f"Error in event handler '{handler.__name__}'")
```

To reproduce

Steps to reproduce the behavior:

  1. Run the AYON backend with multiple Gunicorn workers (4-8 workers).
  2. Generate high concurrent load, with multiple projects performing operations simultaneously.
  3. Trigger events that dispatch to multiple handlers while addons or background tasks are subscribing to new event topics.
  4. The error occurs intermittently when tasks interleave as follows:
     - Task A is iterating through local_hooks in dispatch().
     - Task A executes await handler(event) and yields control.
     - Task B calls EventStream.subscribe(), adding a new topic to local_hooks.
     - Task A resumes iteration and finds that the dictionary has changed size.

Note: This is a race condition that occurs sporadically under high concurrency. Direct reproduction is difficult, but the error has been captured in production Sentry logs.

Stack trace from Sentry:

Task exception was never retrieved
future: <Task finished name='Task-1131' coro=<_process_events() done, defined at /opt/mingo-server/ayon_server/operations/project_level/init.py:38> exception=RuntimeError('dictionary changed size during iteration')>
  File "ayon_server/operations/project_level/init.py", line 38, in _process_events
    for event in events: await EventStream.dispatch(...)
  File "ayon_server/events/eventstream.py", line 214, in dispatch
    for topic, handlers in cls.local_hooks.items():
RuntimeError: dictionary changed size during iteration
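The interleaving from step 4 can be forced deterministically in a small standalone script. This is only a sketch: MiniEventStream, slow_handler, late_handler, and the topic names are hypothetical stand-ins, not the real AYON code. Only the iteration pattern from the report (looping over the live dictionary and awaiting inside the loop) is reproduced.

```python
import asyncio


class MiniEventStream:
    """Hypothetical stand-in for EventStream (assumption, not AYON code)."""

    local_hooks: dict[str, dict[str, object]] = {}

    @classmethod
    def subscribe(cls, topic, handler):
        cls.local_hooks.setdefault(topic, {})[handler.__name__] = handler

    @classmethod
    async def dispatch(cls, topic_name):
        # Iterates over the live dictionary, like line 214 in the report.
        for topic, handlers in cls.local_hooks.items():
            if topic != topic_name:
                continue
            for handler in handlers.values():
                await handler(topic_name)  # yields control to other tasks


async def slow_handler(topic):
    await asyncio.sleep(0.01)  # keeps task A suspended while task B runs


async def late_handler(topic):
    pass


async def reproduce():
    MiniEventStream.local_hooks.clear()
    MiniEventStream.subscribe("entity.updated", slow_handler)

    # Task A: starts dispatching and suspends inside slow_handler.
    task_a = asyncio.create_task(MiniEventStream.dispatch("entity.updated"))
    await asyncio.sleep(0)

    # Task B's role: add a new topic while task A is mid-iteration.
    MiniEventStream.subscribe("entity.created", late_handler)

    try:
        await task_a
    except RuntimeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(asyncio.run(reproduce()))
    # prints: dictionary changed size during iteration
```

Everything here runs on a single event loop in one process, so the number of Gunicorn workers only affects how often the interleaving happens under load, not whether it can happen.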

Expected behavior

The dispatch() method should iterate over a snapshot of the dictionary to prevent RuntimeError when concurrent tasks modify local_hooks or handlers dictionaries during iteration. This is the same pattern already implemented in the unsubscribe() method.

```python
# Line 214
for topic, handlers in tuple(cls.local_hooks.items()):  # Add tuple()
    do_handle = False
    if topic == event.topic:
        do_handle = True
    elif topic.endswith(".*"):
        if event.topic.startswith(topic[:-2]):
            do_handle = True
    if not do_handle:
        continue

    # Line 224
    for handler in tuple(handlers.values()):  # Add tuple()
        try:
            await handler(event)
        except Exception:
            log_traceback(f"Error in event handler '{handler.__name__}'")
```
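A minimal sketch of how the snapshot version behaves when a handler subscribes during dispatch. SnapshotEventStream and the handler names are hypothetical, the wildcard matching is left out for brevity, and dispatch() returns the names of the handlers it called purely so the behavior can be inspected.

```python
import asyncio


class SnapshotEventStream:
    """Hypothetical stand-in for EventStream with the proposed fix applied."""

    local_hooks: dict[str, dict[str, object]] = {}

    @classmethod
    def subscribe(cls, topic, handler):
        cls.local_hooks.setdefault(topic, {})[handler.__name__] = handler

    @classmethod
    async def dispatch(cls, topic_name):
        called = []
        # Both loops iterate over snapshots, as proposed for lines 214 and 224.
        for topic, handlers in tuple(cls.local_hooks.items()):
            if topic != topic_name:
                continue
            for handler in tuple(handlers.values()):
                await handler(topic_name)
                called.append(handler.__name__)
        return called


async def late_handler(topic):
    pass


async def first_handler(topic):
    # Mutates both dictionaries while dispatch() is mid-iteration.
    SnapshotEventStream.subscribe("entity.created", late_handler)
    SnapshotEventStream.subscribe(topic, late_handler)
    await asyncio.sleep(0)


async def run_fixed():
    SnapshotEventStream.local_hooks.clear()
    SnapshotEventStream.subscribe("entity.updated", first_handler)
    first = await SnapshotEventStream.dispatch("entity.updated")
    second = await SnapshotEventStream.dispatch("entity.updated")
    return first, second


if __name__ == "__main__":
    print(asyncio.run(run_fixed()))
    # prints: (['first_handler'], ['first_handler', 'late_handler'])
```

One consequence of the snapshot is visible in the output: a handler subscribed while an event is being dispatched is not called for that event, only for later ones. That seems like a reasonable trade-off here, but it is a behavior change worth confirming with the maintainers.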

Environment

Server version: 1.12.5
Server host OS: Ubuntu 22.04 LTS (Linux 5.15.0-163-generic)
Browser: N/A (Backend issue)
Client version: N/A (Backend issue)
Python version: 3.11.14
Gunicorn workers: 2-8 workers
Worker class: uvicorn.workers.UvicornWorker

Additional context

Severity: Low to Medium
Frequency: Rare (occurs sporadically under high concurrent load)
Impact: No service disruption (unhandled task exception only)
Detection: Captured in Sentry error monitoring
Analysis:
This is a classic race condition in async Python code. The class variable local_hooks is shared across all async tasks within a worker process. When dispatch() iterates over the dictionary and yields control during await handler(event), other tasks can call subscribe() which modifies the dictionary, causing the RuntimeError.
Why this pattern is needed:
- tuple(dict.items()) creates a snapshot of the dictionary at iteration time
- Changes to the original dictionary during iteration don't affect the snapshot
- This is already the established pattern in the same file (line 31)
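The snapshot is shallow, which is why both loops need their own tuple(). The following sketch uses a plain dictionary with placeholder string values instead of real handlers (the keys and values are made up for illustration):

```python
# Placeholder data standing in for local_hooks (assumption, not AYON data).
hooks = {"entity.updated": {"h1": "handler-1"}}

# Snapshot of the outer dictionary, as in the proposed fix for line 214.
snapshot = tuple(hooks.items())

# Mutations that happen after the snapshot was taken.
hooks["entity.created"] = {"h2": "handler-2"}   # new topic
hooks["entity.updated"]["h3"] = "handler-3"     # new handler on an existing topic

# The new topic is not part of the outer snapshot...
outer_topics = [topic for topic, _ in snapshot]

# ...but the inner dictionary is the same object, so it did change.
inner_names = list(snapshot[0][1])

print(outer_topics)  # ['entity.updated']
print(inner_names)   # ['h1', 'h3']
```

Because the inner handlers dictionary is still live, iterating over handlers.values() while awaiting can fail in the same way, hence the second tuple() on line 224.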
Proposed solution:
Wrap both dictionary iterations in dispatch() (lines 214 and 224) in tuple(), consistent with the existing unsubscribe() implementation.
Related considerations:
- Negligible performance impact: building the tuple is O(n) in the number of topics (or handlers), which is typically small
- Iteration over the snapshot is unaffected by subscribe() or unsubscribe() calls made by other tasks while a handler is awaited
- Maintains consistency with existing code patterns in the same file
I'm willing to submit a Pull Request with the fix if needed.

This was created with the help of Cursor AI.
