Description
Story
`RuntimeError: dictionary changed size during iteration` is raised in the EventStream.dispatch() method when events are processed concurrently. The method iterates over cls.local_hooks.items() without taking a snapshot copy, and other async tasks can modify the dictionary through subscribe() calls while dispatch() is suspended at an awaited event handler.
The issue is particularly notable because the same file already implements the correct pattern in the unsubscribe() method (line 31) using tuple(hooks.items()), but this pattern was not applied to the dispatch() method (line 214).
Affected Code Location:
File: ayon_server/events/eventstream.py
Lines: 214, 224
Problematic Code:
```python
# Line 214
for topic, handlers in cls.local_hooks.items():  # Missing tuple()
    # ... event matching logic ...

    # Line 224
    for handler in handlers.values():  # Missing tuple()
        try:
            await handler(event)  # Async call yields control
        except Exception:
            log_traceback(f"Error in event handler '{handler.__name__}'")
```
To reproduce
Steps to reproduce the behavior:
- Run AYON backend with multiple Gunicorn workers (4-8 workers)
- Generate high concurrent load with multiple projects performing operations simultaneously
- Trigger events that dispatch to multiple handlers while addons or background tasks are subscribing to new event topics
- The error occurs intermittently under race conditions when:
  - Task A is iterating through local_hooks in dispatch()
  - Task A executes await handler(event) and yields control
  - Task B calls EventStream.subscribe(), adding a new topic to local_hooks
  - Task A resumes iteration and encounters the modified dictionary
Note: This is a race condition that occurs sporadically under high concurrency. Direct reproduction is difficult but the error has been captured in production Sentry logs.
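The interleaving described above can be forced deterministically outside the server. The following is a minimal sketch, not the AYON code: `MiniEventStream`, `slow_handler`, `late_subscriber`, the topic names, and the use of `handler.__name__` as the handler key are all hypothetical stand-ins, and only the unsafe iteration pattern mirrors dispatch().

```python
import asyncio


class MiniEventStream:
    """Hypothetical stand-in for EventStream; not the AYON implementation."""

    local_hooks: dict = {}

    @classmethod
    def subscribe(cls, topic, handler):
        # Assumed storage layout: topic -> {handler key -> handler}.
        cls.local_hooks.setdefault(topic, {})[handler.__name__] = handler

    @classmethod
    async def dispatch(cls, topic):
        # Unsafe: iterates the live dict while awaiting inside the loop.
        for hook_topic, handlers in cls.local_hooks.items():
            if hook_topic != topic:
                continue
            for handler in handlers.values():
                await handler(topic)


async def slow_handler(topic):
    # Yield control so another task can run in the middle of the iteration.
    await asyncio.sleep(0)


async def late_subscriber():
    # Task B: adds a new topic while Task A is suspended inside dispatch().
    MiniEventStream.subscribe("entity.created", slow_handler)


async def reproduce():
    MiniEventStream.local_hooks.clear()
    MiniEventStream.subscribe("entity.updated", slow_handler)
    results = await asyncio.gather(
        MiniEventStream.dispatch("entity.updated"),  # Task A
        late_subscriber(),                           # Task B
        return_exceptions=True,
    )
    return results[0]


error = asyncio.run(reproduce())
print(type(error).__name__, error)
```

In this sketch Task A fails as soon as it advances the outer iterator after Task B has added a topic, which matches the frame at line 214 in the Sentry trace.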
Stack Trace from Sentry:
Task exception was never retrieved
future: <Task finished name='Task-1131' coro=<_process_events() done, defined at /opt/mingo-server/ayon_server/operations/project_level/init.py:38> exception=RuntimeError('dictionary changed size during iteration')>
File "ayon_server/operations/project_level/init.py", line 38, in _process_events
    for event in events: await EventStream.dispatch(...)
File "ayon_server/events/eventstream.py", line 214, in dispatch
    for topic, handlers in cls.local_hooks.items():
RuntimeError: dictionary changed size during iteration
Expected behavior
The dispatch() method should iterate over a snapshot of the dictionary to prevent RuntimeError when concurrent tasks modify local_hooks or handlers dictionaries during iteration. This is the same pattern already implemented in the unsubscribe() method.
```python
# Line 214
for topic, handlers in tuple(cls.local_hooks.items()):  # Add tuple()
    do_handle = False
    if topic == event.topic:
        do_handle = True
    elif topic.endswith(".*"):
        if event.topic.startswith(topic[:-2]):
            do_handle = True
    if not do_handle:
        continue

    # Line 224
    for handler in tuple(handlers.values()):  # Add tuple()
        try:
            await handler(event)
        except Exception:
            log_traceback(f"Error in event handler '{handler.__name__}'")
```
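To check the expected behavior in isolation, here is a minimal sketch of the snapshot pattern applied to a hypothetical stand-in class. `SnapshotEventStream`, the handler names, and the topic names are assumptions for illustration and are not taken from eventstream.py. It also shows one consequence of snapshotting: a handler subscribed while a dispatch is in flight is only called from the next dispatch onwards.

```python
import asyncio


class SnapshotEventStream:
    """Hypothetical stand-in for EventStream with the proposed fix applied."""

    local_hooks: dict = {}

    @classmethod
    def subscribe(cls, topic, handler):
        # Assumed storage layout: topic -> {handler key -> handler}.
        cls.local_hooks.setdefault(topic, {})[handler.__name__] = handler

    @classmethod
    async def dispatch(cls, topic):
        called = []
        # Snapshot both levels so concurrent subscribe() calls cannot
        # invalidate the iterators while a handler is awaited.
        for hook_topic, handlers in tuple(cls.local_hooks.items()):
            if hook_topic != topic:
                continue
            for handler in tuple(handlers.values()):
                await handler(topic)
                called.append(handler.__name__)
        return called


async def first_handler(topic):
    await asyncio.sleep(0)  # yield so the subscriber task can run


async def late_handler(topic):
    await asyncio.sleep(0)


async def subscribe_during_dispatch():
    # Adds a new topic and a new handler on the topic being dispatched.
    SnapshotEventStream.subscribe("entity.created", late_handler)
    SnapshotEventStream.subscribe("entity.updated", late_handler)


async def main():
    SnapshotEventStream.local_hooks.clear()
    SnapshotEventStream.subscribe("entity.updated", first_handler)
    first_call, _ = await asyncio.gather(
        SnapshotEventStream.dispatch("entity.updated"),
        subscribe_during_dispatch(),
    )
    second_call = await SnapshotEventStream.dispatch("entity.updated")
    return first_call, second_call


first_call, second_call = asyncio.run(main())
print(first_call)   # handlers called by the in-flight dispatch
print(second_call)  # handlers called by the following dispatch
```

The first dispatch completes without error and calls only `first_handler`; the second one also reaches `late_handler`.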
Environment
Server version: 1.12.5
Server host OS: Ubuntu 22.04 LTS (Linux 5.15.0-163-generic)
Browser: N/A (Backend issue)
Client version: N/A (Backend issue)
Python version: 3.11.14
Gunicorn workers: 2-8 workers
Worker class: uvicorn.workers.UvicornWorker
Additional context
Severity: Low to Medium
Frequency: Rare (occurs sporadically under high concurrent load)
Impact: No service disruption (unhandled task exception only)
Detection: Captured in Sentry error monitoring
Analysis:
This is a classic race condition in async Python code. The class variable local_hooks is shared across all async tasks within a worker process. When dispatch() iterates over the dictionary and yields control during await handler(event), other tasks can call subscribe() which modifies the dictionary, causing the RuntimeError.
Why this pattern is needed:
- tuple(dict.items()) creates a snapshot of the dictionary at iteration time
- Changes to the original dictionary during iteration don't affect the snapshot
- This is already the established pattern in the same file (line 31)
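The difference between iterating the live dictionary and iterating a snapshot can be seen with a plain dict, without any async code. This is a small illustrative sketch; the `hooks` contents are made up. Note that the snapshot is shallow: the values are still the original objects, which is why the inner handlers dictionary needs its own tuple() as well.

```python
hooks = {"entity.updated": 1, "entity.deleted": 2}

# Live iteration: inserting a key mid-loop makes the next step raise.
try:
    for topic in hooks:
        hooks["entity.created"] = 3
    live_error = None
except RuntimeError as exc:
    live_error = str(exc)

# Snapshot iteration: the tuple is fixed when the loop starts, so
# insertions into the original dict neither raise nor get visited.
hooks = {"entity.updated": 1, "entity.deleted": 2}
seen = []
for topic, value in tuple(hooks.items()):
    hooks[f"{topic}.copy"] = value
    seen.append(topic)

print(live_error)
print(seen, len(hooks))
```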
Proposed solution:
Add tuple() wrapper to both dictionary iterations in the dispatch() method (lines 214 and 224), making it consistent with the existing unsubscribe() implementation.
Related considerations:
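- Negligible performance impact (tuple creation is O(n), where n is the number of topics, which is typically small)
- Safe against interleaved async tasks for the scope of the iteration: the snapshot is built synchronously, with no await in between, so another task cannot modify the dictionary while it is being copied
- Maintains consistency with existing code patterns in the same file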
I'm willing to submit a Pull Request with the fix if needed.
This was created with the help of Cursor AI.