@fishcakez
Contributor

I observed a situation where a worker crashes and the next client process is given the replacement worker. Unfortunately the new worker might still be busy starting asynchronously, for example while it sets up a TCP connection. If another worker is available, it would be better for the client to get that one, as it is more likely to be ready. The fifo strategy fits nicely with the existing options and helps here, because any other idle workers will be checked out before the new worker.
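To illustrate the ordering, here is a minimal model of the idle-worker queue. It is a sketch, not the patch itself: the module name, function names, and the atoms standing in for worker pids are all hypothetical, and it assumes that a returned or replacement worker is inserted according to the configured strategy and that checkout always takes from the front.

```erlang
-module(strategy_sketch).
-export([add_idle/3, checkout/1, next_after_crash/1]).

%% Hypothetical model: lifo puts a worker at the front of the idle queue,
%% fifo puts it at the back.
add_idle(lifo, Worker, Idle) -> queue:in_r(Worker, Idle);
add_idle(fifo, Worker, Idle) -> queue:in(Worker, Idle).

%% Checkout always takes the worker at the front of the queue.
checkout(Idle) -> queue:out(Idle).

%% Two ready workers are idle when a replacement for a crashed worker
%% is added. Returns the worker the next client would receive.
next_after_crash(Strategy) ->
    Idle0 = queue:from_list([ready_a, ready_b]),
    Idle1 = add_idle(Strategy, starting_replacement, Idle0),
    {{value, Next}, _Rest} = checkout(Idle1),
    Next.
```

In this model `strategy_sketch:next_after_crash(lifo)` hands out the replacement that is still starting, while `strategy_sketch:next_after_crash(fifo)` hands out `ready_a`.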

This change also helps with a problem that can occur with lifo queues: a client times out on a call and the worker gets checked back in. If the next client gets the "slow" worker, it is also likely to time out. This can lead to a cluster of timeouts until the worker catches up or crashes. This patch makes it possible to isolate the timeout as much as possible by combining a fifo queue with gen_server:stop/3 or exit/2 before checking in the worker, because the replacement worker is put at the back of the worker queue.
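A rough sketch of the client side of that combination follows. It assumes the pool is a poolboy pool configured with the fifo strategy and that workers are gen_servers. The module name, the `call/3` wrapper, and the `{error, timeout}` return value are hypothetical, chosen only for illustration.

```erlang
-module(isolate_timeout_sketch).
-export([call/3]).

%% Hypothetical wrapper. Pool is assumed to be a poolboy pool using the
%% fifo strategy; Request and Timeout are whatever the worker expects.
call(Pool, Request, Timeout) ->
    Worker = poolboy:checkout(Pool),
    try
        gen_server:call(Worker, Request, Timeout)
    catch
        exit:{timeout, _} ->
            %% Kill the slow worker before it is checked back in, so the
            %% next client is not handed a worker that is still behind.
            exit(Worker, kill),
            {error, timeout}
    after
        %% Always return the checkout to the pool.
        ok = poolboy:checkin(Pool, Worker)
    end.
```

Using `gen_server:stop/3` in place of `exit(Worker, kill)` would give the worker a chance to run its `terminate/2` callback, at the cost of waiting for it to shut down.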

@fishcakez
Contributor Author

/cc @pdilyard