In our MPI implementation, node A has to call comm.send and node B has to call comm.recv for a message to travel from node A to node B. In contrast, our gRPC implementation only requires the receiving node to make a call. No explicit send is needed because each node runs a gRPC server on a separate thread, which answers incoming requests. The gRPC approach scales much better because the sender and receiver do not have to synchronize for every message.
Therefore, we need to write an efficient MPI-based server which, like the gRPC one, runs on a separate thread and serves models and tensors whenever a receiving node requests them.
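One possible shape for such a server is sketched below with mpi4py. This is a minimal illustration, not a proposed final design. The names MPIServer, fetch, demo, REQUEST_TAG and REPLY_TAG are hypothetical, and the store is assumed to be a plain dict of picklable objects. The sketch also assumes MPI was initialized with MPI_THREAD_MULTIPLE, since the server thread and the main thread both make MPI calls.

```python
import threading
import time

REQUEST_TAG = 1  # hypothetical tag: "please send me the item stored under <key>"
REPLY_TAG = 2    # hypothetical tag: reply carrying the requested item


class MPIServer(threading.Thread):
    """Background thread that answers requests for items held by this rank."""

    def __init__(self, comm, store, poll_interval=0.001):
        super().__init__(daemon=True)
        self.comm = comm                  # e.g. MPI.COMM_WORLD
        self.store = store                # assumed: dict mapping key -> model or tensor
        self.poll_interval = poll_interval
        self._stop_event = threading.Event()

    def run(self):
        while not self._stop_event.is_set():
            # Non-blocking check for a pending request from any rank.
            if self.comm.iprobe(tag=REQUEST_TAG):
                requester, key = self.comm.recv(tag=REQUEST_TAG)
                # Unknown keys are answered with None.
                self.comm.send(self.store.get(key), dest=requester, tag=REPLY_TAG)
            else:
                time.sleep(self.poll_interval)  # avoid spinning at 100% CPU

    def stop(self):
        self._stop_event.set()


def fetch(comm, my_rank, owner, key):
    """Ask rank `owner` for `key`; the owner's main thread is never involved."""
    comm.send((my_rank, key), dest=owner, tag=REQUEST_TAG)
    return comm.recv(source=owner, tag=REPLY_TAG)


def demo():
    """Each rank serves its own item and fetches one from the next rank."""
    from mpi4py import MPI  # imported lazily so the classes above load without MPI

    # The server thread and the main thread both call into MPI.
    assert MPI.Query_thread() == MPI.THREAD_MULTIPLE
    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    server = MPIServer(comm, {"weights": [rank] * 3})
    server.start()
    peer = (rank + 1) % size
    print(rank, "received", fetch(comm, rank, peer, "weights"), "from", peer)

    comm.barrier()  # make sure every rank has been served before shutting down
    server.stop()
    server.join()
```

To try it, call demo() at the end of the script and launch it under an MPI launcher, for example mpiexec -n 2 python followed by the script name. The sketch uses mpi4py's pickle-based lowercase send and recv and a polling loop for simplicity. For large tensors, the buffer-based Send and Recv would likely be more efficient, and concurrent fetches from several threads on one rank would need distinct reply tags.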
If this turns out to be more effort than it is worth, we should also consider dropping MPI entirely.