Hi all,
I am trying to offload work from several MPI processes (ranks) to the same MIC.
The offloads are asynchronous in nature.
Is there a queue implemented by default that ensures only one MPI task at a time executes on the MIC?
Can the MIC be partitioned so that each MPI task offloads to its own subset of cores?
If neither mechanism exists, how should I synchronise the offloads from the different tasks?
What is the recommended solution?