
Performance issues with Intel MPI (barriers) between Xeon Phi coprocessors


I'm getting bad performance with MPI barriers in a microbenchmark on this system configuration:

  • multiple Xeon Phi coprocessors
  • Intel MPSS 3.5 (April 2015), Linux
  • Intel MPI 5.0 update 3
  • OFED-3.12-1
Environment settings and launch command:

export I_MPI_MIC=1                        # enable Xeon Phi coprocessor support
export I_MPI_DEBUG=5                      # verbose startup diagnostics
export I_MPI_FABRICS=shm:dapl             # shared memory within a card, DAPL between cards
export I_MPI_DAPL_PROVIDER=ofa-v2-scif0   # DAPL provider over SCIF
export I_MPI_PIN_MODE=lib                 # let the MPI library handle pinning
export I_MPI_PIN_CELL=core                # pin each rank to a physical core
# 30 ranks per coprocessor, 60 ranks total
/opt/intel/impi/5.0.3.048/intel64/bin/mpirun -hosts mic0,mic1 -ppn 30 -n 60 ./exe

(debug output omitted: the DAPL provider selection and processor pinning are reported correctly)
[0] MPI startup(): I_MPI_DAPL_PROVIDER=ofa-v2-scif0
[0] MPI startup(): I_MPI_DEBUG=5
[0] MPI startup(): I_MPI_FABRICS=shm:dapl
[0] MPI startup(): I_MPI_MIC=1
[0] MPI startup(): I_MPI_PIN_MAPPING=30:0 1,1 9,2 17,3 25,4 33,5 41,6 49,7 57,8 65,9 73,10 81,11 89,12 97,13 105,14 113,15 121,16 129,17 137,18 145,19 153,20 161,21 169,22 177,23 185,24 193,25 201,26 209,27 217,28 225,29 0

# OSU MPI Barrier Latency Test
# Avg Latency(us)
          1795.31

I'm not sure whether these results are reasonable for the mpirun configuration I used (two coprocessors, ppn=30).
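For reference, what I'm measuring is essentially the average time per MPI_Barrier call over many iterations. A minimal sketch of such a barrier-latency loop (my own illustration, not the actual osu_barrier source; the warm-up and iteration counts are arbitrary) looks like this:

#include <mpi.h>
#include <stdio.h>

/* Illustrative barrier-latency loop in the spirit of the OSU benchmark.
 * Warm-up and iteration counts are arbitrary choices for this sketch. */
int main(int argc, char **argv)
{
    int rank, size;
    const int warmup = 100, iters = 1000;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Warm-up barriers so startup costs don't skew the timing. */
    for (int i = 0; i < warmup; i++)
        MPI_Barrier(MPI_COMM_WORLD);

    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++)
        MPI_Barrier(MPI_COMM_WORLD);
    double t1 = MPI_Wtime();

    /* Average per-barrier latency in microseconds, as seen by rank 0. */
    if (rank == 0)
        printf("ranks: %d  avg barrier latency: %.2f us\n",
               size, (t1 - t0) * 1e6 / iters);

    MPI_Finalize();
    return 0;
}

It can be built with the same Intel MPI toolchain (e.g. mpiicc -mmic for a coprocessor-native binary) and launched with the same mpirun line as above.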

Additional results (ppn=60, 120 PEs):

/opt/intel/impi/5.0.3.048/intel64/bin/mpirun -hosts mic0,mic1 -ppn 60 -n 120 ./exe
# OSU MPI Barrier Latency Test
# Avg Latency(us)
          5378.48
