Intel MKL linpack benchmark gets killed on Xeon Phi

Hi all,

I've got a weird problem: I wanted to test the GFLOPS performance of the Xeon Phis that are entrusted to me: 2 x Xeon Phi 5110P and 1 x Xeon Phi 7120. I read that the LINPACK benchmark is included in Intel's MKL libraries, along with a Xeon Phi version. So I grabbed the binaries and ran them on my Xeon Phis.

On the 7120 (with MPSS 3.3.2) the benchmark runs fine:

Thu Feb 12 16:58:54 CET 2015
Intel(R) Optimized LINPACK Benchmark data

Current date/time: Thu Feb 12 16:58:54 2015

CPU frequency:    1.238 GHz
Number of CPUs: 1
Number of cores: 244
Number of threads: 244

Parameters are set to:

Number of tests: 14
Number of equations to solve (problem size) : 2048  4096  6144  8192  10240 12288 14336 16384 18432 20480 22528 24576 26624 28672
Leading dimension of array                  : 2112  6208  6208  8256  10304 12352 14400 18496 18496 20544 22592 26688 26688 28736
Number of trials to run                     : 3     3     3     3     3     3     3     3     3     3     3     3     3     3
Data alignment value (in Kbytes)            : 4     4     4     4     4     4     4     4     4     4     4     4     4     4

Maximum memory requested that can be used=6591927552, at the size=28672
Performance Summary (GFlops)

Size   LDA    Align.  Average  Maximal
2048   2112   4       62.4610  89.8029
4096   6208   4       254.9105 260.5183
6144   6208   4       399.6637 404.3374
8192   8256   4       484.3184 491.6444
10240  10304  4       577.4737 587.8460
12288  12352  4       639.3712 643.3008
14336  14400  4       696.0603 701.3388
16384  18496  4       744.9810 748.8416
18432  18496  4       788.7247 791.7044
20480  20544  4       818.3679 820.8570
22528  22592  4       846.7491 848.7561
24576  26688  4       868.7217 870.2109
26624  26688  4       884.2233 885.7552
28672  28736  4       896.8622 896.9412

Residual checks PASSED

End of test

However, on both 5110Ps (with MPSS 3.4.2) the benchmark gets killed before it completes!

mic0 $ cd linpack/
mic0 $ export LD_LIBRARY_PATH=$PWD
mic0 $ ./runme_mic
This is a SAMPLE run script for SMP LINPACK. Change it to reflect
the correct number of CPUs/threads, problem input files, etc..
Fri Feb 13 10:01:12 CET 2015
Intel(R) Optimized LINPACK Benchmark data

Current date/time: Fri Feb 13 10:01:12 2015

CPU frequency:    1.053 GHz
Number of CPUs: 1
Number of cores: 240
Number of threads: 240

Parameters are set to:

Number of tests: 14
Number of equations to solve (problem size) : 2048  4096  6144  8192  10240 12288 14336 16384 18432 20480 22528 24576 26624 28672
Leading dimension of array                  : 2112  6208  6208  8256  10304 12352 14400 18496 18496 20544 22592 26688 26688 28736
Number of trials to run                     : 3     3     3     3     3     3     3     3     3     3     3     3     3     3
Data alignment value (in Kbytes)            : 4     4     4     4     4     4     4     4     4     4     4     4     4     4

Maximum memory requested that can be used=6591927552, at the size=28672

=================== Timing linear equation system solver ===================

Size   LDA    Align. Time(s)    GFlops   Residual     Residual(norm) Check
2048   2112   4      0.596      9.6303   4.795780e-12 3.950479e-02   pass
2048   2112   4      0.073      78.7107  4.795780e-12 3.950479e-02   pass
2048   2112   4      0.074      77.8766  4.795780e-12 3.950479e-02   pass
4096   6208   4      0.214      214.2289 2.216840e-11 4.613649e-02   pass
4096   6208   4      0.203      225.7619 2.216840e-11 4.613649e-02   pass
4096   6208   4      0.204      224.5814 2.216840e-11 4.613649e-02   pass
6144   6208   4      0.457      338.6425 3.562570e-11 3.301736e-02   pass
6144   6208   4      0.445      347.2770 3.562570e-11 3.301736e-02   pass
6144   6208   4      0.446      346.9953 3.562570e-11 3.301736e-02   pass
8192   8256   4      0.900      407.1775 7.232445e-11 3.782865e-02   pass
8192   8256   4      0.869      421.7898 7.232445e-11 3.782865e-02   pass
8192   8256   4      0.867      422.8278 7.232445e-11 3.782865e-02   pass
10240  10304  4      1.449      494.0793 1.010026e-10 3.389721e-02   pass
10240  10304  4      1.373      521.5753 1.010026e-10 3.389721e-02   pass
10240  10304  4      1.371      522.2989 1.010026e-10 3.389721e-02   pass
12288  12352  4      2.241      552.0942 1.454923e-10 3.393283e-02   pass
12288  12352  4      2.184      566.5285 1.454923e-10 3.393283e-02   pass
12288  12352  4      2.185      566.1465 1.454923e-10 3.393283e-02   pass
14336  14400  4      3.313      592.9472 2.006193e-10 3.448820e-02   pass
14336  14400  4      3.228      608.5453 2.006193e-10 3.448820e-02   pass
14336  14400  4      3.224      609.3674 2.006193e-10 3.448820e-02   pass
16384  18496  4      4.621      634.5835 2.524725e-10 3.324476e-02   pass
16384  18496  4      4.462      657.1922 2.524725e-10 3.324476e-02   pass
16384  18496  4      4.461      657.3274 2.524725e-10 3.324476e-02   pass
./runme_mic: line 45:  5271 Killed                  ./xlinpack_$arch lininput_$arch
Done: Fri Feb 13 10:05:15 CET 2015
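
As a sanity check on the memory side, the matrix footprint of each problem size can be estimated from the parameter table in the log. This is only a rough sketch: it assumes double-precision elements (8 bytes each) and that the matrix occupies LDA x N elements. The function name matrix_bytes is made up for illustration, and the benchmark's real allocation includes some additional workspace that is not counted here.

```shell
#!/bin/sh
# Hypothetical helper: estimate the matrix footprint in bytes for one
# LINPACK problem, assuming 8-byte (double precision) elements and an
# LDA x N array. Extra workspace is not included.
# Assumes the shell does 64-bit integer arithmetic.
matrix_bytes() {
    n=$1
    lda=$2
    echo $(( n * lda * 8 ))
}

# Sizes and leading dimensions taken from the parameter table in the log:
# the last size that completed, the next one in the list, and the largest.
for pair in 16384:18496 18432:18496 28672:28736; do
    n=${pair%%:*}
    lda=${pair##*:}
    bytes=$(matrix_bytes "$n" "$lda")
    echo "N=$n LDA=$lda -> $bytes bytes ($(( bytes / 1048576 )) MiB)"
done
```

For the largest size this gives 6591348736 bytes, which is close to the 6591927552 bytes the benchmark reports as the maximum memory requested. Under the same assumptions, the size that follows the last completed one (18432) needs about 2601 MiB for the matrix alone.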

How can I debug this? A gdb run shows nothing; it just reports that all threads get killed. The "runme_mic" script is from the MKL kit itself:

#!/bin/sh
[....]
echo "This is a SAMPLE run script for SMP LINPACK. Change it to reflect"
echo "the correct number of CPUs/threads, problem input files, etc.."

#    Setting up affinity for better threading performance
export KMP_AFFINITY=explicit,granularity=fine,proclist=[1-$(($(cat /proc/cpuinfo|grep proc|wc -l)-1)),0]

arch=mic
{
  date
  ./xlinpack_$arch lininput_$arch
  echo -n "Done: "
  date
} | tee lin_$arch.txt

What's going wrong? How can I debug this? I've tried binaries from both the Intel v14 and Intel v15 compilers.
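
One thing that may be worth checking first: a bare "Killed" from the shell means the process received SIGKILL, and a common (though not the only) source of that is the Linux kernel's out-of-memory killer. The sketch below assumes that dmesg and free are available on the card and that the kernel log uses the usual "Out of memory" / "Killed process" wording. The function names oom_lines and check_card are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: keep only kernel log lines that look like OOM-killer
# reports. The patterns are the wording commonly used by Linux kernels;
# that the card's kernel logs this way is an assumption.
oom_lines() {
    grep -i -E 'out of memory|killed process|oom-killer'
}

# Hypothetical helper: look for OOM-killer reports in the kernel log and
# show the current memory usage on the card.
check_card() {
    dmesg | oom_lines || echo "no OOM-killer messages found in the kernel log"
    free
}

# On the coprocessor, right after the benchmark has been killed:
#   check_card
```

If check_card prints an OOM report naming xlinpack_mic, the kill came from the kernel running out of memory rather than from the benchmark itself; if it prints nothing relevant, the SIGKILL has some other origin and this check does not explain it.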
