Channel: Intel® Many Integrated Core Architecture

Why Does the Same Code on Similar Data Have Different Execution Times?

Hi everyone,

I found that when I run axpy (y[i] = x[i] * a + y[i]) on two separate sets of similar data, I get completely different execution times, as shown below. The attached file is the sample code for axpy.

My assumption is that the first offload using the inout pragma has to spend time preparing (initializing or warming up) the Xeon Phi coprocessor. If so, is there an official explanation for this odd behavior? If not, what is the reason? Is there a way to reduce or avoid this overhead? It's really important for benchmarking, because, compared with NVIDIA/Intel GPUs and CPUs, this never happens there.

[liu@fornax Test_offomp]$ ./a.out

Total time for inout1 combined   = 0.39732003 sec

Total time for inout2 combined   = 0.01132083 sec
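
One way to test the warm-up assumption is to issue a tiny, untimed offload before the timed runs. The following is only a minimal sketch, not the attached axpy.c: the names `wall_seconds`, `warm_up`, and `timed_axpy` are hypothetical, and the pragmas assume the Intel compiler's offload syntax. They are guarded by `__INTEL_OFFLOAD` and `_OPENMP`, so with any other compiler the loop simply runs on the host.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* y[i] = a * x[i] + y[i]; offloaded when built with Intel offload support. */
void axpy(float a, float *x, float *y, int n)
{
    assert(x != NULL && y != NULL && n >= 0);
#ifdef __INTEL_OFFLOAD
    #pragma offload target(mic) in(x : length(n)) inout(y : length(n))
#endif
    {
#ifdef _OPENMP
        #pragma omp parallel for
#endif
        for (int i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }
}

/* Hypothetical helper: wall-clock time in seconds (C11 timespec_get). */
static double wall_seconds(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Hypothetical helper: one-element offload on scratch data, so that any
   one-time setup cost is paid here and not inside a timed run. */
void warm_up(void)
{
    float x = 1.0f, y = 0.0f;
    axpy(1.0f, &x, &y, 1);
}

/* Hypothetical helper: time a single axpy call, transfers included. */
double timed_axpy(float a, float *x, float *y, int n)
{
    double t0 = wall_seconds();
    axpy(a, x, y, n);
    return wall_seconds() - t0;
}
```

Calling `warm_up()` once at program start and then `timed_axpy` on each data set should give two comparable timings if the assumption is right. If the first run is still much slower, the cause is probably something else.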

 

Best wishes,

Jiawen

Attachment: axpy.c (text/x-csrc, 1.26 KB)
