Hi
I have been running experiments on a Xeon Phi with a fibonacci(40) application, varying the number of cores allocated to it. Energy consumption was measured through /sys/class/micras/power, and performance (i.e. execution time) was measured as elapsed time.
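For context, energy figures of this kind come from integrating sampled power over the run. Below is a minimal sketch of that calculation, assuming the power readings have already been converted to watts and were taken at a fixed interval. The function name `energy_joules` and the fixed-interval sampling are illustrative assumptions, not necessarily the exact procedure used for the numbers below.

```c
#include <stddef.h>

/* Hypothetical helper (not part of the measured application):
 * integrate fixed-interval power samples into energy.
 *   power_w : power samples in watts (assumed already converted)
 *   n       : number of samples
 *   dt_s    : seconds between consecutive samples (assumed constant)
 * Uses a simple rectangle rule: E = sum(P_k * dt). */
double energy_joules(const double *power_w, size_t n, double dt_s)
{
    double energy = 0.0;
    for (size_t k = 0; k < n; ++k)
        energy += power_w[k] * dt_s;
    return energy;
}
```

Called with the samples collected during one run and the sampling period, this returns the energy in joules for that run.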
I get the following trade-off:
Cores  Execution_time(ms)  Energy(J)
1      142450.897          11123.35
2      312780.938          24676.25
4      172721.295          13849
8      104500.9            8527.75
16     60941.152           5176.8
24     43881.031           3854.75
32     31231.054           2828.1
40     25801.364           2450.9
48     23231.324           2267.2
56     19880.978           1997.7
61     17380.948           1784.5
As you can see, there is a sudden jump in execution time when going from 1 core to 2, which also makes the energy much higher. The other core allocations look normal, I suppose, but I cannot explain what happens when the allocation increases from 1 to 2.
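To make the jump easier to see, the table values can be turned into average power and speedup. Dividing energy by time gives roughly 78 W for 1 core and roughly 79 W for 2 cores, so the extra energy at 2 cores comes almost entirely from the longer run time rather than from a higher power draw; the 2-core run is about 2.2 times slower than the 1-core run. A small sketch of that arithmetic (the helper names are made up for illustration):

```c
/* Hypothetical helpers for post-processing the table above. */

/* Average power in watts, from energy in joules and elapsed time in ms. */
double avg_power_w(double energy_j, double time_ms)
{
    return energy_j / (time_ms / 1000.0);
}

/* Speedup of an n-core run relative to the 1-core run.
 * Values below 1.0 mean the n-core run was slower. */
double speedup(double t1_ms, double tn_ms)
{
    return t1_ms / tn_ms;
}
```

For example, `avg_power_w(24676.25, 312780.938)` gives the average power of the 2-core run, and `speedup(142450.897, 312780.938)` gives its speedup relative to 1 core.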
Here is the application if you'd like to try it out:
------------
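#include <stdio.h>
#include <omp.h>

/* Naive recursive Fibonacci; each recursive call is spawned as an OpenMP task. */
int fib(int n)
{
    int i, j;
    if (n < 2)
        return n;
    else
    {
        //omp_set_num_threads(NUM_CPUS);
        #pragma omp task shared(i) firstprivate(n)
        i = fib(n - 1);
        #pragma omp task shared(j) firstprivate(n)
        j = fib(n - 2);
        /* Wait for both child tasks before combining their results. */
        #pragma omp taskwait
        return i + j;
    }
}

int main(void)
{
    int n = 40;
    omp_set_dynamic(1);  /* let the runtime adjust the number of threads */
    #pragma omp parallel shared(n)
    {
        /* One thread starts the recursion; the others execute the spawned tasks. */
        #pragma omp single
        printf("\033[37;1mfib(%d) = %d\033[0m \n", n, fib(n));
        #pragma omp single
        printf("CPUs in Parallel = %d\n", omp_get_num_threads());
    }
    return 0;
}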
------------