Hi,
I have written code whose skeleton looks like this:
    #define CPU_THREADS 4
    #define INPUTSIZE 4

    #pragma omp parallel num_threads(CPU_THREADS)
    {
        #pragma omp for
        for(i = 0; i < INPUTSIZE; i++)
        {
            ... some code ...
            for(j = 0; j < 100; j++)
            {
                ... some code ...
                #pragma offload target(mic) in(a[0:size] alloc_if(0) free_if(0)) out()
                {
                    #pragma omp parallel num_threads(60)
                    {
                        #pragma omp for
                        for(i = 0; i < 240; i++)
                        {
                            ... some code ...
                        }
                    }
                }
            }
        }
    }
So each CPU thread gets 60 MIC threads. I want to set the affinity so that the 1st CPU thread uses the first 15 cores of the Xeon Phi (4 threads per core), the 2nd CPU thread uses cores 15-29, and similarly the 3rd and 4th CPU threads use cores 30-44 and 45-59.
What I observed is that if I set KMP_AFFINITY=compact, only 15 cores are used. I think MIC threads 0-4 of each CPU thread are all being mapped to core 0. Is there any way I can set the affinity based on the CPU thread number?
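To make the difference concrete, here is a minimal sketch in C of the two placements. It is only a model, not the runtime's actual algorithm: compact_core encodes my guess that every 60-thread team is packed starting from core 0, and wanted_core is the mapping I am after. All helper names and the MIC_CORES value of 60 are assumptions made for illustration.

    ```c
    #define CPU_THREADS      4    /* host threads, as in the skeleton */
    #define MIC_THREADS      60   /* MIC threads started by each host thread */
    #define THREADS_PER_CORE 4    /* hardware threads per Xeon Phi core */
    #define MIC_CORES        60   /* assumed number of usable cores */
    #define CORES_PER_TEAM   (MIC_THREADS / THREADS_PER_CORE)   /* 15 */

    /* Guessed behaviour with KMP_AFFINITY=compact: each team fills cores
       from core 0 upward, regardless of which CPU thread launched it. */
    static int compact_core(int cpu_tid, int mic_tid)
    {
        (void)cpu_tid;                       /* the launching thread is ignored */
        return mic_tid / THREADS_PER_CORE;
    }

    /* Desired behaviour: CPU thread k owns cores 15*k .. 15*k + 14. */
    static int wanted_core(int cpu_tid, int mic_tid)
    {
        return cpu_tid * CORES_PER_TEAM + mic_tid / THREADS_PER_CORE;
    }

    typedef int (*core_fn)(int cpu_tid, int mic_tid);

    /* Count how many distinct cores a placement touches across all teams. */
    static int count_used_cores(core_fn place)
    {
        int used[MIC_CORES] = {0};
        int count = 0;
        for (int c = 0; c < CPU_THREADS; c++) {
            for (int t = 0; t < MIC_THREADS; t++) {
                int core = place(c, t);
                if (!used[core]) {
                    used[core] = 1;
                    count++;
                }
            }
        }
        return count;
    }
    ```

Under this model count_used_cores(compact_core) is 15, which matches what I see, while count_used_cores(wanted_core) is 60. If I understand the Intel OpenMP runtime correctly, something like wanted_core could feed kmp_set_affinity_mask_proc and kmp_set_affinity inside the offloaded parallel region, but the core-to-logical-processor numbering on the card would have to be checked first.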
Please help me with this, and please ask if you need any further clarification.
Also, I noticed that my program sometimes hangs at the offload call. It hangs only when I use multiple CPU threads; in single-thread mode it works fine. What could be the reason? Could the memory allocations that happen during the offload call be the cause?
Thanks
sivaramakrishna