Why is it so difficult to write AVX code on MIC?

Hello,

I am writing AVX code to perform complex multiplication. The code is listed below:

1 typedef std::complex<float> Value;
2 void Benchmark::gridKernel(const int support,
3                            const Value C[],
4                            Value grid[], const int gSize)
5 {
6     int Nvec=8;
7     int nBlock,nrest,sSize_b;
8
9     nrest=sSize%Nvec;
10     nBlock=(sSize-nrest)/Nvec;
11     sSize_b=sSize-nrest;
12 …
13     for (int dind = bs; dind <= be; ++dind) {
14 …
15                 gind=…
16                 cind=…
17             Value gridc[sSize_b],Cc[sSize_b];
18             for (int suppu = 0; suppu < sSize_b; suppu++) {
19                gridc[suppu] = grid[gind+suppu];
20                Cc[suppu]    = C[cind+suppu];
21             }
22             const Value d = samples[dind].data;
23             for (int suppu = 0; suppu < nBlock; suppu++) {
24               int sl=suppu*Nvec;
25               __m512 sam = _mm512_load_ps(( Real *) &Cc[sl]);
26               __m512 *gridptr = (__m512 *) &gridc[sl];
27               __m512 data_r = _mm512_set1_ps(d.real());
28               __m512 data_i = _mm512_set1_ps(d.imag());
29               __m512 t7 = _mm512_mul_ps(data_r, sam);
30               __m512 t6 = _mm512_mul_ps(data_i, sam);
31               __m512 t8 = _mm512_swizzle_ps(t6,_MM_SWIZ_REG_CDAB);
32               __m512 t7c= t7;
33               __m512 t9 = _mm512_mask_sub_ps(t7c, 0x5555, t7, t8);
34               __m512 t9c= t9;
35               __m512 t10= _mm512_mask_add_ps(t9c, 0xAAAA, t9, t8);
36               gridptr[0] = _mm512_add_ps(gridptr[0], t10);
37             }//end suppu
38
39             for(int suppu=0;suppu<sSize_b;suppu++){
40                 grid[gind+suppu]=gridc[suppu];
41             }
42
43             for (int suppu = sSize_b; suppu < sSize; suppu++) {
44                 grid[gind+suppu] += d * C[cind+suppu];
45             }
46     }//end dind
47 }

As you can see above, this code multiplies each element of “C” by “d” and adds the results into the array “grid”. The memory for the arrays “grid” and “C” is allocated in another function with the following code:

grid = (Value *) _mm_malloc(gSize*gSize*sizeof(Value),64);
if(grid == NULL) exit (1);
C = (Value *) _mm_malloc(sizeofC*sizeof(Value),64);
if(C == NULL) exit (1);

Both arrays are 64-byte aligned. This code runs correctly on MIC.
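For reference, the arithmetic in rows 25 to 36 can be cross-checked against a plain scalar version. The sketch below is only an illustration: the function names `accumulateScalar` and `accumulateLanes` are hypothetical and not part of the benchmark, and the lane version merely imitates in scalar code what the multiply, swizzle and masked subtract/add sequence does on interleaved (real, imaginary) floats.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>

typedef std::complex<float> Value;

// Scalar reference (hypothetical helper): grid[i] += d * C[i] for n complex values.
void accumulateScalar(Value* grid, const Value* C, Value d, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        grid[i] += d * C[i];
    }
}

// Scalar imitation of the intrinsic sequence (hypothetical helper).
// grid and C are viewed as interleaved floats: re0, im0, re1, im1, ...
void accumulateLanes(float* grid, const float* C, float dr, float di,
                     std::size_t nFloats) {
    assert(nFloats % 2 == 0);  // lanes come in (real, imaginary) pairs
    for (std::size_t k = 0; k < nFloats; k += 2) {
        // t7 = data_r * sam
        float t7even = dr * C[k];
        float t7odd  = dr * C[k + 1];
        // t6 = data_i * sam
        float t6even = di * C[k];
        float t6odd  = di * C[k + 1];
        // t8 = swizzle CDAB of t6: neighbouring lanes are swapped
        float t8even = t6odd;
        float t8odd  = t6even;
        // mask 0x5555 selects even lanes (real part):      t7 - t8
        // mask 0xAAAA selects odd lanes (imaginary part):  t7 + t8
        grid[k]     += t7even - t8even;
        grid[k + 1] += t7odd + t8odd;
    }
}
```

Both functions should produce the same result for the same input, which gives a simple way to validate the vector kernel on a small test array before looking at performance.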

You may be curious why I use two temporary arrays, “gridc” and “Cc”, to hold pieces of “grid” and “C” before the computation, since this adds many memory copy operations and reduces performance. The reason is what happens when I delete that code (rows 17 to 21 and rows 39 to 41) and replace rows 25 and 26 with the following:

__m512 sam = _mm512_load_ps(( Real *) &C[cind + sl]);
__m512 *gridptr = (__m512 *) &grid[gind + sl];

There is a “Segmentation fault (signal 11)” error when it runs on the MIC card. The icpc version is 14.0.2.144 Build 20140120.
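One thing that could be checked here is whether the addresses passed to the load are still 64-byte aligned once the offsets are added. `_mm512_load_ps` expects a 64-byte-aligned address, and an aligned base pointer does not guarantee that `&grid[gind + sl]` or `&C[cind + sl]` is aligned too. The sketch below is only a diagnostic aid, not a confirmed cause of the fault. The helper names `isAligned` and `offsetKeepsAlignment` are hypothetical, and the second helper assumes that `sizeof(Value)` is 8 bytes.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <cstdint>

typedef std::complex<float> Value;

// True if address p is a multiple of `alignment` bytes (hypothetical helper).
inline bool isAligned(const void* p, std::size_t alignment) {
    assert(alignment != 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Given a 64-byte-aligned base pointer, &base[index] stays 64-byte aligned
// only if the byte offset index * sizeof(Value) is a multiple of 64.
// With 8-byte elements this means index must be a multiple of 8.
inline bool offsetKeepsAlignment(std::size_t index) {
    return (index * sizeof(Value)) % 64 == 0;
}
```

Inside the kernel, printing `isAligned(&grid[gind + sl], 64)` and `isAligned(&C[cind + sl], 64)` just before the load would show whether the offsets break the alignment.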

I don’t know where this error comes from or how to solve it.

Any advice?

Shaohua
