Channel: Intel® Many Integrated Core Architecture

calling _mm512_i32extgather_epi32 emits invalid upconv argument error

Hi, I've been working with Xeon Phi to get optimal performance out of a simple offloaded function that performs a table lookup and returns the result.

Below is the line that's causing the problem:

__m512i vec = _mm512_i32extgather_epi32(v_index, p_lookup_table, _MM_UPCONV_EPI32_UINT16, sizeof(uint16_t), _MM_HINT_NT);

The table has been allocated with _mm_malloc with 4 KB alignment for DMA. Each entry is a 16-bit unsigned integer, hence the up-conversion to 32-bit integers and the 2-byte scale (sizeof(uint16_t)).
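To make the intent of that call concrete, here is a minimal scalar sketch of what I expect the gather to compute: for each of the 16 lanes, read a 16-bit unsigned entry at base + index * scale and zero-extend it to 32 bits. This is plain C, not the intrinsic itself, and the names gather_u16_to_u32_ref and GATHER_LANES are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GATHER_LANES 16 /* a 512-bit vector holds 16 32-bit lanes */

/* Hypothetical scalar reference, not an Intel API.
 * For each lane: load a uint16_t from base + index[lane] * scale,
 * then zero-extend it to 32 bits (the UINT16 -> EPI32 up-conversion). */
void gather_u16_to_u32_ref(const int32_t index[GATHER_LANES],
                           const void *base,
                           size_t scale,
                           uint32_t out[GATHER_LANES])
{
    const unsigned char *bytes = (const unsigned char *)base;

    for (int lane = 0; lane < GATHER_LANES; ++lane) {
        uint16_t entry;
        /* memcpy avoids alignment and aliasing issues on the byte pointer */
        memcpy(&entry,
               bytes + (ptrdiff_t)index[lane] * (ptrdiff_t)scale,
               sizeof entry);
        out[lane] = (uint32_t)entry; /* zero-extension */
    }
}
```

With scale set to sizeof(uint16_t), index[lane] is simply an element index into the table. A reference like this is also handy for checking the vectorized result once the intrinsic compiles.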

Apparently, however, the compiler does not accept this as a valid up-conversion argument to the intrinsic and reports an "invalid upconv argument" error.

I believe I have included the right header (immintrin.h) and that the indices stay within the table, so they should not cause segfaults (although I don't think that is what this error is about).

Any thoughts or opinions will be greatly appreciated.

Also, I'm accessing these lookup tables from all threads. Is there any way to replicate the tables once per memory controller (8 in total) and bind each copy to its own controller to maximize read throughput? (Somewhat like running an application under numactl, or writing code against <numa.h>.)
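For reference, the replication half of that idea could be sketched roughly as below. It only creates 4 KB-aligned copies and picks one per thread by thread id modulo the number of copies; it does not bind a copy to any particular memory controller, which is the part I don't know how to do. The function names are hypothetical, and aligned_alloc (C11) stands in for _mm_malloc so the sketch builds with any compiler.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TABLE_ALIGN 4096u /* same 4 KB alignment as the original table */

/* Free the first n_copies replicas and the pointer array itself. */
void free_replicas(uint16_t **copies, size_t n_copies)
{
    if (!copies) return;
    for (size_t c = 0; c < n_copies; ++c) free(copies[c]);
    free(copies);
}

/* Hypothetical helper: make n_copies aligned copies of a uint16_t table.
 * Returns NULL on allocation failure or if either count is zero. */
uint16_t **replicate_table(const uint16_t *src, size_t n_entries,
                           size_t n_copies)
{
    if (!src || n_entries == 0 || n_copies == 0) return NULL;

    size_t bytes = n_entries * sizeof(uint16_t);
    /* aligned_alloc requires the size to be a multiple of the alignment */
    size_t padded = (bytes + TABLE_ALIGN - 1) / TABLE_ALIGN * TABLE_ALIGN;

    uint16_t **copies = calloc(n_copies, sizeof *copies);
    if (!copies) return NULL;

    for (size_t c = 0; c < n_copies; ++c) {
        copies[c] = aligned_alloc(TABLE_ALIGN, padded);
        if (!copies[c]) {
            free_replicas(copies, c); /* release what was allocated so far */
            return NULL;
        }
        memcpy(copies[c], src, bytes);
    }
    return copies;
}

/* Spread threads across the replicas round-robin by thread id. */
const uint16_t *replica_for_thread(uint16_t *const *copies, size_t n_copies,
                                   size_t thread_id)
{
    return copies[thread_id % n_copies];
}
```

Each thread would then pass replica_for_thread(copies, 8, thread_id) as the gather base pointer instead of the shared table.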

