Channel: Intel® Many Integrated Core Architecture

Missing compiler prefetches in intrinsics code with linear memory access

August 28, 2014, 5:45 pm

Hi all,

The Intel Compiler 14.0.3 does not insert software prefetches for the following test program, which accesses memory linearly:

#include <iostream>
#include <immintrin.h>

int main() {

    const int elements = 1e7;

    const int mem_size = 16 * elements * sizeof(float); // 640 MB

    float *vec_a = (float*)_mm_malloc( mem_size, 64 );
    float *vec_b = (float*)_mm_malloc( mem_size, 64 );

    // initialization
    for ( int i = 0; i < 16*elements ; ++i ) {

        vec_a[i] = 0.8f;
        vec_b[i] = 0.6f;
    }

    #pragma omp parallel
    {
        const __m512 mass_ = _mm512_set1_ps( 0.123f );

        __m512 vec_a_, vec_b_;

        #pragma omp for schedule(static)
        for ( int i = 0; i < 16*elements ; i += 16 ) {

            vec_a_ = _mm512_load_ps( vec_a + i );
            vec_b_ = _mm512_load_ps( vec_b + i );

            vec_a_ = _mm512_fmadd_ps( mass_, vec_a_, vec_b_ );

            _mm512_storenrngo_ps( vec_b + i, vec_a_ );
        }
    }

    // prevent deadcode optimizations
    float delta = 0.0f;

    for ( int i = 0; i < 16*elements ; ++i ) {

        delta += vec_b[i];
    }

    std::cout << delta << std::endl;

    _mm_free( vec_a );
    _mm_free( vec_b );
}

The compiler generates the following assembly (icpc -O3 -mmic -openmp -S -masm=intel linear.cpp):

..B1.36:
             mov       r8, QWORD PTR [r13]
             add       rcx, 16
             vmovaps   zmm0, ZMMWORD PTR [r8+rax]
             mov       dl, dl
             mov       r9, QWORD PTR [r14]
             vfmadd213ps zmm0, zmm1, ZMMWORD PTR [r9+rax]
             vmovnrngoaps ZMMWORD PTR [r9+rax], zmm0
             add       rax, 64
             cmp       rcx, rdx
             jle       ..B1.36

So there are no software prefetches. Of course, I could insert prefetch intrinsics by hand, but I would expect the compiler to do a much better job of that for a linear memory access pattern. I tried #pragma prefetch and -opt-prefetch=4, with no success. It seems to be a compiler problem, because the Intel compiler 15.0b does insert prefetch instructions.

However, the current 15.0b compiler generates code that is 30% slower for my larger program.

So my question is: How can I force the 14.0 compiler to insert software prefetches for linear intrinsics code?
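
For reference, the manual fallback mentioned above could look roughly like the sketch below. It is a portable scalar stand-in for the intrinsics loop, not the Xeon Phi code itself: it uses the GCC/Clang builtin __builtin_prefetch where the coprocessor code would presumably call _mm_prefetch with a hint such as _MM_HINT_T0. The function name fma_with_prefetch and the constant kPrefetchDistance are made up for illustration, and the distance of 8 cache lines is an untuned guess that would have to be measured.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical prefetch distance in elements: 8 cache lines of 16 floats.
// This is an assumption, not a tuned value; it has to be found by measurement.
constexpr std::size_t kPrefetchDistance = 16 * 8;

// Computes b[i] = mass * a[i] + b[i], the same update as the intrinsics loop,
// while requesting the cache lines needed kPrefetchDistance elements ahead.
void fma_with_prefetch(const std::vector<float>& a, std::vector<float>& b, float mass) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();

    // Step through the arrays in blocks of 16 floats (one 64-byte cache line).
    for (std::size_t i = 0; i < n; i += 16) {
        if (i + kPrefetchDistance < n) {
            // Second argument: 0 = prefetch for read, 1 = prefetch for write.
            // Third argument: 3 = keep the line in all cache levels.
            __builtin_prefetch(&a[i + kPrefetchDistance], 0, 3);
            __builtin_prefetch(&b[i + kPrefetchDistance], 1, 3);
        }

        // Scalar stand-in for the 16-wide load / fmadd / store sequence.
        const std::size_t end = (i + 16 < n) ? i + 16 : n;
        for (std::size_t j = i; j < end; ++j) {
            b[j] = mass * a[j] + b[j];
        }
    }
}
```

Calling fma_with_prefetch(vec_a, vec_b, 0.123f) on two equally sized vectors performs the same update as the test program. Keeping this in sync by hand for every loop in a larger code base is exactly the work the compiler would ideally take over.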

Thanks,

Patrick
