Simple offloaded code, enormously time-consuming

Dear all,

I recently started using Xeon Phi cards for parallel programming, so I am still a newbie in this field.

I wrote the code below as a simple example to start understanding this fascinating world, but I was surprised when I looked at the execution times.

When I run the code on the host, the execution time is 0.08 s. When I add the offload pragma and the omp parallel for pragma, the execution time increases to 9 s!

I compiled both versions with -O3 optimization.

Is there something I am missing?

Thanks for your help,

Flavio

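#include <stdio.h>
#include <stdlib.h>     /* calloc */
#include <time.h>       /* localtime, strftime */
#include <sys/time.h>   /* gettimeofday */
#ifdef _OPENMP
#include <omp.h>
#endif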

#define ALLOC alloc_if(1) free_if(0)
#define RETAIN alloc_if(0) free_if(0)
#define FREE alloc_if(0) free_if(1)

#define LD long double

#define MAX 100000

int main(int argc, char **argv)
{
    int i, j;
    LD *M = NULL;
    __declspec(target(mic))int cycles = 240;

    printf("array length: %d\n", cycles);

    //start time: first half of a spreadsheet-style formula (seconds,microseconds)
    char            fmt[64], buf1[64], buf2[64];
    struct timeval  tv;
    struct tm       *tm;
    gettimeofday(&tv, NULL);
    if((tm = localtime(&tv.tv_sec)) != NULL){
        //3600 seconds per hour; tv_usec is printed as a long
        strftime(fmt, sizeof fmt, "((%H*3600)+(%M*60)+%S,%%06ld)", tm);
        snprintf(buf1, sizeof buf1, fmt, (long)tv.tv_usec);
    }

    //array creation
    M = (LD*)calloc(cycles, sizeof(LD));

    //allocating space on MIC
    #pragma offload target(mic) in(M:length(cycles) ALLOC)
    {}

    //MAX separate offloads, each copying M in and out
    for (i=0; i<MAX; i++){
        #pragma offload target(mic) inout(M:length(cycles) RETAIN) \
                                    in(cycles)
        {
            #pragma omp parallel for private(j)
            #pragma ivdep
            for (j=0; j<cycles; j++)
                M[j] += 1;
        } //offload
    } //for

    //freeing space on MIC
    #pragma offload target(mic) nocopy(M:length(0) FREE)
    {}

    printf("number of cycles: %LG\n", M[0]);

    //end time: prints "=(end)-(start)" for pasting into a spreadsheet
    gettimeofday(&tv, NULL);
    if((tm = localtime(&tv.tv_sec)) != NULL){
        strftime(fmt, sizeof fmt, "=((%H*3600)+(%M*60)+%S,%%06ld)", tm);
        snprintf(buf2, sizeof buf2, fmt, (long)tv.tv_usec);
        printf("%s-%s\n", buf2, buf1);
    }

    return 0;
} //main
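
For comparison, the loop in the program above issues MAX = 100000 separate offloads, and the inout clause copies M to the card and back on each one, so 9 s works out to roughly 90 microseconds per offload. A minimal sketch of one possible restructuring, which moves the outer loop inside a single offload region, might look like the following. The function name add_ones_single_offload and its repeats parameter are hypothetical and not part of the original code. It assumes the Intel compiler's offload pragmas and has not been timed on a Xeon Phi; a compiler that does not know these pragmas ignores them and runs the loop on the host.

```c
/* Hypothetical helper (not from the original post): adds 1 to every
 * element of M, `repeats` times, using one offload instead of many. */
void add_ones_single_offload(long double *M, int cycles, int repeats)
{
    /* Single offload region: M is transferred to the coprocessor once
     * on entry and copied back once on exit. */
    #pragma offload target(mic) inout(M:length(cycles)) in(cycles) in(repeats)
    {
        for (int i = 0; i < repeats; i++) {
            /* j is declared in the loop header, so it is private per thread */
            #pragma omp parallel for
            for (int j = 0; j < cycles; j++)
                M[j] += 1;
        }
    }
}
```

In the program above, the outer for loop together with its offload pragma would be replaced by a call such as add_ones_single_offload(M, cycles, MAX). The separate allocation and free offloads would then be unnecessary, because the default alloc_if(1) free_if(1) behaviour allocates and frees the buffer around the single offload.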
