Channel: Intel® Many Integrated Core Architecture

Simple Offload Example Failing

Hello,

I'm attempting to run a simple offload example: 

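#include <stdio.h>
#include <omp.h>

int main(void) {
    double sum = 0.0;
    int i, n = 2000000000, nt = 0;

    /* Run this block on coprocessor 0. The scalars sum, n and nt are
       copied to the device and back by default. */
    #pragma offload target(mic:0)
    {
        /* Sum 1..n across the thread team. */
        #pragma omp parallel for reduction(+:sum)
        for (i = 1; i <= n; i++) {
            sum = sum + i;
        }

        //nt = omp_get_max_threads();
        /* Record the size of the thread team actually created. */
        #pragma omp parallel
        {
            #pragma omp single
            nt = omp_get_num_threads();
        }

    /* __MIC__ is defined only when compiling for the coprocessor. */
    #ifdef __MIC__
        printf("Hello MIC reduction %f threads: %d\n", sum, nt);
    #else
        printf("Hello CPU reduction %f threads: %d\n", sum, nt);
    #endif
    }

    return 0;
}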
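
Once the offload works again, the printed sum can be checked against the closed form n(n+1)/2, which for n = 2000000000 is 2000000001000000000. The helpers below are a minimal sketch of such a check; the names triangular_sum and relative_error are hypothetical and not part of the original program, and n is assumed to be at least 1.

```c
/* Closed form for 1 + 2 + ... + n (assumes n >= 1).
   Halving the even factor first keeps the intermediate product small,
   so n = 2000000000 fits comfortably in a long long. */
long long triangular_sum(long long n) {
    if (n % 2 == 0)
        return (n / 2) * (n + 1);
    return n * ((n + 1) / 2);
}

/* Relative difference between a computed reduction and the reference.
   A parallel floating-point reduction may differ slightly from the
   exact value because of rounding in the partial sums. */
double relative_error(double got, long long want) {
    double diff = got - (double)want;
    if (diff < 0.0)
        diff = -diff;
    return diff / (double)want;
}
```

After the offload block, relative_error(sum, triangular_sum(n)) should be very small if the reduction ran correctly on either the host or the coprocessor.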

This program used to run fine, but since we recently rebooted the Phi nodes in our cluster the offload example no longer runs. Natively compiled MIC binaries still run without a problem after the reboot.

Before running, I set up the environment and build with:

. /usr/local/intel/ClusterStudioXE_2013/composer_xe_2013_sp1/bin/compilervars.sh intel64
make
export MIC_OMP_NUM_THREADS=120
export MIC_ENV_PREFIX=MIC
export OFFLOAD_REPORT=3

Here is my Makefile:

CC=icc
CFLAGS=-std=c99 -O3 -vec-report3 -openmp -offload
EXE=reduce_offload_mic

$(EXE) : reduce_omp_mic.c
	$(CC) -o $@ $< $(CFLAGS)

.PHONY: clean

clean:
	rm -f $(EXE)

However, when I run the program I get this output:

[frenchwr@vmp903 Offload]$ ./reduce_offload_mic
offload error: cannot offload to MIC - device is not available
[Offload] [HOST]  [State]   Unregister data tables

I have ensured that mpss is running and even restarted the service with:

sudo service mpss restart

but I still get the same error, even after rebuilding the executable.
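
One way to narrow this down might be to ask the offload runtime directly how many coprocessors it can see, before any offload region runs. The sketch below assumes the Intel compiler's offload.h header and its _Offload_number_of_devices() call, guarded by the __INTEL_OFFLOAD macro so that the file still builds with other compilers; the helper names visible_mic_devices and describe_device_count are hypothetical.

```c
#ifdef __INTEL_OFFLOAD
#include <offload.h>   /* Intel offload runtime API (assumed available) */
#endif

/* Number of coprocessors the offload runtime reports, or -1 when the
   file was compiled without Intel offload support. */
int visible_mic_devices(void) {
#ifdef __INTEL_OFFLOAD
    return _Offload_number_of_devices();
#else
    return -1;
#endif
}

/* Human-readable summary of the count returned above. */
const char *describe_device_count(int count) {
    if (count < 0)
        return "built without Intel offload support";
    if (count == 0)
        return "offload runtime reports no coprocessors";
    return "offload runtime reports at least one coprocessor";
}
```

Printing describe_device_count(visible_mic_devices()) at the top of main() would show whether the runtime's view of the devices matches what miccheck and micinfo report for the same user account.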

All of the miccheck tests pass:

[frenchwr@vmp903 Offload]$ miccheck
MicCheck 3.4-r1
Copyright 2013 Intel Corporation All Rights Reserved

Executing default tests for host
  Test 0: Check number of devices the OS sees in the system ... pass
  Test 1: Check mic driver is loaded ... pass
  Test 2: Check number of devices driver sees in the system ... pass
  Test 3: Check mpssd daemon is running ... pass
Executing default tests for device: 0
  Test 4 (mic0): Check device is in online state and its postcode is FF ... pass
  Test 5 (mic0): Check ras daemon is available in device ... pass
  Test 6 (mic0): Check running flash version is correct ... pass
  Test 7 (mic0): Check running SMC firmware version is correct ... pass
Executing default tests for device: 1
  Test 8 (mic1): Check device is in online state and its postcode is FF ... pass
  Test 9 (mic1): Check ras daemon is available in device ... pass
  Test 10 (mic1): Check running flash version is correct ... pass
  Test 11 (mic1): Check running SMC firmware version is correct ... pass

Status: OK

Here's the output from micinfo:

[frenchwr@vmp903 Offload]$ micinfo
MicInfo Utility Log
Created Fri Aug 28 18:14:23 2015


	System Info
		HOST OS			: Linux
		OS Version		: 2.6.32-431.29.2.el6.x86_64
		Driver Version		: 3.4-1
		MPSS Version		: 3.4
		Host Physical Memory	: 132110 MB

Device No: 0, Device Name: mic0

	Version
		Flash Version 		 : 2.1.02.0390
		SMC Firmware Version	 : 1.16.5078
		SMC Boot Loader Version	 : 1.8.4326
		uOS Version 		 : 2.6.38.8+mpss3.4
		Device Serial Number 	 : ADKC42900304

	Board
		Vendor ID 		 : 0x8086
		Device ID 		 : 0x225c
		Subsystem ID 		 : 0x7d95
		Coprocessor Stepping ID	 : 2
		PCIe Width 		 : Insufficient Privileges
		PCIe Speed 		 : Insufficient Privileges
		PCIe Max payload size	 : Insufficient Privileges
		PCIe Max read req size	 : Insufficient Privileges
		Coprocessor Model	 : 0x01
		Coprocessor Model Ext	 : 0x00
		Coprocessor Type	 : 0x00
		Coprocessor Family	 : 0x0b
		Coprocessor Family Ext	 : 0x00
		Coprocessor Stepping 	 : C0
		Board SKU 		 : C0PRQ-7120 P/A/X/D
		ECC Mode 		 : Enabled
		SMC HW Revision 	 : Product 300W Passive CS

	Cores
		Total No of Active Cores : 61
		Voltage 		 : 1037000 uV
		Frequency		 : 1238095 kHz

	Thermal
		Fan Speed Control 	 : N/A
		Fan RPM 		 : N/A
		Fan PWM 		 : N/A
		Die Temp		 : 46 C

	GDDR
		GDDR Vendor		 : Samsung
		GDDR Version		 : 0x6
		GDDR Density		 : 4096 Mb
		GDDR Size		 : 15872 MB
		GDDR Technology		 : GDDR5
		GDDR Speed		 : 5.500000 GT/s
		GDDR Frequency		 : 2750000 kHz
		GDDR Voltage		 : 1501000 uV

Device No: 1, Device Name: mic1

	Version
		Flash Version 		 : 2.1.02.0390
		SMC Firmware Version	 : 1.16.5078
		SMC Boot Loader Version	 : 1.8.4326
		uOS Version 		 : 2.6.38.8+mpss3.4
		Device Serial Number 	 : ADKC42900319

	Board
		Vendor ID 		 : 0x8086
		Device ID 		 : 0x225c
		Subsystem ID 		 : 0x7d95
		Coprocessor Stepping ID	 : 2
		PCIe Width 		 : Insufficient Privileges
		PCIe Speed 		 : Insufficient Privileges
		PCIe Max payload size	 : Insufficient Privileges
		PCIe Max read req size	 : Insufficient Privileges
		Coprocessor Model	 : 0x01
		Coprocessor Model Ext	 : 0x00
		Coprocessor Type	 : 0x00
		Coprocessor Family	 : 0x0b
		Coprocessor Family Ext	 : 0x00
		Coprocessor Stepping 	 : C0
		Board SKU 		 : C0PRQ-7120 P/A/X/D
		ECC Mode 		 : Enabled
		SMC HW Revision 	 : Product 300W Passive CS

	Cores
		Total No of Active Cores : 61
		Voltage 		 : 1040000 uV
		Frequency		 : 1238095 kHz

	Thermal
		Fan Speed Control 	 : N/A
		Fan RPM 		 : N/A
		Fan PWM 		 : N/A
		Die Temp		 : 47 C

	GDDR
		GDDR Vendor		 : Samsung
		GDDR Version		 : 0x6
		GDDR Density		 : 4096 Mb
		GDDR Size		 : 15872 MB
		GDDR Technology		 : GDDR5
		GDDR Speed		 : 5.500000 GT/s
		GDDR Frequency		 : 2750000 kHz
		GDDR Voltage		 : 1501000 uV

From searching online, I see that a few other users have run into the same error:

offload error: cannot offload to MIC - device is not available
[Offload] [HOST]  [State]   Unregister data tables

but I don't see any good resolution other than restarting mpss, which does not resolve the issue for me.