performance difference between AO and CAO
Hi, I see performance differences between the AO and CAO models when calling MKL zgemm (or dgemm) routines. In my tests, AO works well as expected, but CAO shows poor performance compared to AO. For...
Free Online training on Parallel Programming and Optimization
Colfax is offering free Web-based workshops on Parallel Programming and Optimization for Intel® Architecture, including Intel® Xeon® processors and Intel® Xeon Phi™ coprocessors. Workshops include 20...
System board for Intel Xeon Phi S5120D
Hi. Are there system boards compatible with Intel Xeon Phi S5120D (DFF form factor, PCI Express x24, 230 pins)? For example: Is it possible to use S5120D with Intel System Board S2600GZ (it has two PCIe...
Windows 10 support
Dear Colleagues, I am in the process of getting a workstation with a 3120A Xeon Phi card. Due to the Windows 7 memory limit (max 192 GB), I am forced to consider Windows 8 or Windows 10. I am...
OpenMP 4.0 target offload Report
Hi, I am trying to compare offload statistics using 1) Intel compiler-assisted offload vs. 2) the OpenMP 4.0 target construct. My question: how can I get an OpenMP 4.0 offload report (which...
No Cost Options for Intel Integrated Performance Primitives Library (IPP),...
The Intel® Integrated Performance Primitives Library (Intel® IPP), a high performance library with thousands of optimized functions for x86 and x86-64, is available for free for everyone (click here to...
Compile OpenMP or MPI Fortran code for Intel Phi
Hi everyone, Here is my problem: I have two different programs, one in Fortran / MPI and one in Fortran / OpenMP, and I would like to compile them in order to have them running on an Intel Xeon Phi. I just...
How to compile SSE intrinsic code in KNL
Hello Sir or Madam, As we know, KNC does not support SSE... or AVX...; it only supports IMCI instructions. So SSE intrinsic code can't be compiled for KNC. How about KNL? KNL supports SSE...SSE4.2 and AVX...
MIC on Ubuntu 15.04
Hi, I am trying to build the mic module for Ubuntu. After following https://software.intel.com/en-us/forums/intel-many-integrated-core/topic...,...
A Brief Survey of NUMA (Non-Uniform Memory Architecture) Literature
This document presents a list of articles on NUMA (Non-uniform Memory Architecture) that the author considers particularly useful. The document is divided into categories corresponding to the type of...
Undefined MKL symbol when calling from within offloaded region
Hello, In an offloaded region of a Fortran90 application, I want to call MKL routines (dgetri/dgetrf) in a sequential way, that is, each thread on the MIC calls these routines with its own data. They...
MKL: Cholesky decomposition error with Xeon Phi
Hi, I have a simple C++ code that calls the LAPACK dpotrf function to do Cholesky decomposition, as well as dgetrf and dgetri. I got very weird behavior on a Xeon server with 6 Xeon Phi cards. 1) Performance: For...
Finite Differences on Heterogeneous Distributed Systems
Here we exemplify how to expand Finite Difference (FD) computational kernels to run on distributed systems. Additionally, we describe a technique that shows how to deal with the...
Where to find detailed code examples for offload in Fortran under both Linux...
Hi, I must congratulate Intel and closely linked companies for the massive and detailed information on how to introduce parallelization in computation-heavy codes! I also happened to win a sample of the...
Finding elementwise and conditional matrix multiplication implementation with...
Hi all, I have been looking for an MKL version of elementwise matrix multiplication that works based on a conditional approach. While Vmult can be used, it is only for a 1D vector rather than a matrix. Below...
New article “Finite Differences on Heterogeneous Distributed Systems”
New article “Finite Differences on Heterogeneous Distributed Systems” http://software.intel.com/en-us/articles/finite-differences-on-heterogeneous-distributed-systems exemplifies cluster...
Intrinsic to down-convert all 8 elements of i64 vectors to lower/higher 8...
Is there such a thing? I think the pack/unpack intrinsics are somewhere close, but I could not understand exactly what they do. It seems fairly basic and I almost feel stupid asking this, but I would really...
Was printenv removed in the latest MPSS?
The printenv command exists in MPSS version 3-2.1.6720-16, but it seems to have been removed in 3.5. Does anyone know the reason?
host-device bandwidth problem
Dear forum, I'm testing the host-device bandwidth using the dapl fabric and Intel MPI (Isend/Irecv/Wait). 1.5 GB of data are repeatedly sent back and forth. The initial result is: host to device: ~5.6 GB/sec...
GLIBC_PRIVATE not defined in file ld-linux-x86-64.so.2 with link time reference
We are trying to set up a Gentoo-based system (4.0.5 kernel) as a development workstation capable of compiling offload applications for Xeon Phi. Because Gentoo does not work with RPMs, we installed...