Intel is considering offering a four-hour, web-based introductory tutorial covering the fundamentals of integrating an Intel Xeon Phi coprocessor into a Linux-based cluster.
During the course, each attendee would have remote access to a Linux server and be able to carry out each step shown in the outline below. The course would be free of charge. The requirements are an Internet connection, a web browser, and PuTTY. We are still deciding which sharing technology to use and will announce it at a later date.
If you are interested in such a course, please reply to this forum thread. You can reply privately if you don't want to be identified.
If we have enough interest, we'll pull it together!
Topics:
- Finding information on the Intel Xeon Phi coprocessor on the web
- Download the driver software
- Unpacking the driver software package, explanation of components
- Discussion on prerequisites of the compute server (for instance what software needs to be installed, reserved IP addresses, user names, network file systems)
- Basic concepts (host, host OS, host kernel, coprocessor, Intel(R) MPSS stack, layout of files, boot image of the uOS, ramfs of the uOS)
- Recompiling HOST kernel packages; diagnose output and understand errors (necessary to work with nonstandard kernels)
- Install a minimal set of MPSS rpm packages using rpm
- Create a default MPSS configuration (using “micctrl --initdefaults”)
- Start up (i.e., boot) the coprocessor
- Connect via minicom to the coprocessor (this allows one to connect to the Intel Xeon Phi coprocessor WITHOUT figuring out network problems)
- Modify the uOS filesystem by overlaying an /etc/passwd file; reboot the coprocessor
- Create a bridged network on the host
- Configure the coprocessor for bridged networking by modifying micX.conf directly
- Reboot card and connect via ssh
- Set up an ssh key pair; diagnose ssh gotchas
- Mount an NFS file system on the coprocessor
- Configure a user known in the cluster by modifying the /etc/passwd file of the coprocessor
- Group up with neighbor – run MPI benchmark natively over Ethernet
- Recompile the MPSS OFED package to support a nonstandard kernel on the HOST
- Install MPSS-OFED rpms
- Start OFED on the coprocessor
- Group up with neighbor – run MPI benchmark natively over InfiniBand
- Create a minimal startup script wrapping everything up; this startup script can be used by a batch scheduling system to restart a coprocessor on behalf of a user before running a job.
- Where to find more resources or ask questions
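To give a rough idea of what the startup-script topic could look like, here is a minimal sketch only, not the course material. It assumes the card is named mic0 and that "micctrl -r" (reset), "micctrl -b" (boot) and "micctrl -w" (wait for the state change to complete) behave as described in the MPSS documentation; please check them against your MPSS version. The function names restart_plan and restart are hypothetical. By default the script only prints the commands and runs nothing.

```python
import shlex
import subprocess


def restart_plan(card="mic0"):
    """Return the command sequence for one restart cycle of a single card.

    The micctrl options used here are assumptions to verify against
    the documentation of the installed MPSS version.
    """
    return [
        ["micctrl", "-r", card],  # assumed: reset the card
        ["micctrl", "-w", card],  # assumed: wait until the reset has finished
        ["micctrl", "-b", card],  # assumed: boot the card
        ["micctrl", "-w", card],  # assumed: wait until the boot has finished
    ]


def restart(card="mic0", dry_run=True):
    """Print-friendly restart: returns the command lines that were planned.

    With dry_run=True (the default) nothing is executed. With
    dry_run=False each command is run and a failure raises an error.
    """
    lines = []
    for cmd in restart_plan(card):
        lines.append(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
    return lines


if __name__ == "__main__":
    # Dry run: show what a batch scheduler prologue would execute.
    print("\n".join(restart("mic0")))
```

A batch scheduler could call something like this from a job prologue with dry_run=False. Keeping the plan separate from the execution makes it easy to review the commands before letting them touch a card.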
In future (advanced) courses we might want to talk about topics like:
- Change the micX.conf configuration to adapt to diskless clients
- The tools (micinfo, micrasd, …)
- Logging – syslog, sar (how to enable logging, typical output and problems to look out for)
- cron: maintenance tasks such as log rotation; detecting issues such as missing daemons and other problems
- Adding more programs/libraries/files to the coprocessors (including pros and cons)
- Troubleshooting startups/shutdowns
- Upgrading the MPSS stack on an established installation
- Diagnose system and coprocessor health
- Customizing installation scripts