
Intel Compiler, MKL, MPI

Last update: 27 Nov. 2024


The Intel compiler is a component of the Intel oneAPI Base & HPC Toolkit. This page mainly explains how to compile Fortran, C, and C++ programs and how to write job scripts for batch execution.


Available users

| Kyushu Univ. users | Academic users | Non-academic users |
|---|---|---|
| OK | OK | OK |

Module

| Module name | Version |
|---|---|
| intel | oneAPI 2023.2 |
| intel/2024.1 | oneAPI 2024.1 |
Refer to the following page for the usage of modules:
Module usage

Commands

Commands of the Intel compiler for compilation and linkage are as follows:

| | Language | Command | Optimization option*1 | OpenMP option*2 |
|---|---|---|---|---|
| Parallel without MPI | Fortran | ifort | -xCORE-AVX512 | -qopenmp |
| | C | icc | | |
| | C++ | icpc | | |
| Parallel with MPI*3 | Fortran | mpiifort | -xCORE-AVX512 | -qopenmp |
| | C | mpiicc | | |
| | C++ | mpiicpc | | |

*1 Recommended option for creating optimized executables.
*2 OpenMP is disabled unless this option is specified.
*3 Load the MPI library module before compiling MPI-parallel programs.
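For example, a thread-parallel Fortran program can be built with the recommended optimization option as follows (a sketch; sample.f90 is a placeholder file name):

$ ifort -xCORE-AVX512 -qopenmp sample.f90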


Compilation and Linkage

Setup Environment (module)

The module command must be executed both at compile time and at run time.

oneAPI 2024.1

$ module load intel/2024.1

oneAPI 2023.2

$ module load intel

Intel MPI

$ module load intel
$ module load impi
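To confirm which modules are currently loaded, the standard module list command can be used:

$ module list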


C/C++

Use the icc / icpc commands for C/C++ programs without MPI, and the mpiicc / mpiicpc commands for C/C++ programs with MPI. (The following examples show commands for C programs; change the command to icpc or mpiicpc when compiling C++ programs.)

Example 1) Serial C program

$ icc sample.c

Example 2) Thread parallel C program with OpenMP

$ icc -qopenmp sample.c

Example 3) Process parallel C program with MPI

$ mpiicc sample.c

Example 4) Hybrid parallel C program with OpenMP and MPI

$ mpiicc -qopenmp sample.c


Fortran

Use the ifort command for compiling and linking Fortran programs. To compile parallel programs with MPI, use the mpiifort command.

Example 1) Serial Fortran program

$ ifort sample.f90

Example 2) Thread parallel Fortran program with OpenMP

$ ifort -qopenmp sample.f90

Example 3) Process parallel Fortran program with MPI

$ mpiifort sample.f90

Example 4) Hybrid parallel Fortran program with OpenMP and MPI

$ mpiifort -qopenmp sample.f90


Frequently Used Options

| Option | Description |
|---|---|
| -c | Create object files only. |
| -o filename | Specify the name of the executable file. "a.out" is used by default. |
| -On | Specify the optimization level (n=0–3). n=0 means no optimization. -O2 is used by default. |
| -fast | Use a recommended set of optimization options (same as -xHOST -O3 -ipo -no-prec-div -static -fp-model fast=2). |
| -qopenmp | Enable OpenMP. Disabled by default. |
| -opt-report[n] | Report on the optimizations applied during compilation. A larger n gives more detailed information (the default n is 2; no report is generated when n=0). |
| -opt-report-routine=string | Report on the optimizations applied to functions or subroutines whose names include the specified string. |
| -qmkl | Link the Intel Math Kernel Library (MKL), which mainly provides tuned BLAS, LAPACK, vector math routines, etc. Refer to the next section for details. |
| -mt_mpi (Intel MPI) | Link the thread-safe MPI library. Specify this option for programs that use MPI_THREAD_FUNNELED, MPI_THREAD_SERIALIZED, or MPI_THREAD_MULTIPLE as the thread support level. When -qopenmp or -parallel is specified, the thread-safe MPI library is linked even without this option. |
| -ilp64 (Intel MPI) | Treat all integers passed to MPI functions as 64-bit integers (ILP64 support). Use this when the -i8 option is specified for Fortran programs. |
| -free or -nofixed (Fortran only) | Indicate that the source code is written in free format. |
| -fixed or -nofree (Fortran only) | Indicate that the source code is written in fixed format. |
| -extend-source number | Indicate that the source code is written in fixed format with a line length of number. The default value of number is 132. |

Refer to the Intel HPC Toolkit documentation for detailed information about these options.
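As a sketch of how these options can be combined on one command line (sample.c and sample.out are placeholder names):

$ icc -O3 -qopenmp -o sample.out sample.c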


Intel Math Kernel Library

Intel Math Kernel Library (MKL) is a library of numerical routines tuned for Intel processors. Specify the -qmkl option to link this library.
This option accepts sub-options: -qmkl=sequential links the non-threaded library, -qmkl=parallel links the thread-parallelized library, and -qmkl=cluster links the MPI-parallelized routines such as ScaLAPACK, cluster FFT, and direct sparse solvers. -qmkl without a sub-option is equivalent to -qmkl=parallel.
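For example (sketches; sample.c is a placeholder file name):

$ icc -qmkl=sequential sample.c
$ mpiicc -qmkl=cluster sample.c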

MKL has the following functions:

BLAS, LAPACK, ScaLAPACK, BLACS, PBLAS, Sparse BLAS, sparse matrix operations (including PARDISO), Fourier transforms, partial differential equation solvers, nonlinear optimization solvers, data-fitting functions, GNU Multiple Precision (GMP) functions, the vectorized math library (VML), statistical functions (including pseudo-random number generators), etc.


Batch Job

On subsystems A / B of GENKAI, programs must be executed as batch jobs.

Refer to the following page for details about batch jobs:
Batch Job

The following are example scripts for running programs compiled with the Intel compiler as batch jobs.
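Assuming a script below is saved as run.sh (a placeholder name), it is submitted with the pjsub command, and its status can be checked with pjstat; see the Batch Job page above for details:

$ pjsub run.sh
$ pjstat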


Example 1) Serial Program

#!/bin/bash

#PJM -L rscgrp=a-batch
#PJM -L vnode-core=1
#PJM -L elapse=0:10:00
#PJM -j

module load intel

./a.out
  • 1 node, 1 process, 1 core (1 thread)
  • Maximum Execution Time: 10 minutes
  • Store both standard output and standard error into the same file

Example 2) Thread Parallel Program (OpenMP)

#!/bin/bash

#PJM -L rscgrp=a-batch
#PJM -L vnode-core=16
#PJM -L elapse=1:00:00
#PJM -j

module load intel
export OMP_NUM_THREADS=16

./a.out
  • 1 node, 1 process, 16 cores (16 threads)
  • Maximum Execution Time: 1 hour

Example 3) Process Parallel Program (MPI)

#!/bin/bash

#PJM -L rscgrp=a-batch
#PJM -L vnode-core=16
#PJM --mpi proc=16
#PJM -L elapse=1:00:00
#PJM -j

module load intel
module load impi

mpiexec ./a.out
  • 1 node, 16 processes (16 cores)

Example 4) Hybrid Parallel Programs (MPI + OpenMP)

#!/bin/bash

#PJM -L rscgrp=a-batch
#PJM -L vnode-core=32
#PJM --mpi proc=4
#PJM -L elapse=2:00:00
#PJM -j

module load intel
module load impi
export OMP_NUM_THREADS=8

mpiexec ./a.out
  • 1 node, 4 processes, 8 threads per process (4 × 8 = 32 cores)

Example 5) Process Parallel Program (MPI) on a single node (exclusive use)

#!/bin/bash

#PJM -L rscgrp=a-batch
#PJM -L node=1
#PJM --mpi proc=32
#PJM -L elapse=2:00:00
#PJM -j

module load intel
module load impi

mpiexec ./a.out

Example 6) Process Parallel Program (MPI) on multiple nodes (exclusive use)

#!/bin/bash

#PJM -L rscgrp=a-batch
#PJM -L node=2
#PJM --mpi proc=120
#PJM -L elapse=2:00:00
#PJM -j

module load intel
module load impi

mpiexec ./a.out
