mpi4py
MPI for Python (mpi4py) provides bindings of the Message Passing Interface (MPI) standard for the Python programming language, allowing any Python program to exploit multiple processors. This article explains how to install mpi4py in your home directory on the cluster in such a way that it can be used with different clusters, compilers and MPI libraries.
The Problem
mpi4py needs to be built for a specific combination of the following:
- Compiler
- MPI flavour
- CPU type
- Interconnect
- Python version
This means that if it is installed with pip install --user mpi4py, it will be tied to the exact combination of the aforementioned requirements used at install time.
On SCITAS clusters there is an environment variable, $SYS_TYPE, that captures the CPU type and the interconnect.
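For example (the value shown is just one possibility):

```bash
echo $SYS_TYPE     # prints something like x86_E5v4_Mellanox
```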

The solution to this is to use the Python virtualenv package, which allows multiple Python environments to co-exist.
Step-by-step guide
Make a directory for all your virtualenv projects (optional)
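A sketch (the directory name ~/venvs is purely illustrative):

```bash
mkdir -p ~/venvs    # one place to hold a virtualenv per compiler/MPI combination
cd ~/venvs
```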
Decide on your compiler/MPI and know where you are
For each combination we need to create a directory and so we choose the following naming convention:
${SYS_TYPE}_<compiler>_<MPI>
For example, assuming GCC with MVAPICH2, or the Intel suite, on an x86_E5v4_Mellanox machine (illustrative choices), the names might be:
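```
x86_E5v4_Mellanox_gcc_mvapich2    (GCC + MVAPICH2)
x86_E5v4_Mellanox_intel           (Intel compiler and MPI)
```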
Note that for Intel there is only one MPI available, so ${SYS_TYPE}_intel will suffice as a name.
${SYS_TYPE} identifies the hardware type and looks something like x86_E5v4_Mellanox.
Optionally, if you intend to use multiple Python versions, you might want to include the Python version in the name, for example by appending _py3.
Run virtualenv and install mpi4py
First check that the correct modules have been loaded!
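A sketch, assuming a GCC/MVAPICH2 build (the module names are illustrative and depend on the combination you chose):

```bash
module purge
module load gcc mvapich2 python    # illustrative: the compiler, MPI and Python modules for this combination
module list                        # verify that the expected modules are loaded
```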
Now create the virtualenv, pointing it at the appropriate directory and Python version (2 or 3):
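A sketch (the directory name follows the convention above and python3 is just an example):

```bash
virtualenv --python=$(which python3) ${SYS_TYPE}_gcc_mvapich2
```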
Then activate the newly created virtual environment:
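A sketch, assuming the directory name used above:

```bash
source ${SYS_TYPE}_gcc_mvapich2/bin/activate
```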
Note that the prompt is changed as a reminder that the virtualenv is active.
Now we can install mpi4py using the --no-cache-dir option to make sure that it always gets rebuilt correctly:
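```bash
pip install --no-cache-dir mpi4py    # builds mpi4py against the currently loaded compiler and MPI
```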
To leave the virtual environment simply type deactivate.
Installing for another combination

We can repeat the above process for as many permutations as we need:
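For instance, a sketch of an Intel-based build (module names are illustrative):

```bash
deactivate                          # leave any active virtualenv first
module purge
module load intel intel-mpi python  # illustrative module names for the Intel toolchain
virtualenv --python=$(which python3) ${SYS_TYPE}_intel
source ${SYS_TYPE}_intel/bin/activate
pip install --no-cache-dir mpi4py
deactivate
```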
Using mpi4py
When you want to use mpi4py you need to activate the appropriate virtual environment as well as load the corresponding modules:
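For example (module names and the virtualenv path are illustrative):

```bash
module load gcc mvapich2 python                        # the same modules used when building mpi4py
source ~/venvs/${SYS_TYPE}_gcc_mvapich2/bin/activate   # then activate the matching virtualenv
python                                                 # starts the Python interpreter from the virtualenv
```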
Note that virtualenv knows which Python is associated with the environment, so simply typing python is sufficient.
The same applies to batch scripts: just source the virtualenv after loading your modules.
Launching MPI jobs
As with traditional MPI jobs you need to use srun to correctly launch the jobs:
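For example (the script name is illustrative):

```bash
srun python my_mpi_script.py
```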
Failure to use srun will result in only one rank being launched.
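Putting the pieces together, a batch script might look like the following sketch (the task count, module names, paths and script name are all illustrative):

```bash
#!/bin/bash
#SBATCH --ntasks=4
#SBATCH --time=00:10:00

# Load the same modules that were used to build mpi4py, then activate the matching virtualenv
module purge
module load gcc mvapich2 python
source ~/venvs/${SYS_TYPE}_gcc_mvapich2/bin/activate

# srun launches one Python process per rank
srun python my_mpi_script.py
```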

Notes for Deneb
mpi4py with Intel InfiniBand/Omni-Path
By default mpi4py is not fully compatible with the interconnect on the Deneb cluster due to the use of some MPI 3.0 calls which are not supported.
The usual symptom is that communications between ranks will block or fail.
Please see the following page for the details: https://software.intel.com/en-us/articles/python-mpi4py-on-intel-true-scale-and-omni-path-clusters
The solution is to set the following variable just after importing mpi4py:
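Based on the linked Intel article, the setting in question is mpi4py's rc.recv_mprobe flag, which disables the use of matched probes:

```python
import mpi4py
mpi4py.rc.recv_mprobe = False   # must be set before "from mpi4py import MPI"
```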
A more explicit example is:
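(a sketch; the rank/size printout is purely illustrative)

```python
import mpi4py
mpi4py.rc.recv_mprobe = False   # disable matched probes before MPI is initialised

from mpi4py import MPI

comm = MPI.COMM_WORLD
print("Hello from rank %d of %d" % (comm.Get_rank(), comm.Get_size()))
```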
When using Intel MPI (module load intel-mpi) it is also possible to work around the issue by changing the fabric protocol via 'export I_MPI_FABRICS=shm:ofa'.
This is not possible with MVAPICH2 or OpenMPI.
Different architectures and the GPU nodes
Deneb is a heterogeneous cluster with the following $SYS_TYPE values:
- x86_E5v2_IntelIB
- x86_E5v3_IntelIB
- x86_E5v2_Mellanox_GPU
The first two are cross-compatible for mpi4py, but if you are using the GPU nodes you should change to the appropriate $SYS_TYPE before configuring mpi4py.