
This article explains how to install mpi4py in your home directory on the cluster in such a way that different clusters, compilers and MPI libraries can all be used.

The Problem

mpi4py needs to be built for a specific combination of the following:

  • Compiler
  • MPI flavour
  • CPU type
  • Interconnect
  • Python version

MPI for Python (mpi4py) provides bindings of the Message Passing Interface (MPI) standard for the Python programming language, allowing any Python program to exploit multiple processors. The package is constructed on top of the MPI-1/2/3 specifications and provides an object-oriented interface which closely follows the MPI-2 C++ bindings.


This means that, if installed with 'pip install --user mpi4py', it will be tied to the exact combination of the aforementioned requirements used at install time.

On SCITAS clusters there is an environment variable, $SYS_TYPE, that captures the CPU type and the interconnect.


The solution is to use the Python virtualenv package, which allows multiple Python environments to co-exist.

Step-by-step guide

Make a directory for all your virtualenv projects (optional)
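For example (the directory name "venvs" is an illustrative choice):

```shell
# Create a top-level directory to hold all the virtual environments
mkdir -p "$HOME/venvs"
cd "$HOME/venvs"
```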

Decide on your compiler/MPI and know where you are
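A typical session might look like the following; the module names are assumptions for a SCITAS-like cluster, so load whichever toolchain you actually intend to build against:

```shell
# Load the compiler and MPI flavour the build will be tied to
# (module names are illustrative)
module load gcc mvapich2 python

# Check which hardware/interconnect combination this node has
echo $SYS_TYPE
```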

For each combination we need to create a directory and so we choose the following naming convention:

${SYS_TYPE}_compiler_MPI


Note that for Intel there is only one MPI available so ${SYS_TYPE}_intel will suffice as a name.

${SYS_TYPE} identifies the hardware type and looks something like x86_E5v4_Mellanox

Optionally, if you intend to use multiple Python versions, you might want to include the Python version in the name, for example by appending _py3 .
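Putting the convention together, the environment name can be composed in the shell like this (the compiler and MPI values are illustrative; on the cluster $SYS_TYPE is already set for you):

```shell
SYS_TYPE=x86_E5v4_Mellanox   # normally provided by the cluster environment
compiler=gcc                 # illustrative choice
mpi=mvapich2                 # illustrative choice

venv_name="${SYS_TYPE}_${compiler}_${mpi}_py3"
echo "$venv_name"
```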

Run virtualenv and install mpi4py

First check that the correct modules have been loaded!
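module list shows what is currently loaded; the output below is a sketch of what you might expect to see:

```shell
module list
# Currently Loaded Modules (illustrative):
#   1) gcc   2) mvapich2   3) python
```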

Now create the virtualenv, pointing it at the appropriate directory and Python version (2 or 3)
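Following the naming convention above (the gcc/mvapich2 choice is illustrative):

```shell
# Create a virtualenv named after the system/compiler/MPI combination
virtualenv --python=python3 "${SYS_TYPE}_gcc_mvapich2_py3"
```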

And change to the newly created virtual environment
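Activation might look like this (the path follows the naming convention above):

```shell
source "${SYS_TYPE}_gcc_mvapich2_py3/bin/activate"
# The prompt now shows the environment name, e.g.:
# (x86_E5v4_Mellanox_gcc_mvapich2_py3) [user@cluster venvs]$
```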

Note that the prompt is changed as a reminder that the virtualenv is active.

Now we can install mpi4py using the '--no-cache-dir' option to make sure that it always gets rebuilt correctly
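With the environment active:

```shell
# Force a fresh build against the currently loaded compiler and MPI
pip install --no-cache-dir mpi4py
```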

To leave the virtual environment simply type deactivate.

Installing for another combination


We can repeat the above process for as many permutations as we need:
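For example, an Intel-toolchain environment could be built the same way (the module names are illustrative):

```shell
module purge
module load intel intel-mpi python        # illustrative module names
virtualenv --python=python3 "${SYS_TYPE}_intel_py3"
source "${SYS_TYPE}_intel_py3/bin/activate"
pip install --no-cache-dir mpi4py
deactivate
```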

Using mpi4py


When you want to use mpi4py you now need to activate the appropriate virtual environment as well as load the corresponding modules:
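A sketch of an interactive session; module names and paths are illustrative:

```shell
module load gcc mvapich2 python                           # illustrative
source ~/venvs/${SYS_TYPE}_gcc_mvapich2_py3/bin/activate
python -c "from mpi4py import MPI; print(MPI.Get_library_version())"
```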

Note how virtualenv knows which python is associated with the environment so simply typing python is sufficient.

The same applies for batch scripts - just source the virtualenv after loading your modules.
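A minimal batch script might therefore look like the following; module names, paths and Slurm options are illustrative:

```shell
#!/bin/bash
#SBATCH --ntasks=4
#SBATCH --time=00:10:00

module load gcc mvapich2 python                           # illustrative
source ~/venvs/${SYS_TYPE}_gcc_mvapich2_py3/bin/activate
srun python mycode.py
```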

Launching MPI jobs

As with traditional MPI jobs you need to use srun to correctly launch the jobs:
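For example, inside a job allocation (the script name is illustrative):

```shell
# srun starts one Python process per allocated task;
# running "python mycode.py" directly would start a single rank only
srun python mycode.py
```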

Failure to use srun will result in only one rank being launched.


Notes for Deneb

MPI4PY with Intel Infiniband/Omnipath

By default MPI4PY is not fully compatible with the Interconnect on the Deneb cluster due to the use of some MPI 3.0 calls which are not supported.

The usual symptom is that communications between ranks will block or fail.

Please see the following page for the details:

The solution is to set the following variable just after importing MPI4PY
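The snippet itself was lost from this page; in current mpi4py the standard way to disable the problematic MPI 3.0 matched-probe receives is the rc object, set between importing the mpi4py package and importing MPI. A sketch, on the assumption this is the fix the page intended:

```python
import mpi4py

# Disable receives based on MPI_Mprobe/MPI_Improbe (MPI 3.0 calls that
# some interconnects do not support); must be set before importing MPI
mpi4py.rc.recv_mprobe = False

from mpi4py import MPI
```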

A more explicit example is:
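The original example was lost; a sketch of what such a script might look like, with illustrative rank logic and payload:

```python
import mpi4py
mpi4py.rc.recv_mprobe = False   # work around unsupported MPI 3.0 matched probes
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Simple point-to-point exchange between the first two ranks
if rank == 0:
    comm.send({'answer': 42}, dest=1, tag=11)
elif rank == 1:
    data = comm.recv(source=0, tag=11)
    print("rank 1 received:", data)
```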

When using Intel MPI (module load intel-mpi) it is also possible to work around the issue by changing the fabric protocol via 'export I_MPI_FABRICS=shm:ofa'. This is not possible for MVAPICH2 or OpenMPI.

Different architectures and the GPU nodes

Deneb is a heterogeneous cluster with the following $SYS_TYPE values:


  • x86_E5v2_IntelIB
  • x86_E5v3_IntelIB
  • x86_E5v2_Mellanox_GPU


The first two are cross-compatible for mpi4py, but if using the GPU nodes you should change to the appropriate $SYS_TYPE before configuring mpi4py.