How do I use MPICH2 on Argo?

MPI (Message Passing Interface) is a library specification built around a set of functions, callable from either Fortran or C, for writing parallel programs. These functions allow processes running on multiple compute nodes to communicate with one another by exchanging messages.

The MPICH2 implementation comes from Argonne National Laboratory and meets the MPI-2 standard. The mpiexec script, developed at the Ohio Supercomputing Center, allows you to run MPI programs without complex configuration, a marked improvement over the previous method.
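
To give a sense of what an MPI program looks like, here is a minimal sketch in C (the file name hello.c and the program itself are only illustrations, not part of the Argo setup). Each process learns its own rank and the total number of processes, then prints a message:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);                /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's id (rank) */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of processes */

    printf("Hello from process %d of %d\n", rank, size);

    MPI_Finalize();                        /* shut down the MPI runtime */
    return 0;
}

When the job runs across several compute nodes, every MPI process executes this same program and reports its own rank.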

Configuring your environment

Two changes are needed: one to the PATH variable and one to LD_LIBRARY_PATH. The former is mandatory; the latter is optional, depending on the compiler you use. If your login shell is the C shell, make the changes in your .cshrc file; if you use bash, make them in your .bash_profile file. To see which shell is your default, enter:

echo $SHELL

If the output is /bin/bash, you are using the bash shell; if /bin/csh, then the C shell.

For a bash shell user, add the following to the .bash_profile file in your home directory:

export MPICH2_HOME=/usr/common/mpich2-1.0.7
export PATH=$MPICH2_HOME/bin:$PATH
export LD_LIBRARY_PATH=/usr/common/mpich2-1.0.7/lib:$LD_LIBRARY_PATH

For a C shell user, add the following to the .cshrc file in your home directory:

setenv MPICH2_HOME /usr/common/mpich2-1.0.7
setenv PATH /usr/common/mpich2-1.0.7/bin:$PATH
setenv LD_LIBRARY_PATH /usr/common/mpich2-1.0.7/lib:$LD_LIBRARY_PATH

Note: If you do not use the GNU compilers (for example, you use the supported Portland Group compilers instead), then the change to LD_LIBRARY_PATH is unnecessary because you will specify the appropriate library path with the -L option when you compile.

Be sure to use mpiexec to run your programs.

Compiling and Linking

The following scripts are available to compile and link your MPI programs:

Script   Language
mpicc    GNU C
mpicxx   GNU C++
mpif77   GNU Fortran 77
mpif90   Portland Group Fortran 90

Each script will invoke the appropriate compiler.
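
For example, to compile the hypothetical hello.c program shown earlier into an executable named hello.out, you might enter something like the following (the file names are placeholders):

mpicc -o hello.out hello.c

A Fortran 77 source file would be compiled the same way with mpif77 in place of mpicc.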

Running

Executing an MPICH2 job is now relatively straightforward.

  1. Prepare a script with the proper PBS directives as required for MPICH2
  2. Start your MPI program using the qsub command
  3. Review the results of your MPI program

mpiexec should not be run from the command line; instead, put it in a script file. Below is an example script used to run a simple GNU Fortran 77 program.

#PBS -l walltime=5:00
#PBS -l cput=5:00
#PBS -l nodes=4
#PBS -q dedicated
cd ~NetID/a_directory_here
mpiexec mpi-program-name.out > output_from_program_here 2>&1

This script does a number of things:

  • The first line instructs the scheduler to allow the program no more than 5 minutes of wall-clock time
  • Line two limits total CPU time to no more than 5 minutes
  • Line three tells the scheduler to use 4 nodes
  • Line four specifies that the dedicated queue will be used for program execution
  • Line five explicitly changes to the directory containing the executable compiled above
  • The last line specifies both the executable name and the results file name; 2>&1 sends any error messages to the same file

Note that the program is run using mpiexec, which automatically selects the execution nodes based on the resources currently available in the selected (or default) queue.

Do not specify the nodes by name; specify only the number of nodes (and processors):

WRONG: #PBS -l nodes=argo1-1+argo1-2+argo1-3+argo1-4
RIGHT: #PBS -l nodes=4

Submit the script from the command line with the command: qsub script_name

The IMPORTANT thing to remember is that the number of nodes available is based on the queue you selected for execution.

Once your script has completed (you can check job progress with the qstat command), you can look at the output file(s).
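
For example, assuming NetID is your username, a command like the following lists only your jobs and their current state:

qstat -u NetID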

Advanced Job Control

The example script uses only one processor per compute node. What follows is a graphical representation of how that job looks, assuming the automatic scheduler decided to run the job on the four compute nodes argo1-1 through argo1-4.

[Figure: mpich2-processes]

However, Argo is not a homogeneous environment. Some nodes have one CPU while others have two CPUs.

There is no restriction that each node in the script use only a single processor. If you change the third line of the above script to read

#PBS -l nodes=4:ppn=2

each node will use two processors. Once again, the resource manager and the scheduler together decide the optimal choice of nodes.

[Figure: mpich2-nodesordered]

For illustration purposes only, the processes and nodes are shown in order:

  • process p1 on the first CPU on argo1-1
  • process p2 on the second CPU on argo1-1
  • process p3 on the first CPU on argo1-2
  • and so on

The ordering is not guaranteed; which process goes to which node is not something you can dictate. It does not follow that the first process MUST go to the first CPU on the first node, the second process to the second CPU on the first node, and so on. The processes could just as easily have been assigned like this:

[Figure: mpich2-nodesunordered]

Help

If you receive an error that reads "Lamnodes Failed! Check if you had booted lam before calling mpiexec else use -machinefile to pass host file to mpiexec", you have not properly set your PATH as shown in the Configuring your environment section above.
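
A quick way to check is the which command; if it does not print the mpiexec under /usr/common/mpich2-1.0.7/bin, revisit your .bash_profile or .cshrc changes:

which mpiexec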

Last updated: September 20, 2016
