Running an MPI program on the department's server
To experiment with MPI programming, you need to access our cluster
through a front-end machine:
ssh grid.alunos.dcc.fc.up.pt
The cluster has 7 operational nodes:
- 6 dual-core nodes with 2 GB of DDR memory each
- 1 dual-core node with 1 GB of DDR memory
Your area on the grid machine is separate from your area in the labs,
so you must use scp (secure copy) to copy files from one area
to the other.
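As a minimal sketch of such a copy, the two helper functions below wrap scp for each direction. The function names are hypothetical, and the sketch assumes your login name is the same in the labs and on the grid machine (otherwise prefix the host with user@).

```shell
GRID=grid.alunos.dcc.fc.up.pt

# Copy a local file (lab area) to your home directory on the grid machine.
# An empty path after the colon means "the remote home directory".
copy_to_grid() {
    scp "$1" "$GRID:"
}

# Copy a file from your grid home directory into the current directory.
copy_from_grid() {
    scp "$GRID:$1" .
}
```

From a lab machine, copy_to_grid hello_mpi.c would then place the (hypothetically named) source file in your grid home directory.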
Once on the cluster front-end (the grid machine), you can
access the cluster nodes. The nodes are named as follows
(note that grid-node7 is down):
grid-node1
grid-node2
...
grid-node8
grid-node8 has only 1GB of main memory.
To run processes on the various nodes without the system asking
for a password on each machine, you need to generate an
authentication key.
- generate an authentication key by typing in the command
ssh-keygen
This will produce two files in the ~/.ssh directory:
id_rsa with your private key, and
id_rsa.pub with your public key.
- append your public key to the authorized_keys file on grid-node1
with
ssh-copy-id -i ~/.ssh/id_rsa.pub grid-node1
- verify that it works as expected. The command
ssh grid-node1
should log you in to grid-node1 without asking for
the password.
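To check every node rather than only grid-node1, a loop along the following lines may help. It is a sketch under assumptions: the node list assumes the elided names are grid-node3 to grid-node6, and the function name check_nodes is hypothetical. The ssh option BatchMode=yes makes ssh fail instead of prompting for a password, so a node that still requires one is reported as FAILED. A node whose host key you have never accepted will also be reported as FAILED; log in to it once by hand first.

```shell
# Operational nodes (grid-node7 is down). The names grid-node3..grid-node6
# are assumed from the naming pattern.
NODES="grid-node1 grid-node2 grid-node3 grid-node4 grid-node5 grid-node6 grid-node8"

# Try a password-less login on each node and report the result.
check_nodes() {
    for node in $NODES; do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$node" true; then
            echo "$node: ok"
        else
            echo "$node: FAILED"
        fi
    done
}
```

Run check_nodes on the front-end after setting up the key.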
Open-MPI:
The cluster already has the Open MPI distribution installed.
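As a minimal sketch of typical usage, assuming the standard Open MPI commands mpicc (compiler wrapper) and mpirun (launcher) are on your PATH, a program can be compiled and started as shown below. The file name hello_mpi.c and the process count 4 are placeholders, not values prescribed by this handout.

```shell
# hello_mpi.c is a hypothetical file name; use the name of your own program.
SRC=hello_mpi.c
BIN=${SRC%.c}          # strip the .c suffix: hello_mpi

if command -v mpicc >/dev/null 2>&1 && [ -f "$SRC" ]; then
    mpicc -o "$BIN" "$SRC"       # compile and link against the MPI library
    mpirun -np 4 "./$BIN"        # start 4 processes
else
    echo "mpicc or $SRC not found; nothing compiled" >&2
fi
```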
Your task
Once you are acquainted with the machine grid.alunos.dcc.fc.up.pt and
have copied your C program to your home directory on it, you will
be able to run the example found here:
http://en.wikipedia.org/wiki/Message_Passing_Interface#Example_program
This is a very simple MPI program that only implements communication
between processes.
- Run this program with 1 to 7 processes without giving the
-machinefile option to mpirun. Put the Unix command "time" before
mpirun to measure the execution time.
- Run this program with 1 to 7 processes, this time giving the
-machinefile option to mpirun. Again, put "time" before mpirun
to measure the execution time.
- Are there any differences between the execution times? What is your
conclusion?
- While the processes are running, try killing one of them. What
happens?
- Modify this program so that the processes loop for a
number of iterations (introduce a dummy loop that is executed by all
child processes after they receive a message from the parent
process). Repeat the previous experiments with your
modified program. Do you notice any
difference in the execution times?
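The two timing experiments could be scripted roughly as follows. This is a sketch under assumptions: ./hello_mpi is a hypothetical name for your compiled program, the file name machinefile is arbitrary, and the node names grid-node3 to grid-node6 are assumed from the naming pattern. The machinefile simply lists one host name per line, and the script is meant to be run with bash so that "time" is the shell's own timing keyword.

```shell
#!/bin/bash
# Write a machinefile with the operational nodes (grid-node7 is down).
for i in 1 2 3 4 5 6 8; do
    echo "grid-node$i"
done > machinefile

# ./hello_mpi is a hypothetical name for your compiled program.
PROG=./hello_mpi

if command -v mpirun >/dev/null 2>&1 && [ -x "$PROG" ]; then
    for np in 1 2 3 4 5 6 7; do
        echo "== $np processes, no machinefile =="
        time mpirun -np "$np" "$PROG"
        echo "== $np processes, with machinefile =="
        time mpirun -np "$np" -machinefile machinefile "$PROG"
    done
else
    echo "mpirun or $PROG not found; only the machinefile was written" >&2
fi
```

Comparing the "real" times of each pair of runs gives the data needed for the questions above.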