Parallel Computing with MPICH2
MPI is a library of routines that can be used to create parallel programs in C or Fortran 77. MPI is designed to allow users to create programs that run efficiently on most parallel architectures. MPI also supports distributed program execution on heterogeneous hardware, so we can start processes on multiple computer systems to work on the same problem. In this report, we will explain the power of parallel programming using a simple program that utilizes MPI.
About this program
This program calculates the sum and mean of random numbers between 0 and 1. It generates 1000 random numbers in each process and computes the sum and the mean in that process. It then combines the sums and means across the processes and calculates the overall sum and overall mean. This result is output to the user. The parallel program executes on several different machines; here we have used several virtual machines.
Software Used
MPICH2
MPICH is a freely available, portable implementation of MPI, a message-passing standard for distributed-memory applications used in parallel computing.
NFS (Network File System)
NFS is short for Network File System. We used it to create a shared directory so that files can be shared easily across computers.
OpenSSH Server
We used OpenSSH to set up authentication among computers without having to enter passwords.
NANO
Simple command line text editor for Linux.
Oracle VM VirtualBox
Ubuntu Linux 13.04
Program in Detail
This program generates 1000 random numbers between 0 and 1 in each process and calculates the sum and mean separately for each process, then reduces the results using the MPI_SUM operation to calculate the overall sum and overall mean.
First, we need to include the necessary header files to run this parallel program written in C. At the top we have included stdio.h (to handle basic standard input and output), mpi.h (to reference the MPI functions), math.h (to handle mathematical functions) and stdlib.h (to utilize the standard C library).
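For reference, the top of the file would look roughly like this (a minimal sketch; the exact order and comments may differ from the original source):

#include <stdio.h>   /* basic standard input and output */
#include <mpi.h>     /* MPI functions and constants */
#include <math.h>    /* mathematical functions */
#include <stdlib.h>  /* rand(), RAND_MAX and other standard library routines */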
MPI_Init(&argc, &argv)
This function sets up the basic MPI environment. Even though we don't use the command-line parameters here, we can still pass them as arguments to this function.
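A minimal sketch of how the program starts and ends (every MPI program pairs MPI_Init with a final MPI_Finalize; the structure below is illustrative, not the exact original source):

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);   /* set up the basic MPI environment */
    /* ... the rest of the program goes here ... */
    MPI_Finalize();           /* shut MPI down cleanly before exiting */
    return 0;
}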
MPI_Comm_size(MPI_COMM_WORLD,&nProcesses)
This function determines the size of the communicator. Here we use the default communicator, MPI_COMM_WORLD. The number of processes will be stored in the variable nProcesses.
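In the program this pairs with an int variable, roughly as follows (the declaration is shown here for clarity):

int nProcesses;
MPI_Comm_size(MPI_COMM_WORLD, &nProcesses);  /* total number of processes started by mpiexec */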
MPI_Comm_rank(MPI_COMM_WORLD, &id)
The MPI_Comm_rank function determines the rank of the calling process in the communicator. The rank will then be stored in the variable id.
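Again, roughly as follows (declaration shown for clarity):

int id;
MPI_Comm_rank(MPI_COMM_WORLD, &id);  /* this process's rank, from 0 to nProcesses - 1 */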
MPI_Reduce(&sum,&overall_sum,1,MPI_DOUBLE,MPI_SUM,0,MPI_COMM_WORLD)
This function takes several parameters. The first argument, &sum, points to the element(s) of the given data type, here MPI_DOUBLE, contributed by each process. The variable overall_sum stores the final result of this function on the root process. We also need to pass the reduction operation; here we used MPI_SUM, which is nothing but taking the sum of all the elements.
MPI_Reduce(&mean,&overall_mean,1,MPI_DOUBLE,MPI_SUM,0,MPI_COMM_WORLD)
This is exactly the same function as described above. Instead of the sum, this call is used to accumulate the mean values.
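Putting the two calls together, the reduction step looks roughly like this (a sketch; it assumes sum and mean have been computed in each process as described in the next part, and that overall_sum and overall_mean are doubles):

double overall_sum = 0.0, overall_mean = 0.0;
/* combine the per-process values on the root process (rank 0) */
MPI_Reduce(&sum,  &overall_sum,  1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
MPI_Reduce(&mean, &overall_mean, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

Only rank 0 receives the reduced values. Because MPI_SUM adds the per-process means together, the root process would still divide the reduced mean by nProcesses (or, equivalently, divide overall_sum by the total count of numbers) to obtain the true overall mean.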
The calculation part is quite simple. Each process loops 1000 times, and during each iteration it generates a random number and adds it to the sum. Once the loop is finished, it computes the mean and stores it in the variable mean.
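A sketch of that calculation in each process, assuming the random numbers come from rand()/RAND_MAX (the original source may seed and draw the numbers differently):

double sum = 0.0, mean;
int i;
srand(id + 1);                                /* simple per-process seed (illustrative) */
for (i = 0; i < 1000; i++) {
    double r = rand() / (double)RAND_MAX;     /* random number between 0 and 1 */
    sum += r;                                 /* accumulate the per-process sum */
}
mean = sum / 1000.0;                          /* per-process mean */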
Screenshots
This screen depicts the master PC (on top) and the slave PC (bottom). We have set up MPI, the NFS shared folder and OpenSSH among these machines.
The master shows the directory listing under the shared folder /mirror. As we can see, we have compiled random_sum.c using the mpicc compiler wrapper and generated the output file called random_sum.
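For reference, the compile step on the master would have looked something like this (the exact flags used may have differed):

mpicc random_sum.c -o random_sum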
On the slave, we have run htop to monitor the running processes. The slave can access the /mirror folder, which is our shared folder. Any file added by the master will be available to the slave. This saves us from copying each file across the machines by hand.
Figure 02
Now we run the command to execute
the program across the machines.
mpiexec -f hosts -n 5 ./random_sum
hosts is a file that contains the IP addresses of the slave and master machines. -n is the number of processes; here we defined 5 processes.
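As an example, the hosts file is just a plain-text list of machine addresses, one per line. The addresses below are placeholders, not the actual ones used in this setup:

192.168.1.10
192.168.1.11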
Note that before executing this command, the slave doesn't have any processes running with the name "random_sum".
This screenshot was taken during program execution. The result shown in the console is from a trial run; ignore it for now.
While it is executing, you can see the program running on the slave machine. Note the Command column: it shows random_sum. This demonstrates the parallel processing. The program was launched on the master, but it is also running in parallel on the slave.
The following screenshot displays the final result.
Generated 50000 random numbers
Total sum: 24997.119693 and Mean: 0.499942
Ready
So it is clear that the program ran on multiple machines in parallel.
Once each machine has calculated its sum and mean individually, they are accumulated by the MPI_Reduce function, and the overall_sum and overall_mean are displayed.