Parallel Computing with MPICH2
MPI is a library of routines that can be used to create parallel programs in C or Fortran 77. MPI is designed to allow users to create programs that run efficiently on most parallel architectures. MPI also supports distributed program execution on heterogeneous hardware, so we can start processes on multiple computer systems to work on the same problem. In this report, we will explain the power of parallel programming using a simple program that utilizes MPI.
About this program
This program calculates the sum and mean of random numbers between 0 and 1. It generates 1000 random numbers in each process and computes the sum and the mean in that process. It then combines the sums and means across the processes and calculates the overall sum and overall mean. This result is output to the user. The parallel program executes on several different machines; here we have used several virtual machines.
Software Used
MPICH2
MPICH is a freely available, portable implementation of MPI, a message-passing standard for distributed-memory applications used in parallel computing.
NFS (Network File System)
NFS is short for Network File System. We used it to create a shared directory so that files can be shared easily across computers.
OpenSSH Server
We used OpenSSH to set up authentication among computers without having to enter passwords.
NANO
Simple command line text editor for Linux.
Oracle VM VirtualBox
Ubuntu Linux 13.04
Program in Detail
This program generates 1000 random numbers between 0 and 1 in each process and calculates the sum and mean separately for each process, then reduces the results using the MPI_SUM operation to calculate the overall sum and overall mean.
First, we need to include the necessary header files to run this parallel program written in C. At the top we have included stdio.h (to handle basic standard input and output), mpi.h (to reference the MPI functions), math.h (to handle mathematical functions) and stdlib.h (to utilize the standard C library).
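For reference, the top of the file would look roughly like this (a minimal sketch; the exact order and comments may differ from the original source):

#include <stdio.h>   /* basic standard input and output */
#include <mpi.h>     /* MPI functions and constants */
#include <math.h>    /* mathematical functions */
#include <stdlib.h>  /* rand(), RAND_MAX and other standard library routines */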
MPI_Init(&argc, &argv)
This function sets up the basic MPI environment. Even though we don't use the command-line parameters here, we can still pass them as arguments to this function.
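A minimal sketch of how the program starts and ends (every MPI program pairs MPI_Init with a final MPI_Finalize; the structure below is illustrative, not the exact original source):

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);   /* set up the basic MPI environment */
    /* ... the rest of the program goes here ... */
    MPI_Finalize();           /* shut MPI down cleanly before exiting */
    return 0;
}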
MPI_Comm_size(MPI_COMM_WORLD,&nProcesses)
This function determines the size of the communicator. Here we use the default communicator, MPI_COMM_WORLD. The number of processes will be stored in the variable nProcesses.
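In the program this pairs with an int variable, roughly as follows (the declaration is shown here for clarity):

int nProcesses;
MPI_Comm_size(MPI_COMM_WORLD, &nProcesses);  /* total number of processes started by mpiexec */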
MPI_Comm_rank(MPI_COMM_WORLD, &id)
The MPI_Comm_rank function determines the rank of the calling process in the communicator. The rank will then be stored in the variable id.
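Again, roughly as follows (declaration shown for clarity):

int id;
MPI_Comm_rank(MPI_COMM_WORLD, &id);  /* this process's rank, from 0 to nProcesses - 1 */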
MPI_Reduce(&sum,&overall_sum,1,MPI_DOUBLE,MPI_SUM,0,MPI_COMM_WORLD)
This function takes several parameters. The first argument, &sum, points to the element(s) of the given data type, here MPI_DOUBLE, contributed by each process. The variable overall_sum stores the final result of this function on the root process. We also need to pass the reduction operation; here we used MPI_SUM, which is nothing but taking the sum of all the elements.
MPI_Reduce(&mean,&overall_mean,1,MPI_DOUBLE,MPI_SUM,0,MPI_COMM_WORLD)
This is exactly the same function as described above. Instead of the sum, this call is used to accumulate the mean values.
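Putting the two calls together, the reduction step looks roughly like this (a sketch; it assumes sum and mean have been computed in each process as described in the next part, and that overall_sum and overall_mean are doubles):

double overall_sum = 0.0, overall_mean = 0.0;
/* combine the per-process values on the root process (rank 0) */
MPI_Reduce(&sum,  &overall_sum,  1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
MPI_Reduce(&mean, &overall_mean, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

Only rank 0 receives the reduced values. Because MPI_SUM adds the per-process means together, the root process would still divide the reduced mean by nProcesses (or, equivalently, divide overall_sum by the total count of numbers) to obtain the true overall mean.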
The calculation part is quite simple. Each process loops 1000 times, and during each iteration it generates a random number and adds it to the sum. Once the loop is finished, it computes the mean and stores it in the variable mean.
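A sketch of that calculation in each process, assuming the random numbers come from rand()/RAND_MAX (the original source may seed and draw the numbers differently):

double sum = 0.0, mean;
int i;
srand(id + 1);                                /* simple per-process seed (illustrative) */
for (i = 0; i < 1000; i++) {
    double r = rand() / (double)RAND_MAX;     /* random number between 0 and 1 */
    sum += r;                                 /* accumulate the per-process sum */
}
mean = sum / 1000.0;                          /* per-process mean */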
Screenshots
This screen depicts the master PC (on top) and the slave PC (bottom). We have set up MPI, the NFS shared folder and OpenSSH among these machines.
The master shows the directory listing under the shared folder /mirror. As we can see, we have compiled random_sum.c using the mpicc compiler wrapper and generated the output file called random_sum.
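For reference, the compile step on the master would have looked something like this (the exact flags used may have differed):

mpicc random_sum.c -o random_sum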
On the slave, we have run htop to monitor the running processes. The slave can access the /mirror folder, which is our shared folder. Any file added by the master will be available to the slave. This saves us from copying each file across the machines by hand.
Figure 02
Now we run the command to execute
the program across the machines.
mpiexec -f hosts -n 5 ./random_sum
hosts is a file that contains the IP addresses of the slave and master machines. -n is the number of processes; here we defined 5 processes.
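As an example, the hosts file is just a plain-text list of machine addresses, one per line. The addresses below are placeholders, not the actual ones used in this setup:

192.168.1.10
192.168.1.11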
Note that before executing this command, the slave doesn't have any processes running with the name "random_sum".
This screenshot was taken during program execution. The result shown in the console is from a trial run; ignore it for now.
While it is executing, you can see the program running on the slave machine. Note the Command column: it shows random_sum. This demonstrates the parallel processing. The program was launched on the master, but it is also running in parallel on the slave.
The following screenshot displays the final result.
Generated 50000 random numbers
Total sum: 24997.119693 and Mean: 0.499942
Ready
So it is clear that the program ran on multiple machines in parallel.
Once each machine has calculated its sum and mean individually, they are accumulated by the MPI_Reduce function, and the overall_sum and overall_mean are displayed.