
* Begin running the time steps
      DO TICK=1,MAXTIME

* Set the heat sources
        BLACK(ROWS/3, COLS/3)     =  10.0
        BLACK(2*ROWS/3, COLS/3)   =  20.0
        BLACK(ROWS/3, 2*COLS/3)   = -20.0
        BLACK(2*ROWS/3, 2*COLS/3) =  20.0

Now we broadcast the entire array from process rank zero to all of the other processes in the MPI_COMM_WORLD communicator. Note that this call does the sending on the rank zero process and the receiving on the other processes. The net result is that, with a single call, all of the processes hold the values that were previously only in the master process:


* Broadcast the array
        CALL MPI_BCAST(BLACK,(ROWS+2)*(COLS+2),MPI_DOUBLE_PRECISION,
     +       0,MPI_COMM_WORLD,IERR)
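To see what the collective call is doing for us, here is a rough hand-coded equivalent using point-to-point messages. This sketch is not part of the original program; it assumes the same INUM, NPROC, and STATUS variables declared earlier in the code:


* Sketch only: a hand-coded equivalent of the MPI_BCAST above.
* Rank zero sends the whole array to every other rank; the
* other ranks each post a matching receive.
        IF ( INUM .EQ. 0 ) THEN
          DO I=1,NPROC-1
            CALL MPI_SEND(BLACK,(ROWS+2)*(COLS+2),
     +           MPI_DOUBLE_PRECISION, I, 0, MPI_COMM_WORLD, IERR)
          ENDDO
        ELSE
          CALL MPI_RECV(BLACK,(ROWS+2)*(COLS+2),
     +           MPI_DOUBLE_PRECISION, 0, 0, MPI_COMM_WORLD,
     +           STATUS, IERR)
        ENDIF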

Now we perform the subset computation on each process. Note that we are using global coordinates because the array has the same shape on each of the processes. All we need to do is make sure we set up our particular strip of columns according to S and E:


* Perform the flow on our subset
        DO C=S,E
          DO R=1,ROWS
            RED(R,C) = ( BLACK(R,C) +
     +          BLACK(R,C-1) + BLACK(R-1,C) +
     +          BLACK(R+1,C) + BLACK(R,C+1) ) / 5.0
          ENDDO
        ENDDO
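The strip limits S and E used above come from MPE_DECOMP1D, which is called earlier in the program. As a sketch of what a simple one-dimensional block decomposition computes (this routine and its variable names are illustrative assumptions, not the actual MPE library code):


* Sketch: a simple 1-D block decomposition of N columns over
* NPROC processes for rank IRANK (0 .. NPROC-1).  Illustrative
* only; the real MPE_DECOMP1D may divide the remainder differently.
      SUBROUTINE BLOCKDECOMP(N, NPROC, IRANK, S, E)
      INTEGER N, NPROC, IRANK, S, E, NLOCAL, EXTRA
      NLOCAL = N / NPROC
      EXTRA  = MOD(N, NPROC)
* The first EXTRA ranks each take one additional column
      IF ( IRANK .LT. EXTRA ) THEN
        S = IRANK * (NLOCAL + 1) + 1
        E = S + NLOCAL
      ELSE
        S = IRANK * NLOCAL + EXTRA + 1
        E = S + NLOCAL - 1
      ENDIF
      END


With COLS=200 and NPROC=4, for example, rank 1 gets S=51 and E=100, which matches the output shown later.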

Now we need to gather the strips from the processes back into the corresponding columns of the master array for rebroadcast in the next time step. We could also change the loop in the master to receive the messages in any order and check the STATUS variable to see which strip it received:


* Gather back up into the BLACK array in master (INUM = 0)
        IF ( INUM .EQ. 0 ) THEN
          DO C=S,E
            DO R=1,ROWS
              BLACK(R,C) = RED(R,C)
            ENDDO
          ENDDO
          DO I=1,NPROC-1
            CALL MPE_DECOMP1D(COLS, NPROC, I, LS, LE, IERR)
            MYLEN = ( LE - LS ) + 1
            SRC = I
            TAG = 0
            CALL MPI_RECV(BLACK(0,LS),MYLEN*(ROWS+2),
     +           MPI_DOUBLE_PRECISION, SRC, TAG,
     +           MPI_COMM_WORLD, STATUS, IERR)
*           Print *,'Recv',I,MYLEN
          ENDDO
        ELSE
          MYLEN = ( E - S ) + 1
          DEST = 0
          TAG = 0
          CALL MPI_SEND(RED(0,S),MYLEN*(ROWS+2),MPI_DOUBLE_PRECISION,
     +         DEST, TAG, MPI_COMM_WORLD, IERR)
          Print *,'Send',INUM,MYLEN
        ENDIF
      ENDDO

We use MPE_DECOMP1D to determine which strip we’re receiving from each process.
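If we wanted the master to accept the strips in whatever order they arrive, as suggested above, one approach is to probe for a message from any source, read the sender's rank out of STATUS, and only then receive that strip into place. This variation is a sketch, not part of the original program:


* Sketch: receive the strips in arrival order.  Probe first to
* learn which rank is sending, compute that rank's strip, then
* receive directly into the proper columns of BLACK.
          DO I=1,NPROC-1
            TAG = 0
            CALL MPI_PROBE(MPI_ANY_SOURCE, TAG, MPI_COMM_WORLD,
     +           STATUS, IERR)
            SRC = STATUS(MPI_SOURCE)
            CALL MPE_DECOMP1D(COLS, NPROC, SRC, LS, LE, IERR)
            MYLEN = ( LE - LS ) + 1
            CALL MPI_RECV(BLACK(0,LS), MYLEN*(ROWS+2),
     +           MPI_DOUBLE_PRECISION, SRC, TAG,
     +           MPI_COMM_WORLD, STATUS, IERR)
          ENDDO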

In some applications, the result that must be gathered is a sum or some other single value. To accomplish this, you can use one of the MPI reduction routines, which coalesce a set of distributed values into a single value in a single call.
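For example, if each process kept a running partial sum in a local variable, a single call to MPI_REDUCE could combine the partial sums into one total on the master. The variable names in this sketch are assumptions, not part of the heat flow program:


* Sketch: combine per-process partial sums into one total on rank 0
      DOUBLE PRECISION PARTSUM, TOTSUM
      CALL MPI_REDUCE(PARTSUM, TOTSUM, 1, MPI_DOUBLE_PRECISION,
     +     MPI_SUM, 0, MPI_COMM_WORLD, IERR)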

Again at the end, we dump out the data for testing. However, since it has all been gathered back onto the master process, we only need to dump it on one process:


* Dump out data for verification
      IF ( INUM .EQ. 0 .AND. ROWS .LE. 20 ) THEN
        FNAME = '/tmp/mheatout'
        OPEN(UNIT=9,NAME=FNAME,FORM='formatted')
        DO C=1,COLS
          WRITE(9,100)(BLACK(R,C),R=1,ROWS)
  100     FORMAT(20F12.6)
        ENDDO
        CLOSE(UNIT=9)
      ENDIF
      CALL MPI_FINALIZE(IERR)
      END

When this program executes with four processes, it produces the following output:


% mpif77 -c mheat.f
mheat.f:
 MAIN mheat:
% mpif77 -o mheat mheat.o -lmpe
% mheat -np 4
 Calling MPI_INIT
 My Share 1 4 51 100
 My Share 0 4 1 50
 My Share 3 4 151 200
 My Share 2 4 101 150
%

The ranks of the processes and the subsets of the computations for each process are shown in the output.

So that is a somewhat contrived example of the broadcast/gather approach to parallelizing an application. If the data structures are the right size and the amount of computation relative to communication is appropriate, this can be a very effective approach that may require the smallest number of code modifications compared to a single-processor version of the code.

MPI summary

Whether you choose PVM or MPI depends on which library the vendor of your system prefers. Sometimes MPI is the better choice because it contains the newest features, such as support for hardware-supported multicast or broadcast, that can significantly improve the overall performance of a scatter-gather application.

A good text on MPI is Using MPI: Portable Parallel Programming with the Message-Passing Interface, by William Gropp, Ewing Lusk, and Anthony Skjellum (MIT Press). You may also want to retrieve and print the MPI specification from (External Link).





Source:  OpenStax, High performance computing. OpenStax CNX. Aug 25, 2010 Download for free at http://cnx.org/content/col11136/1.5