* Begin running the time steps
      DO TICK=1,MAXTIME

* Set the heat sources
        BLACK(ROWS/3, COLS/3)= 10.0
        BLACK(2*ROWS/3, COLS/3) = 20.0
        BLACK(ROWS/3, 2*COLS/3) = -20.0
        BLACK(2*ROWS/3, 2*COLS/3) = 20.0
Now we broadcast the entire array from the rank-zero process to all of the other processes in the MPI_COMM_WORLD communicator. Note that this single call performs the send on the rank-zero process and the receive on every other process. The net result is that, after the call, all of the processes hold the values that were previously only in the master process:
* Broadcast the array
        CALL MPI_BCAST(BLACK,(ROWS+2)*(COLS+2),MPI_DOUBLE_PRECISION,
     +          0,MPI_COMM_WORLD,IERR)
Now we perform the subset computation on each process. Note that we are using global coordinates because the array has the same shape on each of the processes. All we need to do is make sure we set up our particular strip of columns according to S and E:
* Perform the flow on our subset
        DO C=S,E
          DO R=1,ROWS
            RED(R,C) = ( BLACK(R,C) +
     +          BLACK(R,C-1) + BLACK(R-1,C) +
     +          BLACK(R+1,C) + BLACK(R,C+1) ) / 5.0
          ENDDO
        ENDDO
Now we need to gather the appropriate strips from the processes into the appropriate strip in the master array for rebroadcast in the next time step. We could also change the loop in the master to receive the messages in any order and check the STATUS variable to see which strip it received; a sketch of that variation appears after the listing below:
* Gather back up into the BLACK array in master (INUM = 0)
        IF ( INUM .EQ. 0 ) THEN
          DO C=S,E
            DO R=1,ROWS
              BLACK(R,C) = RED(R,C)
            ENDDO
          ENDDO
          DO I=1,NPROC-1
            CALL MPE_DECOMP1D(COLS, NPROC, I, LS, LE, IERR)
            MYLEN = ( LE - LS ) + 1
            SRC = I
            TAG = 0
            CALL MPI_RECV(BLACK(0,LS),MYLEN*(ROWS+2),
     +          MPI_DOUBLE_PRECISION, SRC, TAG,
     +          MPI_COMM_WORLD, STATUS, IERR)
* Print *,'Recv',I,MYLEN
          ENDDO
        ELSE
          MYLEN = ( E - S ) + 1
          DEST = 0
          TAG = 0
          CALL MPI_SEND(RED(0,S),MYLEN*(ROWS+2),MPI_DOUBLE_PRECISION,
     +        DEST, TAG, MPI_COMM_WORLD, IERR)
          Print *,'Send',INUM,MYLEN
        ENDIF

      ENDDO
We use MPE_DECOMP1D to determine which strip we're receiving from each process.
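Here is a minimal sketch, not part of the original program, of the any-order variation mentioned above. It reuses the variables from the listing, probes for a message from any source, reads the sender's rank out of the STATUS array, and then receives that rank's strip directly into place:
* Sketch: gather the strips in whatever order they arrive
        TAG = 0
        DO I=1,NPROC-1
* Wait for any incoming message and check STATUS to see who sent it
          CALL MPI_PROBE(MPI_ANY_SOURCE, TAG, MPI_COMM_WORLD,
     +        STATUS, IERR)
          SRC = STATUS(MPI_SOURCE)
* Map that rank back to its strip and receive it into place
          CALL MPE_DECOMP1D(COLS, NPROC, SRC, LS, LE, IERR)
          MYLEN = ( LE - LS ) + 1
          CALL MPI_RECV(BLACK(0,LS),MYLEN*(ROWS+2),
     +        MPI_DOUBLE_PRECISION, SRC, TAG,
     +        MPI_COMM_WORLD, STATUS, IERR)
        ENDDO
Accepting the strips as they arrive can reduce the time the master spends waiting when some processes finish their portion of the work earlier than others.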
In some applications, the value that must be gathered is a sum or another single value. To accomplish this, you can use one of the MPI reduction routines that coalesce a set of distributed values into a single value using a single call.
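For example, here is a minimal sketch, not from the original program, that uses MPI_REDUCE to add up a per-process partial sum. MYSUM and TOTSUM are hypothetical DOUBLE PRECISION variables introduced only for this illustration:
* Sketch: each process sums its own strip of RED
* (MYSUM and TOTSUM are hypothetical DOUBLE PRECISION variables)
        MYSUM = 0.0D0
        DO C=S,E
          DO R=1,ROWS
            MYSUM = MYSUM + RED(R,C)
          ENDDO
        ENDDO
* One call combines the partial sums; TOTSUM is valid on rank 0
        CALL MPI_REDUCE(MYSUM, TOTSUM, 1, MPI_DOUBLE_PRECISION,
     +      MPI_SUM, 0, MPI_COMM_WORLD, IERR)
        IF ( INUM .EQ. 0 ) PRINT *,'Total heat',TOTSUM
If every process needs the combined result, MPI_ALLREDUCE delivers it to all of the ranks in a single call.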
Again at the end, we dump out the data for testing. However, since it has all been gathered back onto the master process, we only need to dump it on one process:
* Dump out data for verification
      IF ( INUM .EQ. 0 .AND. ROWS .LE. 20 ) THEN
        FNAME = '/tmp/mheatout'
        OPEN(UNIT=9,NAME=FNAME,FORM='formatted')
        DO C=1,COLS
          WRITE(9,100)(BLACK(R,C),R=1,ROWS)
  100     FORMAT(20F12.6)
        ENDDO
        CLOSE(UNIT=9)
      ENDIF

      CALL MPI_FINALIZE(IERR)

      END
When this program executes with four processes, it produces the following output:
% mpif77 -c mheat.f
mheat.f:
 MAIN mheat:
% mpif77 -o mheat mheat.o -lmpe
% mheat -np 4
Calling MPI_INIT
My Share 1 4 51 100
My Share 0 4 1 50
My Share 3 4 151 200
My Share 2 4 101 150
%
The ranks of the processes and the subsets of the computations for each process are shown in the output.
So that is a somewhat contrived example of the broadcast/gather approach to parallelizing an application. If the data structures are the right size and the amount of computation relative to communication is appropriate, this can be a very effective approach, and it often requires the fewest code modifications relative to the single-processor version of the code.
Whether you choose PVM or MPI depends on which library the vendor of your system prefers. Sometimes MPI is the better choice because it contains the newest features, such as support for hardware multicast or broadcast, that can significantly improve the overall performance of a scatter-gather application.
A good text on MPI is Using MPI: Portable Parallel Programming with the Message-Passing Interface, by William Gropp, Ewing Lusk, and Anthony Skjellum (MIT Press). You may also want to retrieve and print the MPI specification from (External Link).