This next segment is the easy part. All the appropriate ghost values are in place, so we simply perform the computation in our subspace. At the end, we copy back from the RED to the BLACK array; in a real simulation, we would perform two time steps, one from BLACK to RED and the other from RED to BLACK, to save this extra copy:
* Perform the flow
        DO C=1,MYLEN
          DO R=1,ROWS
            RED(R,C) = ( BLACK(R,C) +
     +        BLACK(R,C-1) + BLACK(R-1,C) +
     +        BLACK(R+1,C) + BLACK(R,C+1) ) / 5.0
          ENDDO
        ENDDO

* Copy back - Normally we would do a red and black version of the loop
        DO C=1,MYLEN
          DO R=1,ROWS
            BLACK(R,C) = RED(R,C)
          ENDDO
        ENDDO
* This final ENDDO closes the time step loop begun earlier
      ENDDO
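As a sketch of the copy-saving variant mentioned above, the time-step loop can be unrolled by two so that each iteration computes BLACK-to-RED and then RED-to-BLACK, and the arrays simply swap roles. This is our illustration rather than the original program's code; MAXTIME and TICK are assumed names for the time-step count and loop index, and the ghost exchange (not shown) would have to be repeated before each half step:

* Sketch only: two time steps per loop iteration, so the arrays
* swap roles and the explicit copy disappears. MAXTIME and TICK
* are assumed names; a ghost exchange must precede each half step.
      DO TICK=1,MAXTIME,2
* First half step: BLACK -> RED
        DO C=1,MYLEN
          DO R=1,ROWS
            RED(R,C) = ( BLACK(R,C) + BLACK(R,C-1) + BLACK(R-1,C) +
     +        BLACK(R+1,C) + BLACK(R,C+1) ) / 5.0
          ENDDO
        ENDDO
* Second half step: RED -> BLACK, so no copy is needed
        DO C=1,MYLEN
          DO R=1,ROWS
            BLACK(R,C) = ( RED(R,C) + RED(R,C-1) + RED(R-1,C) +
     +        RED(R+1,C) + RED(R,C+1) ) / 5.0
          ENDDO
        ENDDO
      ENDDO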
Now we find the center cell and send it to the master process (if necessary) so it can be printed out. We also dump the data into files for debugging or for later visualization of the results. Each file is made unique by appending the instance number to the filename. Then the program terminates:
        CALL SENDCELL(RED,ROWS,COLS,OFFSET,MYLEN,INUM,TIDS(0),
     +                ROWS/2,TOTCOLS/2)

* Dump out data for verification
        IF ( ROWS .LE. 20 ) THEN
          FNAME = '/tmp/pheatout.' // CHAR(ICHAR('0')+INUM)
          OPEN(UNIT=9,NAME=FNAME,FORM='formatted')
          DO C=1,MYLEN
            WRITE(9,100)(BLACK(R,C),R=1,ROWS)
  100       FORMAT(20F12.6)
          ENDDO
          CLOSE(UNIT=9)
        ENDIF

* Lets all go together
        CALL PVMFBARRIER( 'pheat', NPROC, INFO )

        CALL PVMFEXIT( INFO )
        END
The SENDCELL routine finds a particular cell and prints it out on the master process. This routine is called in an SPMD style: all the processes enter this routine, although not all at precisely the same time. Depending on the INUM and the cell that we are looking for, each process may do something different.
If the cell in question is stored in the master process, and we are the master process, we print it out; all other processes do nothing. If the cell in question is stored in another process, the process that owns the cell sends it to the master process. The master process receives the value and prints it out. All the other processes do nothing.
This is a simple example of the typical style of SPMD code. All the processes execute the code at roughly the same time, but, based on information local to each process, the actions performed by different processes may be quite different:
      SUBROUTINE SENDCELL(RED,ROWS,COLS,OFFSET,MYLEN,INUM,PTID,R,C)
      INCLUDE '../include/fpvm3.h'

      INTEGER ROWS,COLS,OFFSET,MYLEN,INUM,PTID,R,C
      REAL*8 RED(0:ROWS+1,0:COLS+1)
      REAL*8 CENTER

* Compute the local column number to determine if the cell is ours
      I = C - OFFSET
      IF ( I .GE. 1 .AND. I .LE. MYLEN ) THEN
        IF ( INUM .EQ. 0 ) THEN
          PRINT *,'Master has', RED(R,I), R, C, I
        ELSE
          CALL PVMFINITSEND(PVMDEFAULT,TRUE)
          CALL PVMFPACK( REAL8, RED(R,I), 1, 1, INFO )
          PRINT *, 'INUM:',INUM,' Returning',R,C,RED(R,I),I
          CALL PVMFSEND( PTID, 3, INFO )
        ENDIF
      ELSE
        IF ( INUM .EQ. 0 ) THEN
          CALL PVMFRECV( -1 , 3, BUFID )
          CALL PVMFUNPACK ( REAL8, CENTER, 1, 1, INFO)
          PRINT *, 'Master Received',R,C,CENTER
        ENDIF
      ENDIF
      RETURN
      END
Like the previous routine, the STORE routine is executed on all processes. The idea is to store a value into a global row and column position. First, we must determine if the cell is even in our process. If it is, we must compute the local column (I) in our subset of the overall matrix and then store the value:
      SUBROUTINE STORE(RED,ROWS,COLS,OFFSET,MYLEN,R,C,VALUE,INUM)
      REAL*8 RED(0:ROWS+1,0:COLS+1)
      REAL VALUE
      INTEGER ROWS,COLS,OFFSET,MYLEN,R,C,I,INUM

      I = C - OFFSET
      IF ( I .LT. 1 .OR. I .GT. MYLEN ) RETURN
      RED(R,I) = VALUE
      RETURN
      END
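As a usage illustration, every process can make the identical STORE call to set a heat source at a global cell; only the owning process actually stores the value, so the caller never needs to know the decomposition. The particular cell and the value 10.0 below are hypothetical choices of ours, not taken from the original program:

* Hypothetical usage: executed by all processes in SPMD style, but
* only the process whose column range contains TOTCOLS/4 stores it.
      CALL STORE(BLACK,ROWS,COLS,OFFSET,MYLEN,
     +           ROWS/2,TOTCOLS/4,10.0,INUM)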
When this program executes, it has the following output:
% pheat
INUM: 0 Local 1 50 Global 1 50
Master Received 100 100 3.4722390023541D-07
%
We see two lines of print. The first line indicates the values that Process 0 used in its geometry computation. The second line is the output from the master process of the temperature at cell (100,100) after 200 time steps.
One interesting technique that is useful for debugging this type of program is to change the number of processes that are created. If the program is not quite moving its data properly, you usually get different results when different numbers of processes are used. If you look closely, the above code performs correctly with one process or 30 processes.
Notice that there is no barrier operation at the end of each time step. This is in contrast to the way parallel loops operate on shared uniform memory multiprocessors that force a barrier at the end of each loop. Because we have used an “owner computes” rule, and nothing is computed until all the required ghost data is received, there is no need for a barrier. The receipt of the messages with the proper ghost values allows a process to begin computing immediately without regard to what the other processes are currently doing.
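To make this concrete, here is a condensed sketch of one direction of the ghost exchange that precedes each compute step, modeled on the earlier part of this program; the message tag and the restriction to the leftward messages are simplifications of ours, and the mirror-image exchange for the other direction is analogous. The blocking PVMFRECV is the only synchronization a process ever waits on:

* Sketch: each process (except the first) sends its first column
* to the left neighbor; each process (except the last) receives a
* ghost column from the right neighbor. The blocking receive
* stands in for a barrier - once the ghost values arrive, this
* process computes regardless of what the others are doing.
      IF ( INUM .GT. 0 ) THEN
        CALL PVMFINITSEND( PVMDEFAULT, BUFID )
        CALL PVMFPACK( REAL8, BLACK(1,1), ROWS, 1, INFO )
        CALL PVMFSEND( TIDS(INUM-1), 1, INFO )
      ENDIF
      IF ( INUM .LT. NPROC-1 ) THEN
        CALL PVMFRECV( TIDS(INUM+1), 1, BUFID )
        CALL PVMFUNPACK( REAL8, BLACK(1,MYLEN+1), ROWS, 1, INFO )
      ENDIF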
This example can be used either as a framework for developing other grid-based computations, or as a good excuse to use HPF and appreciate the hard work that the HPF compiler developers have done. A well-done HPF implementation of this simulation should outperform the PVM implementation because HPF can make tighter optimizations. Unlike us, the HPF compiler doesn’t have to keep its generated code readable.
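For a sense of the contrast, the heart of an HPF version reduces to a distribution directive plus an array assignment; the compiler generates the geometry computation and ghost communication we coded by hand. The following is a rough sketch of ours, not code from this book, and the grid size of 200 is an assumption:

* Rough HPF sketch: the directive distributes columns in blocks
* across the processors; the array assignment is one time step.
      INTEGER ROWS,TOTCOLS
      PARAMETER(ROWS=200,TOTCOLS=200)
      REAL*8 RED(0:ROWS+1,0:TOTCOLS+1), BLACK(0:ROWS+1,0:TOTCOLS+1)
!HPF$ DISTRIBUTE (*,BLOCK) :: RED, BLACK

      RED(1:ROWS,1:TOTCOLS) =
     +  ( BLACK(1:ROWS,1:TOTCOLS)   + BLACK(1:ROWS,0:TOTCOLS-1) +
     +    BLACK(0:ROWS-1,1:TOTCOLS) + BLACK(2:ROWS+1,1:TOTCOLS) +
     +    BLACK(1:ROWS,2:TOTCOLS+1) ) / 5.0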
PVM is a widely used tool because it affords portability across every architecture other than SIMD. Once the effort has been invested in making a code message passing, it tends to run well on many architectures.
The primary complaints about PVM include the need for a separate pack step before each send, the overhead that comes with its support for heterogeneous environments, and the fact that the programmer must hand-code chores such as the geometry computations we did above.
But all in all, for a certain set of programmers, PVM is the tool to use. If you would like to learn more about PVM, see PVM — A User's Guide and Tutorial for Networked Parallel Computing, by Al Geist, Adam Beguelin, Jack Dongarra, Weicheng Jiang, Robert Manchek, and Vaidy Sunderam (MIT Press). Information is also available at www.netlib.org/pvm3/.