For FORTRAN programs, a library timing function found on many machines is called etime, which takes a two-element REAL*4 array as an argument and fills the slots with the user CPU time and system CPU time, respectively. The value returned by the function is the sum of the two. Here's how etime is often used:
      real*4 tarray(2), etime
      real*4 start, finish
*
      start = etime(tarray)
*     ... the code being timed goes here ...
      finish = etime(tarray)
*
      write (*,*) 'CPU time: ', finish - start
Not every vendor supplies an etime function; in fact, one doesn't provide a timing routine for FORTRAN at all. Try it first. If it shows up as an undefined symbol when the program is linked, you can use the following C routine. It provides the same functionality as etime:
#include <sys/times.h>
#define TICKS 100.

float etime (parts)
struct {
    float user;
    float system;
} *parts;
{
    struct tms local;
    times (&local);
    parts->user = (float) local.tms_utime/TICKS;
    parts->system = (float) local.tms_stime/TICKS;
    return (parts->user + parts->system);
}
There are a couple of things you might have to tweak to make it work. First of all, linking C routines with FORTRAN routines on your computer may require you to add an underscore (_) after the function name. This changes the entry to float etime_ (parts). Furthermore, you might have to adjust the TICKS parameter. We assumed that the system clock had a resolution of 1/100 of a second (true for the Hewlett-Packard machines that this version of etime was written for). 1/60 is very common. On an RS-6000 the number would be 1000. You may find the value in a file named /usr/include/sys/param.h on your machine, or you can determine it empirically.
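On POSIX systems, the tick rate used by times can also be queried at run time with sysconf(_SC_CLK_TCK), which avoids hard-coding TICKS. The following is a minimal sketch under that assumption, not the routine from the text; the names hpc_ticks_per_second and etime_sysconf are hypothetical, and the array argument stands in for the two-element REAL*4 array that a FORTRAN caller would pass.

```c
#include <sys/times.h>  /* times, struct tms */
#include <unistd.h>     /* sysconf, _SC_CLK_TCK */

/* Hypothetical helper: clock ticks per second as reported by the
   system, or -1.0 if the query fails. */
double hpc_ticks_per_second(void)
{
    long ticks = sysconf(_SC_CLK_TCK);
    return (ticks > 0) ? (double) ticks : -1.0;
}

/* Hypothetical variant of etime that scales by the queried tick rate
   instead of a compile-time TICKS constant. */
float etime_sysconf(float parts[2])
{
    struct tms local;
    double ticks = hpc_ticks_per_second();

    /* On failure, report zero rather than a misleading value. */
    if (ticks <= 0.0 || times(&local) == (clock_t) -1) {
        parts[0] = 0.0f;
        parts[1] = 0.0f;
        return 0.0f;
    }
    parts[0] = (float) (local.tms_utime / ticks);  /* user CPU seconds */
    parts[1] = (float) (local.tms_stime / ticks);  /* system CPU seconds */
    return parts[0] + parts[1];
}
```

As with etime itself, calling this from FORTRAN may require renaming the entry point with a trailing underscore, depending on your compiler's naming convention.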
A C routine for retrieving the wall time by calling gettimeofday is shown below. It is suitable for use with either C or FORTRAN programs because it takes a pointer argument, which matches the call-by-reference parameter passing that FORTRAN uses:
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

void hpcwall(double *retval)
{
    static long zsec = 0;
    static long zusec = 0;
    double esec;
    struct timeval tp;
    struct timezone tzp;

    gettimeofday(&tp, &tzp);
    if ( zsec == 0 ) zsec = tp.tv_sec;
    if ( zusec == 0 ) zusec = tp.tv_usec;
    *retval = (tp.tv_sec - zsec) + (tp.tv_usec - zusec ) * 0.000001 ;
}

void hpcwall_(double *retval) { hpcwall(retval); } /* Other convention */
Given that you will often need both CPU and wall time, and you will be continually computing the difference between successive calls to these routines, you may want to write a routine to return the elapsed wall and CPU time upon each call as follows:
      SUBROUTINE HPCTIM(WTIME,CTIME)
      IMPLICIT NONE
*
      REAL WTIME,CTIME
      COMMON/HPCTIMC/CBEGIN,WBEGIN
      REAL*8 CBEGIN,CEND,WBEGIN,WEND
      REAL ETIME,CSCRATCH(2)
*
      CALL HPCWALL(WEND)
      CEND=ETIME(CSCRATCH)
*
      WTIME = WEND - WBEGIN
      CTIME = CEND - CBEGIN
*
      WBEGIN = WEND
      CBEGIN = CEND
      END
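For C programs, the same idea of reporting the wall and CPU time elapsed since the previous call can be sketched as follows. This is an illustrative analogue rather than a translation of HPCTIM: the names hpctim_c, wall_seconds, and cpu_seconds are hypothetical, the tick rate is assumed to be available through sysconf(_SC_CLK_TCK), and, unlike the FORTRAN version, the first call explicitly sets the baseline and reports zero for both values.

```c
#include <stddef.h>     /* NULL */
#include <sys/time.h>   /* gettimeofday, struct timeval */
#include <sys/times.h>  /* times, struct tms */
#include <unistd.h>     /* sysconf, _SC_CLK_TCK */

/* Hypothetical helper: wall-clock seconds since the epoch. */
static double wall_seconds(void)
{
    struct timeval tp;
    gettimeofday(&tp, NULL);
    return (double) tp.tv_sec + (double) tp.tv_usec * 1.0e-6;
}

/* Hypothetical helper: user + system CPU seconds used by this process,
   or 0.0 if the tick rate or the times() call is unavailable. */
static double cpu_seconds(void)
{
    struct tms local;
    long ticks = sysconf(_SC_CLK_TCK);

    if (ticks <= 0 || times(&local) == (clock_t) -1)
        return 0.0;
    return (double) (local.tms_utime + local.tms_stime) / (double) ticks;
}

/* Store the wall and CPU seconds elapsed since the previous call.
   The first call only records the baseline and reports zero for both. */
void hpctim_c(double *wall_elapsed, double *cpu_elapsed)
{
    static int started = 0;
    static double wbegin, cbegin;
    double wend = wall_seconds();
    double cend = cpu_seconds();

    if (!started) {
        wbegin = wend;
        cbegin = cend;
        started = 1;
    }
    *wall_elapsed = wend - wbegin;
    *cpu_elapsed = cend - cbegin;

    /* The end of this interval becomes the start of the next one. */
    wbegin = wend;
    cbegin = cend;
}
```

A typical pattern is to call hpctim_c once before the region of interest to reset the baseline, then once after it to read the elapsed values.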
You can get a lot of information from the timing facilities on a UNIX machine. Not only can you tell how long it takes to perform a given job, but you can also get hints about whether the machine is operating efficiently, or whether there is some other problem that needs to be factored in, such as inadequate memory.
Once the program is running with all anomalies explained away, you can record the time as a baseline. If you are tuning, the baseline will be a reference with which you can tell how much (or little) tuning has improved things. If you are benchmarking, you can use the baseline to judge how much overall incremental performance a new machine will give you. But remember to watch the other figures, such as paging and CPU utilization. These may differ from machine to machine for reasons unrelated to raw CPU performance. You want to be sure you are getting the full picture.