The MPI profiling interface provides a convenient way for you to add performance analysis tools to any MPI implementation. We demonstrate this mechanism in mpich, and give you a running start, by supplying three profiling libraries with the mpich distribution.
The first profiling library is simple. The profiling version of each MPI_Xxx routine calls PMPI_Wtime (which delivers a time stamp) before and after each call to the corresponding PMPI_Xxx routine. Times are accumulated in each process and written out, one file per process, in the profiling version of MPI_Finalize. The files are then available for use in either a global or process-by-process report. This version does not take into account nested calls, which occur when MPI_Bcast, for instance, is implemented in terms of MPI_Send and MPI_Recv.
The second profiling library generates logfiles, which are files of timestamped events. During execution, calls to MPE_Log_event store events of certain types in memory, and these memory buffers are collected and merged in parallel during MPI_Finalize. MPI_Pcontrol can be used while the program runs to suspend and restart logging. You can analyze the resulting logfile with a variety of tools. One that we use is upshot, a derivative of Upshot written in Tcl/Tk. A screen dump of upshot in use is shown in Figure 1.
Figure 1: A screendump from upshot
The display shows parallel time lines with process states, like one of the paraGraph displays. The view can be zoomed in or out, horizontally or vertically, centered on any point in the display chosen with the mouse. In Figure 1, the middle window is the result of zooming in on the upper window at a chosen point to show more detail. The window at the bottom of the screen shows a histogram of state durations, with several adjustable parameters.
The third library does a simple form of real-time program animation. The MPE graphics library contains routines that allow a set of processes to share an X display that is not associated with any one specific process. Our prototype uses this capability to draw arrows that represent message traffic as the program runs.