Grid computing on DAS-2

Running an Intel Fortran MPI job on multiple DAS-2 clusters

For a general introduction to running Grid jobs on DAS-2, please refer to the DAS-2 grid information page. Here we will show a similar example, only using the Intel Fortran compiler.

Step 1: Make sure the special MPICH-G2 (Globus) on MPICH-GM (Myrinet) version that is configured for the Intel compilers is in your path, before other MPICH-based implementations

[versto@fs0 MPI]$ which mpif90
[versto@fs0 MPI]$ PATH=/usr/local/mpich/mpich-g2-gm-intel/bin:$PATH
[versto@fs0 MPI]$ which mpif90
[versto@fs0 MPI]$ . /usr/local/intel/compiler81/bin/
[versto@fs0 MPI]$ which ifort

Note: This version of MPICH-G2 (mpich-g2-gm-intel) uses Myrinet-based communication within the local clusters and IP/sockets-based communication between the clusters. If for some reason (e.g., performance comparisons) you want to use IP-based communication within the clusters as well, you can instead use MPICH-G2 version mpich-g2-ip-intel.

Step 2: Compile the code with the MPICH-G2/Intel mpif90

[versto@fs0 MPI]$ cat fpi.f

c   fpi.f - compute pi by integrating f(x) = 4/(1 + x**2)
c   Each node:
c    1) receives the number of rectangles used in the approximation,
c    2) calculates the areas of its rectangles,
c    3) synchronizes for a global summation.
c   Node 0 prints the result.
c  Variables:
c    pi  the calculated result
c    n   number of points of integration.  
c    x           midpoint of each rectangle's interval
c    f           function to integrate
c    sum,pi      area of rectangles
c    tmp         temporary scratch space for global summation
c    i           do loop index
      program main

      include 'mpif.h'

      double precision  PI25DT
      parameter        (PI25DT = 3.141592653589793238462643d0)

      double precision  mypi, pi, h, sum, x, f, a
      integer n, myid, numprocs, i, rc
      character(len=MPI_MAX_PROCESSOR_NAME):: name=''
      integer namelen
      real  start, finish, totalwalltime, traperror

c                                 function to integrate
      f(a) = 4.d0 / (1.d0 + a*a)

      call MPI_INIT( ierr )
      call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
      call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, ierr )
      call MPI_GET_PROCESSOR_NAME(name,namelen, ierr)
c     print *, 'Process ', myid, ' of ', numprocs, ' is alive'
      write(6,5) myid, numprocs, name
      call flush(6)
 5    format('Process ', i2, ' of ', i2, ' on ', a32)

      sizetype   = 1
      sumtype    = 2
      n          = 0
 10   if ( myid .eq. 0 ) then
c        write(6,98)
c98      format('Enter the number of intervals: (0 quits)')
c        read(5,99) n
c99      format(i10)
         if ( n .eq. 0 ) then
            n = 100
         else
            n = 0
         endif
      endif

c                                 broadcast the interval count to all nodes
      call MPI_BCAST(n,1,MPI_INTEGER,0,MPI_COMM_WORLD,ierr)

c                                 check for quit signal
      if ( n .le. 0 ) goto 30

c                                 calculate the interval size
      h = 1.0d0/n

      sum  = 0.0d0
      do 20 i = myid+1, n, numprocs
         x = h * (dble(i) - 0.5d0)
         sum = sum + f(x)
 20   continue
      mypi = h * sum

c                                 collect all the partial sums
      call MPI_REDUCE(mypi,pi,1,MPI_DOUBLE_PRECISION,MPI_SUM,0,
     $     MPI_COMM_WORLD,ierr)

c                                 node 0 prints the answer.
      if (myid .eq. 0) then
         write(6, 97) pi, abs(pi - PI25DT)
 97      format('  pi is approximately: ', F18.16,
     +          '  Error is: ', F18.16)
      endif

      goto 10

 30   call MPI_FINALIZE(rc)
      stop
      end
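
The numerical core of the program (the midpoint rule applied to f(x) = 4/(1 + x**2)) can be sanity-checked serially from the shell with awk. This stand-alone sketch is not part of the DAS-2 workflow; it just runs the single-process version of the "do 20" loop with the same n = 100 intervals as the job below:

```shell
# Midpoint-rule approximation of pi with n = 100 intervals,
# i.e. the "do 20 i = myid+1, n, numprocs" loop for one process.
awk 'BEGIN {
  n = 100; h = 1.0 / n; sum = 0.0
  for (i = 1; i <= n; i++) {
    x = h * (i - 0.5)
    sum += 4.0 / (1.0 + x * x)
  }
  printf "pi is approximately: %.16f\n", h * sum
}'
```

The result agrees with the parallel run (about 3.141601, error about 8.3e-6); only the least significant digits can differ, because the parallel version sums the partial results in a different order.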

[versto@fs0 MPI]$ mpif90 -o fpi_globus fpi.f
   program MAIN

Warning 2 at (248:mpif.h) : Type size specifiers are an extension to standard Fortran 95

      parameter        (PI25DT = 3.141592653589793238462643d0)
Warning 101 at (25:fpi.f) : Constant truncated -- precision too great

      f(a) = 4.d0 / (1.d0 + a*a)
Comment 18 at (34:fpi.f) : The statement function is obsolescent in Fortran 95

         sum = sum + f(x)
Comment 18 at (72:fpi.f) : The statement function is obsolescent in Fortran 95

358 Lines Compiled

Step 3: Create a "machines" file that specifies the CPUs to be used

[versto@fs0 MPI]$ cat machines
"" 4
"" 4

Step 4: Transfer the binary to the same directory on the other DAS-2 sites used

[versto@fs0 MPI]$ rsync -e ssh -avz fpi_globus fs1:`pwd`/
building file list ... done
wrote 283542 bytes  read 36 bytes  567156.00 bytes/sec
total size is 1053318  speedup is 3.71

Step 5: Create a Globus "RSL" file, based on the "machines" file

NOTE: The "sed" command below modifies the environment variable LD_LIBRARY_PATH, so that the processes can find the proper shared libraries of both Globus and the Intel compiler.

[versto@fs0 MPI]$ mpirun -dumprsl -np 8 fpi_globus arg1 arg2 | sed -e 's!LD_LIBRARY_PATH !LD_LIBRARY_PATH /usr/local/intel/compiler81/lib:!' >fpi_globus.rsl
[versto@fs0 MPI]$ cat fpi_globus.rsl 
( &(resourceManagerContact="") 
   (label="subjob 0")
   (environment=(GLOBUS_DUROC_SUBJOB_INDEX 0)
                (LD_LIBRARY_PATH /usr/local/intel/compiler81/lib:/usr/local/globus/globus-3.2/lib/))
   (arguments= "arg1" "arg2")
)
( &(resourceManagerContact="") 
   (label="subjob 4")
   (environment=(GLOBUS_DUROC_SUBJOB_INDEX 1)
                (LD_LIBRARY_PATH /usr/local/intel/compiler81/lib:/usr/local/globus/globus-3.2/lib/))
   (arguments= "arg1" "arg2")
)
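
The effect of the sed substitution above can be checked in isolation on one of the environment lines (a minimal sketch; the sample line is copied from the RSL output):

```shell
# Prepend the Intel compiler library directory to LD_LIBRARY_PATH,
# using the same sed expression as in the -dumprsl pipeline.
echo '(LD_LIBRARY_PATH /usr/local/globus/globus-3.2/lib/)' |
  sed -e 's!LD_LIBRARY_PATH !LD_LIBRARY_PATH /usr/local/intel/compiler81/lib:!'
# prints: (LD_LIBRARY_PATH /usr/local/intel/compiler81/lib:/usr/local/globus/globus-3.2/lib/)
```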

Step 6: Run the job

[versto@fs0 MPI]$ mpirun -globusrsl fpi_globus.rsl 
Process  4 of  8 on           
Process  5 of  8 on           
Process  6 of  8 on           
Process  7 of  8 on           
Process  1 of  8 on           
Process  0 of  8 on           
Process  2 of  8 on           
Process  3 of  8 on           
  pi is approximately: 3.1416009869231245  Error is: 0.0000083333333314

Advanced School for Computing and Imaging

This page is maintained by Kees Verstoep. Last modified: Fri Jan 27 20:27:31 CET 2006