Running Wide-Area programs over the DAS

Panda has built-in support for running parallel programs over the wide-area DAS machine. Most programming libraries that are supported on DAS use Panda as their basic layer, so they all support wide-area programming.

Architecture

Panda assumes that one node in each cluster, the gateway, is dedicated to wide-area forwarding. The gateway is part of your parallel program, so you must reserve one extra node in each cluster, and that node cannot be used for application work. This way, the gateway is connected to the local processors by Myrinet. It handles wide-area communication over TCP, or over Myrinet with tunable delays.

Transparency

The wide-area architecture is largely transparent to the application. Programmers may query the wide-area layout by calling the Panda functions pan_cluster_of(cpu) and pan_cluster_nr(); see the Panda docs (soon to be included). The nodes are (re)numbered so that the gateway processors receive processor numbers above the application maximum. As a rule, libraries do not start the application on the gateway processors. However, if you program directly on top of Panda, see the section "Transparency issue for Panda programs", below.

Simulator

It is a good idea to first run your Wide-Area program on the DAS simulator. It provides switches to simulate wide-area delays: latency and maximum bandwidth can be tuned separately.

These are run-time switches that may be provided to the application.

To create a binary with Wide-Area simulator support, provide a flag -pkt-cluster to your compile script (panc, mpicc, ...).

Architecture specification

If there are n simulated clusters of equal size, specify -pan-cld n to the application. If the clusters must be of different sizes, contact the maintainer of Panda for information. Remember that you have to reserve n additional hosts for the gateways.

Real Wide-Area runs

There is no intelligent scheduler that reserves and runs parallel applications on different clusters. The user must start a prun run on each cluster by hand, all at the same time.

Compilation flags

To enable cluster support in Panda, provide the flag -tcp-cluster to your compile script (panc, mpicc, ...).

There is no longer any need to compile the gateway binaries with different flags. The normal application executable will run the gateway modules if it deduces that it is a gateway (see also the section "Transparency issue for Panda programs", below).

Architecture specification

If there are n clusters of equal size, specify -pan-clst-d my_cluster n total_hosts to the application. Here total_hosts includes the gateways: 4 clusters of 6+1 hosts give total_hosts = 28. If the clusters must be of different sizes, contact the maintainer of Panda for information. Remember that you have to reserve n additional hosts for the gateways.
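
The host arithmetic behind these arguments can be sketched in a few lines of Python. This is only an illustration, assuming equal-sized clusters with exactly one gateway each, as described above. The function name cluster_layout_args is hypothetical and is not part of Panda or prun, and the cluster number passed as my_cluster is a placeholder, since the numbering base is not stated here.

```python
def cluster_layout_args(my_cluster, n_clusters, app_nodes_per_cluster):
    """Return (hosts_per_cluster, total_hosts, flag arguments) for one cluster.

    Assumes equal-sized clusters and exactly one gateway per cluster.
    """
    hosts_per_cluster = app_nodes_per_cluster + 1  # application nodes + gateway
    total_hosts = n_clusters * hosts_per_cluster   # gateways are counted too
    flags = ["-pan-clst-d", str(my_cluster), str(n_clusters), str(total_hosts)]
    return hosts_per_cluster, total_hosts, flags


# 4 clusters with 6 application nodes each, seen from a cluster numbered 0
# (whether numbering starts at 0 or 1 is an assumption; check your setup).
hosts, total, flags = cluster_layout_args(0, 4, 6)
print(hosts, total, " ".join(flags))
```

For 4 clusters of 6 application nodes, this yields 7 hosts to request per cluster and 28 hosts in total.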

The application synchronizes through a system server process that runs on the file server. By default this is the local file server. For wide-area runs, this must be overridden: specify -das-sync-server das0fs.cs.vu.nl to your application. To synchronize, all processes must use an identical key. This key is corrupted whenever your Wide-Area program does not terminate normally; in particular, it is corrupted when startup fails on one or more clusters. Provide a key to your application with -pan-cluster-key key. A good key is, for instance, "your-name:your-application:counter".

Flow control over the shared TCP link must be handled specially: the link between each node and its cluster's gateway must be as fat as the total link capacity towards all off-cluster hosts. Usually, this means that the local link capacity must be reduced. Specify -pan-sys-credits c -pan-gateway-credits g to your application, where you choose c so that the message receive buffer space at any node does not exceed pinned memory (the limit is currently 8000 credits), and g = (n - 1) * c * nodes-per-cluster. Here nodes-per-cluster counts the application nodes only: with n = 4, c = 16 and 6 application nodes per cluster, g = 3 * 16 * 6 = 288.
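
The credit formula is easy to get wrong by hand, so a small helper may be useful. The following Python sketch simply evaluates g = (n - 1) * c * nodes-per-cluster. The function name gateway_credits is hypothetical, and the sketch assumes that nodes-per-cluster means the application nodes, excluding the gateway. It does not check c against the pinned-memory limit.

```python
def gateway_credits(n_clusters, sys_credits, app_nodes_per_cluster):
    """Evaluate g = (n - 1) * c * nodes-per-cluster.

    n_clusters            -- n, the number of clusters
    sys_credits           -- c, the value given to -pan-sys-credits
    app_nodes_per_cluster -- application nodes per cluster (gateway excluded)
    """
    return (n_clusters - 1) * sys_credits * app_nodes_per_cluster


# 4 clusters, c = 16, 6 application nodes per cluster.
g = gateway_credits(4, 16, 6)
print("-pan-sys-credits", 16, "-pan-gateway-credits", g)
```

With these inputs the helper returns 288, the value used for -pan-gateway-credits in the example runs.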

Example MPI program

Compile two binaries with ... Then run on each cluster i of 4 clusters, each with 6+1 hosts (6 application nodes plus 1 gateway):

prun -gateway a.out.ot a.out 7 -pan-clst-d i 4 28 -das-sync-server das0fs.cs.vu.nl -pan-sys-credits 16 -pan-gateway-credits 288 -pan-cluster-key "rutger:mpi-test:33" [user-application-options]

Example Orca program

Compile with ... Then run on each cluster i of 4 clusters, each with 6+1 hosts (6 application nodes plus 1 gateway):

prun a.out 7 -pan-clst-d i 4 28 -das-sync-server das0fs.cs.vu.nl -pan-sys-credits 16 -pan-gateway-credits 288 -pan-cluster-key "rutger:orca-test:33" [user-application-options]
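
Since the same command must be typed on every cluster with only the cluster number changing, it may help to generate it. The Python sketch below assembles the argument list in the order used in the Orca example. It only rearranges flags that appear in this document. The function name prun_command and its parameters are hypothetical, and the cluster number 0 in the usage line is a placeholder for i.

```python
import shlex


def prun_command(binary, hosts_per_cluster, my_cluster, n_clusters, total_hosts,
                 sys_credits, gateway_credits, key,
                 sync_server="das0fs.cs.vu.nl", app_options=()):
    """Build the prun argument list for one cluster of a wide-area run."""
    return [
        "prun", binary, str(hosts_per_cluster),
        "-pan-clst-d", str(my_cluster), str(n_clusters), str(total_hosts),
        "-das-sync-server", sync_server,
        "-pan-sys-credits", str(sys_credits),
        "-pan-gateway-credits", str(gateway_credits),
        "-pan-cluster-key", key,
        *app_options,  # user application options go last
    ]


# Command for one cluster of the 4 x (6+1) example; 0 stands in for i.
cmd = prun_command("a.out", 7, 0, 4, 28, 16, 288, "rutger:orca-test:33")
print(shlex.join(cmd))
```

Every cluster must receive the same key and the same credit values; only the my_cluster argument differs between the commands.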


Transparency issue for Panda programs

If you program directly on top of Panda (without using a standard library or language such as Orca, MPI, Manta or PVM), transparency is not complete: your application is also started on the gateway machines. On the gateways, it must terminate immediately, as follows. The application initializes Panda in the usual way (by calling pan_init(), pan_mp_init() etc., and pan_start()). Then, if your processor number pan_my_pid() is not less than the number of application processors pan_nr_processes(), this process is a gateway. In that case you must immediately terminate the Panda modules by calling pan_mp_end() etc., then pan_end(). After that, no application code may run on the gateway machines.
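
The gateway test itself is a single comparison. The Python model below illustrates it with plain integers standing in for the values of pan_my_pid() and pan_nr_processes(); it does not call Panda. The function name is_gateway is hypothetical, and the sketch assumes that processor numbers start at 0.

```python
def is_gateway(my_pid, nr_app_processes):
    """True when this process must act as a gateway.

    my_pid stands in for pan_my_pid() and nr_app_processes for
    pan_nr_processes(); both are plain integers in this model.
    """
    return my_pid >= nr_app_processes  # "not less than"


# 4 clusters x 6 application nodes = 24 application processes, plus 4 gateways.
# Assumes processor numbers run from 0 to 27.
nr_app = 24
roles = ["gateway" if is_gateway(pid, nr_app) else "application"
         for pid in range(28)]
print(roles.count("application"), roles.count("gateway"))
```

In a real Panda program, the branch taken when the test is true would shut down the Panda modules and return without running any application code.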

This page is maintained by Rutger Hofman at the VU Amsterdam. Last modified: Thu Sep 23 15:11:23 CEST 1999