/lib/dld.sl: Bind-on-reference call failed
/lib/dld.sl: Invalid argument
(This example is from HP-UX.)
ld.so: libc.so.2: not found
(This example is from SunOS 4.1; similar things happen on other systems.)
A: The problem here is that your program is using shared libraries, and the libraries are not available on some of the machines that you are running on. To fix this, relink your program without the shared libraries by adding the appropriate command-line options to the link step. For example, for the HP system that produced the errors above, the fix is to add -Wl,-Bimmediate to the link step. For SunOS, the appropriate option is -Bstatic.
A: We have seen this problem with installations using AFS. The remote shell program, rsh, supplied with some AFS systems seems to limit the number of jobs that can use standard output. This seems to prevent some of the processes from exiting as well, causing the job to hang. There are four possible fixes:
2. Use the secure server (serv_p4). See the discussion in the Users Guide.
3. Redirect all standard output to a file. The MPE routine MPE_IO_Stdout_to_file may be used to do this.
4. Get a fixed rsh command. The likely source of the problem is an incorrect usage of the select system call in the rsh command. If the code is doing something like
    int mask;
    mask |= 1 << fd;
    select( fd+1, &mask, ... );
instead of
    fd_set mask;
    FD_SET(fd,&mask);
    select( fd+1, &mask, ... );
then the code is incorrect (the select call changed to allow more than 32 file descriptors many years ago, and the rsh program (or programmer!) hasn't changed with the times).
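For readers who want to check their own code against this pattern, the fd_set form of the select call can be written out as a complete function. The following is only a minimal sketch, not code taken from rsh or from mpich: the names wait_readable and demo_pipe_ready are hypothetical, and the sketch assumes a POSIX system that provides <sys/select.h>. Unlike the fragment in the text, it also clears the set with FD_ZERO before use and rejects descriptors that do not fit in an fd_set.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper (not part of rsh or mpich): wait up to timeout_ms
   milliseconds for fd to become readable.
   Returns 1 if readable, 0 on timeout, -1 on error or unusable fd. */
int wait_readable(int fd, int timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    int n;

    /* An fd_set can only describe descriptors below FD_SETSIZE. */
    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;

    FD_ZERO(&readfds);      /* start from an empty set */
    FD_SET(fd, &readfds);   /* add the one descriptor we care about */

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    n = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return (n > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}

/* Hypothetical demonstration: a pipe with one byte pending is readable. */
int demo_pipe_ready(void)
{
    int p[2];
    int result = -1;

    if (pipe(p) != 0)
        return -1;
    if (write(p[1], "x", 1) == 1)
        result = wait_readable(p[0], 1000);
    close(p[0]);
    close(p[1]);
    return result;
}
```

Because the set is an fd_set rather than a single int, the same function works for any descriptor below FD_SETSIZE, not just the first 32.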
2. Q: Not all processes start.
A: This can happen when using the ch_p4 device and a system that has extremely small limits on the number of remote shells you can have. Some systems using "Kerberos" (a network security package) allow only three or four remote shells; on these systems, the size of MPI_COMM_WORLD will be limited to the same number (plus one if you are using the local host).
The only way around this is to try the secure server; this is documented in the mpich installation guide. Note that you will have to start the servers "by hand", since the chp4_servs script uses a remote shell to start the servers.