GridMPI is an open-source, free-software implementation of the standard MPI (Message Passing Interface) library, designed for use on grid systems. MPI gives parallel programs a portable way to exchange messages between processes on distributed-memory parallel computers.
Simple experiments have shown that most MPI benchmarks scale well up to a round-trip latency of 20 milliseconds, which corresponds to a distance of about 500 miles when clusters are connected by fast 1 to 10 Gbps networks. This distance covers the major cities between Tokyo and Osaka in Japan. Applications that are too large for a local cluster can therefore run on multiple clusters in the Grid environment with acceptable performance, but only with an efficient MPI implementation.
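The relation between latency and distance can be checked with a rough propagation-delay estimate. The sketch below is illustrative only: it assumes signals travel through optical fiber at about 200,000 km/s (roughly two thirds of the speed of light in vacuum), and the function names are hypothetical, not part of GridMPI. Under that assumption, propagation alone accounts for only part of a 20 ms budget at 500 miles; the remainder would plausibly be taken up by indirect fiber routes and switching or routing delay, which the source does not break down.

```python
# Rough propagation-delay estimate for a wide-area link between clusters.
# Assumption (not from the GridMPI description): light in optical fiber
# travels at about 200,000 km/s.
FIBER_SPEED_KM_PER_S = 200_000.0
KM_PER_MILE = 1.609344


def propagation_rtt_ms(distance_miles: float) -> float:
    """Round-trip propagation delay in milliseconds over a fiber path."""
    one_way_km = distance_miles * KM_PER_MILE
    return 2.0 * one_way_km / FIBER_SPEED_KM_PER_S * 1000.0


def max_path_miles(rtt_budget_ms: float) -> float:
    """Longest fiber path whose propagation delay alone fits the RTT budget."""
    one_way_s = rtt_budget_ms / 1000.0 / 2.0
    return one_way_s * FIBER_SPEED_KM_PER_S / KM_PER_MILE


if __name__ == "__main__":
    # Propagation share of the round trip at the quoted 500 miles.
    print(f"{propagation_rtt_ms(500):.1f} ms")   # prints "8.0 ms"
    # Upper bound on path length if the whole 20 ms were propagation.
    print(f"{max_path_miles(20):.0f} miles")     # prints "1243 miles"
```

The gap between the 8 ms of pure propagation and the 20 ms figure suggests that the 500-mile estimate already leaves headroom for real network paths, which are longer and slower than an ideal straight fiber.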
Existing implementations are not efficient enough, mainly because they emphasize security features and suffer from TCP performance problems; they are also not designed for Ethernet-based clusters. GridMPI skips the security layers and concentrates on TCP performance, since the institutes housing large clusters tend to have their own networks and dedicated secure links to other institutes.
GridMPI is designed and implemented from scratch, with careful coding and testing and with heterogeneity in mind. It fully conforms to the standard, passes 100% of the functional tests in the large test suites from ANL and Intel (MPI-1.2 level), and supports full heterogeneity. It is also efficient over sockets, which makes it suitable for the Grid as well as for ordinary Ethernet-based clusters.
GridMPI can be used with Globus, Unicore, tools from the NAREGI project, and others. It supports checkpointing on Linux/IA32 platforms, so that long-running applications can be restarted after a failure. For clusters with special communication hardware, it can use the vendor MPI: IBM-MPI, Fujitsu-Solaris-MPI, Intel-MPI, and any MPICH-based MPI.
In the latest release, mpirun has been fixed to submit multiple jobs on a private-address cluster and to invoke the GridVM launcher. The release also adds the environment variable _YAMPI_SOMAXCONN, which specifies the maximum cluster size, and includes minor bug fixes. Overall, GridMPI is an efficient implementation of the standard MPI library for the Grid environment.
Version 2.1.1: N/A