HPC industry leaders develop state-of-the-art network communication framework for next generation programming models


14 Jul 2015 Frankfurt - Mellanox Technologies has entered into a collaboration with the Department of Energy’s Oak Ridge National Laboratory (ORNL), IBM, the University of Tennessee, NVIDIA, and other industry leaders, laboratories and academic institutions to develop a new open-source network communication framework for high-performance and data-centric applications.

Traditionally, three mainstream communication frameworks have supported various interconnect technologies and programming languages: MXM, developed by Mellanox Technologies; PAMI, developed by IBM; and UCCS, developed by ORNL, the University of Houston, and the University of Tennessee. The new framework, UCX (Unified Communication X), will combine the strengths and capabilities of these libraries into a single optimized communication framework that delivers essential building blocks for the development of a high-performance communication ecosystem.

"As we drive towards next generation, larger scale systems, the UCX project enables the research needed for emergent exascale programming models that are agnostic to the underlying interconnect and acceleration technology", stated Dr. Arthur Bernard Maccabe, division director, Computer Science and Mathematics Division, Oak Ridge National Laboratory.

"Mellanox is very happy to participate in the co-design efforts of the UCX project. By providing our advancements in shared memory, MPI and underlying network transport technologies, we can continue to advance open standards-based networking and programming models", stated Gilad Shainer, vice president of marketing at Mellanox Technologies. "UCX will provide optimizations for lower software overhead in communication paths that will allow cross platform near native-level interconnect performance. The framework interface will expose semantics that target not only HPC programming models, but data-centric applications as well. It will also enable vendor independent development of the library."

"UCX is clearly a strategic open-source communication framework for future high-performance systems", stated Jim Sexton, IBM Fellow and Director of Data Centric Systems at IBM. "We are eager to collaborate on UCX with our key OpenPOWER and university partners. In particular, IBM is contributing key innovations from our PAMI high-performance messaging software already in use in several Top10 supercomputing systems."

"UCX is intended to make it faster and easier to add Tesla Accelerated Computing Platform technologies, including GPUDirect RDMA and the NVLink high-speed interconnect, to the HPC communications stack", stated Duncan Poole, director of Platform Alliances at NVIDIA. "We look forward to working with the UCX members to bring new levels of high performance computing solutions to HPC."

"The path to Exascale, in addition to many other challenges, requires programming models where communications and computations unfold together, collaborating instead of competing for the underlying resources. In such an environment, providing holistic access to the hardware is a major component of any programming model or communication library. With UCX, we have the opportunity to provide not only a vehicle for production quality software, but also a low-level research infrastructure for more flexible and portable support for the Exascale-ready programming models", added George Bosilca, Research Director at the Innovative Computing Laboratory, University of Tennessee, Knoxville.

"By serving as a high-performance, low latency communication layer, UCX will enable us to provide applications developers with productive, extreme-scale programming languages and libraries, including Partitioned Global Address Space (PGAS) APIs, such as Fortran Coarrays and OpenSHMEM, as well as OpenMP across multiple memory domains and on heterogeneous nodes", stated Professor Barbara Chapman from the University of Houston and director of CACDS.

The UCX collaboration will be guided by a High-Performance Computing Leadership Team that includes: Dr. Arthur Bernard Maccabe, Division Director, Computer Science and Mathematics Division, Oak Ridge National Laboratory; Donald Becker, Tesla System Architect, NVIDIA; Dr. George Bosilca, Research Director at the Innovative Computing Laboratory, University of Tennessee; Richard Graham, Senior Solutions Architect, Mellanox Technologies; Dr. Sameer Kumar, Research Scientist, Deep Computing and High Performance Computing systems, IBM India Research Lab; Stephen Poole, CTO, Open Software System Solutions; Gilad Shainer, Vice President of Marketing, Mellanox Technologies; and Dr. Sameh Sharkawi, Team Lead, Parallel Environment MPI Middleware at IBM.

The UCX project at ORNL is funded by the United States Department of Defense and uses resources of the Extreme Scale Systems Center located at ORNL. This project is being developed using resources of the Oak Ridge Leadership Computing Facility at ORNL, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.

For more information on the UCX collaboration, visit http://www.openucx.org.
Source: Mellanox