The Modular Supercomputer Architecture (MSA) is an innovative approach to building High-Performance Computing (HPC) and High-Performance Data Analytics (HPDA) systems by coupling various compute modules, following a building-block principle. Each module is tailored to the needs of a specific group of applications, and all modules together behave as a single machine. This is ensured by connecting them through a high-speed network federation and operating them with uniform system software and a common programming environment. As a result, one application or workflow can be distributed over several modules, with each part of its code running on the best-suited hardware module.
Creating a modular supercomputer that best fits the requirements of diverse, increasingly complex, and newly emerging applications is the objective of DEEP-EST, an EU project launched on July 1, 2017, led and coordinated by the Jülich Supercomputing Centre (JSC). The DEEP-EST project builds a prototype with three compute modules: the Cluster Module (CM), the Extreme Scale Booster (ESB), and the Data Analytics Module (DAM). The CM is a general-purpose cluster that targets applications with low to medium scalability, while the ESB is built as a cluster of accelerators to provide energy-efficient computing power to highly scalable codes. Last, but not least, the DAM addresses the specific needs of Machine/Deep Learning, Artificial Intelligence, and Big Data applications and workloads.
Among the sixteen partners in the DEEP-EST project, MEGWARE is the system manufacturer and integrator of the DEEP-EST MSA prototype system. MEGWARE recently installed the first module, the Cluster Module (CM), at the Jülich Supercomputing Centre. The two remaining compute modules (ESB and DAM) will follow by the end of this year.
The CM is designed to support the full range of general-purpose HPC cluster applications and workloads. For energy efficiency, it is integrated using MEGWARE's ColdCon direct liquid (hot-water) cooling and SlideSX-LC packaging technologies. "Based on several years of intensive experience in energy efficient computing, MEGWARE's award winning direct liquid cooling solution represents a leading European HPC technology that is scalable and sustainable. Energy efficiency is one of several critical design points for future Exascale supercomputer solutions", stated Dr. Herbert Cornelius, Principal System Architect at MEGWARE.
The CM consists of a single rack with 50 dual-socket nodes based on Intel Xeon Scalable processors, connected by a Mellanox EDR InfiniBand 100 Gbps high-performance cluster fabric.
The Cluster Module is the first step in the installation of the DEEP-EST prototype, and an important milestone in JSC's strategy around the Modular Supercomputing Architecture. "We see today how our users increasingly combine different simulation models to reproduce complex phenomena. They also employ both HPC and Data Analytics approaches. This diversifies our user-portfolio enormously, making it hardly possible to fulfill all needs with one supercomputer", stated Prof. Thomas Lippert, director of the Jülich Supercomputing Centre. He added: "The DEEP-EST prototype will demonstrate that a Modular Supercomputer is much more flexible than a monolithic one, and matches very diverse application profiles in a cost-effective way."