With this delivery, the DEEP consortium can leverage a supercomputer with a peak performance of 505 TFlop/s and an efficiency of over 3 GFlop/s per Watt. The Eurotech hot water cooling solution allows for additional permanent gains in energy efficiency at data centre level as it guarantees year-round free cooling in all climate zones. The system includes a matching innovative software stack, and six carefully selected grand challenge simulation applications have been optimized to show the full performance potential of the system.
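The two headline figures can be related with simple arithmetic. The sketch below assumes that the peak performance and the efficiency refer to the same operating point, which the announcement does not state; the function name `max_power_kw` is chosen purely for illustration.

```python
# Figures taken from the announcement above.
PEAK_FLOPS = 505e12            # 505 TFlop/s peak performance
EFFICIENCY_FLOPS_PER_W = 3e9   # "over 3 GFlop/s per Watt", i.e. a lower bound

def max_power_kw(peak_flops, flops_per_watt):
    """Upper bound on power draw (kW) implied by a lower bound on efficiency.

    Assumption: both figures describe the same operating point.
    """
    return peak_flops / flops_per_watt / 1000.0

power_kw = max_power_kw(PEAK_FLOPS, EFFICIENCY_FLOPS_PER_W)
print(f"Implied power draw at peak: below about {power_kw:.0f} kW")
```

Under that assumption, the system would draw less than roughly 168 kW at peak.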
This Cluster-Booster architecture guarantees maximum flexibility and scalability with very high energy efficiency. The 3D Booster interconnect can be scaled up arbitrarily, and thus the DEEP system is a sound base for extrapolating to Exascale performance levels.
"DEEP features a unique and novel computing and programming concept", stated professor Thomas Lippert from Forschungszentrum Jülich. "Its Cluster-Booster architecture is optimized for problems with challenging computational complexities, highest data integration demands and unlimited scalability at the same time. DEEP is designed to open the door to extreme scale computing for a much wider range of scientifically, economically and societally relevant application fields than any other architecture could achieve before."
Thanks to its architecture, the DEEP system dynamically matches the characteristics of application sections and the compute resources they run on. Highly scalable code parts with regular compute and communication patterns run on the Booster manycore coprocessors with high SIMD performance, while application sections with limited scalability or irregular patterns that require high single-thread performance execute on a general purpose multi-core Cluster. Both parts are connected with a high-speed, zero-copy network bridge, and arbitrary numbers of Booster and Cluster processors can be combined to best run an application.
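As a rough illustration of this matching, the following Python sketch assigns application sections to one side of the machine based on the two traits named above. The `Section` class, its fields and the `place` function are hypothetical and are not part of the DEEP software stack; they only mirror the rule described in the text.

```python
from dataclasses import dataclass

@dataclass
class Section:
    """Hypothetical description of one part of an application."""
    name: str
    highly_scalable: bool   # scales well to many cores
    regular_pattern: bool   # regular compute and communication patterns

def place(section):
    """Pick the side of the machine a section would run on.

    Highly scalable, regular code goes to the Booster; everything else
    (limited scalability or irregular patterns) stays on the Cluster.
    """
    if section.highly_scalable and section.regular_pattern:
        return "booster"
    return "cluster"

# Example sections, invented for illustration only.
sections = [
    Section("stencil_update", highly_scalable=True, regular_pattern=True),
    Section("mesh_setup", highly_scalable=False, regular_pattern=True),
    Section("graph_traversal", highly_scalable=True, regular_pattern=False),
]
placement = {s.name: place(s) for s in sections}
print(placement)
```

Only the first example section meets both criteria and would be placed on the Booster in this sketch.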
In late 2012, Eurotech delivered the Cluster part of the DEEP supercomputer, an Aurora Tigon Cluster with 128 Intel Xeon nodes and an InfiniBand interconnect.
The recent installation extended the DEEP machine with the Booster, an innovative, highly scalable system. The Booster is a 384-node system whose nodes are linked in a direct-switched 3D torus using the EXTOLL interconnect. Each Booster node has one Intel Xeon Phi coprocessor, connected via PCI Express to an EXTOLL NIC, which enables the 3D torus network. Each blade is made up of two nodes assembled together and cooled with the Eurotech Direct Hot Water Cooling technology. This design guarantees hot pluggability, uniform cooling of all components and high energy efficiency at the system and data centre level.
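To show what a 3D torus means for a node, the sketch below computes the six direct neighbours of a node and the minimum hop count between two nodes. The dimensions 8 x 6 x 8 are an assumption: the announcement gives only the total of 384 nodes, and this is just one factorization that matches it. The function names are likewise illustrative.

```python
# Assumed torus dimensions; the source states only 384 nodes in a 3D torus.
DIMS = (8, 6, 8)

def neighbours(coord, dims=DIMS):
    """Return the six direct neighbours of a node, wrapping around each axis."""
    result = []
    for axis in range(3):
        for step in (-1, 1):
            n = list(coord)
            n[axis] = (n[axis] + step) % dims[axis]
            result.append(tuple(n))
    return result

def hop_distance(a, b, dims=DIMS):
    """Minimum number of link hops between two nodes on the torus."""
    total = 0
    for x, y, size in zip(a, b, dims):
        d = abs(x - y)
        # Going the other way around the ring may be shorter.
        total += min(d, size - d)
    return total

node_count = DIMS[0] * DIMS[1] * DIMS[2]
print(node_count)
print(neighbours((0, 0, 0)))
print(hop_distance((0, 0, 0), (7, 5, 7)))
```

Because every axis wraps around, a node at the "edge" of the grid has the same number of neighbours as any other node, and the node at the opposite corner is only three hops away.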
The Cluster and the Booster interconnect networks - fat-tree InfiniBand and EXTOLL direct-switched 3D torus - are bridged by Eurotech-designed Booster Interface nodes running a DEEP-designed bridging protocol. These nodes use low-power Intel Xeon CPUs, which boot the Intel Xeon Phi coprocessors and coordinate the data flow between the two sides of the DEEP system.
Since energy efficiency is universally perceived as a key challenge on the way to Exascale, DEEP made this topic a project focus. To this end, Eurotech installed a second DEEP prototype system - the DEEP Energy Efficiency Evaluator consisting of 8 Booster and 4 Cluster nodes - at project partner Leibniz Supercomputing Centre of the Bavarian Academy of Sciences and Humanities. This system is dedicated to optimizing the power consumption of the DEEP machine and testing an innovative, scalable and fine-grained monitoring system developed within the project.
"The completion of the DEEP Booster delivery is a key milestone in the development of the novel 'Cluster-Booster' concept developed in DEEP", stated Fabio Gallo managing director HPC at Eurotech. "It is going to be further enhanced in DEEP-ER. We believe this architectural innovation for extreme scale HPC systems is a key enabler for future European Exascale projects."
While DEEP enters production, Eurotech and the DEEP partners are already engaged in the follow-up project DEEP-ER. With this project, the consortium takes the Cluster-Booster concept of DEEP to the next level. The DEEP-ER Booster uses the second generation Intel Xeon Phi manycore CPUs, allowing the team to bring additional innovation and more flexibility to the DEEP supercomputing architecture.
The DEEP prototype system will remain in use at Jülich Supercomputing Centre for at least the next two years and will also be made available to application developers outside the project. To best exploit this innovative hardware architecture, the project has developed a standards-based software stack for application developers that emphasizes ease of use. It features a fully standard-compliant MPI-2 implementation to facilitate straightforward porting of applications, and extends the OmpSs task-based programming model with scalable offload functionality to simplify the subdivision of applications into parts that run on the Cluster and parts that run on the Booster.
The system is designed as a general-purpose HPC machine. It is especially interesting for HPC applications that combine parts with different scalability characteristics, as the DEEP software stack enables dynamic offloading of the highly scalable parts to the Booster, whereas code parts with low to medium scalability run on the Cluster.