DEEP - advanced high performance computing with Knights Corner booster

20 Jun 2012 Hamburg - Ulrich Brüning from the University of Heidelberg presented the European Union-funded DEEP project. The goal is to combine two types of systems so that less scalable machines send data to a booster of highly scalable systems. To this end, DEEP will develop a scalable interconnect for a booster with Knights Corner processors.

Ulrich Brüning started his talk with Moore's Law, which says that the number of transistors per area doubles every 1.5 years. This law has held for 40 years, but it is based on silicon technology. Meuer's Law, however, states that supercomputer performance increases by a factor of 1000 in 10 years.
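To put the two growth rates side by side, here is a rough sketch of the arithmetic. The 1.5-year doubling period and the factor of 1000 per decade are the figures quoted in the talk; the function names are purely illustrative.

```python
import math


def growth_factor(doubling_period_years: float, span_years: float) -> float:
    """Growth over span_years for a quantity that doubles every doubling_period_years."""
    return 2.0 ** (span_years / doubling_period_years)


def implied_doubling_period(factor: float, span_years: float) -> float:
    """Doubling period that yields the given growth factor over span_years."""
    return span_years / math.log2(factor)


# Moore's Law as quoted: doubling every 1.5 years, seen over one decade.
moore_decade = growth_factor(1.5, 10.0)

# Meuer's Law as quoted: factor 1000 in 10 years, expressed as a doubling period.
meuer_doubling = implied_doubling_period(1000.0, 10.0)

print(f"Moore, 10 years: factor {moore_decade:.0f}")
print(f"Meuer, implied doubling period: {meuer_doubling:.2f} years")
```

With these figures, transistor density grows by a factor of about 100 per decade, so the remaining factor of roughly 10 in supercomputer performance would have to come from something other than transistor density alone.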

In order to respond to this fact, the DEEP project will combine two types of machines: constellation systems and graphics-card-accelerated cluster systems, which are architectures of low to medium scalability, with highly scalable architectures such as IBM Blue Gene/L, IBM Blue Gene/P, and IBM Blue Gene/Q.

Ulrich Brüning expanded on this Cluster-Booster concept. Intel MIC is being used on the EXTOLL network.

The power budget is under control. We can have 10 times more processors at today's power budget. The number of codes that scale and can go to Exascale is decreasing. The extreme component count and the need for power efficiency reduce the reliability of systems. Who will opt for specialized HPC technology? Do we create a market? Ulrich Brüning asked the audience. The processors, as the most expensive component, should be off-the-shelf.

The DEEP project aims at using many-core technology and hot-water cooling. The partners are investigating hierarchical scalability with the DEEP concept. They want to use the right hardware for different application scalability levels and expect between 5 and 10 times more cores within the same power envelope. The resource management is guided by the monitoring of hardware sensors. The DEEP partners will exploit innovative processors and create an interconnect.

Clusters with accelerators are being envisaged. The accelerator needs a host CPU with static assignment. Communication so far has been established via main memory. The PCIe bus turns out to be a bottleneck and requires explicit GPU programming, explained Ulrich Brüning.

If we look at a cluster of accelerators only, the node consists of an accelerator directly connected to the network. Large code parts with regular communication patterns are offloaded.

A stand-alone booster works with an energy-efficient cooling system with hot water. The new programming paradigm is OmpSs from Barcelona.

DEEP is working on the development of a prototype hardware platform. It combines innovative technologies for the booster and aims to improve the energy efficiency of current clusters and to use advanced software-aided cooling technologies.

Knights Corner is the first commercial MIC product. The EXTOLL network will interconnect all Knights Corner processors, which is meant to guarantee low latency and high bandwidth. DEEP will also feature a VELO communication engine, an RMA engine for remote memory access and bulk data transfer, and an SMFU engine for bridging to InfiniBand.

Many status and control registers will be included for access from the host.

Ulrich Brüning explained that there will be a holistic optimization of the communication path and functions. All functions required for communication have to be integrated, and the network device has to be virtualized for many cores. Other features include the optimization of all protocol layers and a NIC that is stateless or keeps minimal state.

The virtualization of a network device means the introduction of a unique identifier, the VPID. Process separation is required for secure communication.

NIC features include VELO, a very fast two-sided messaging engine, and RMA, which provides optimized access to remote memory.

The network supports 64k nodes, hundreds of endpoints per node, an efficient network protocol, and arbitrary direct topologies. An implementation can choose among 6 links. The natural topology is a 3D torus, which allows both adaptive and deterministic routing. Packets that are routed deterministically are delivered in order. There is a total of three virtual channels and four independent traffic classes. Ulrich Brüning also mentioned the support for remote configuration and monitoring of EXTOLL nodes without host interaction.
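Why a 3D torus is the natural fit for 6 links, and why deterministic routing keeps packets in order, can be illustrated with a minimal sketch. It assumes dimension-order routing, a common textbook deterministic scheme; the talk did not say which algorithm EXTOLL actually uses, and all names below are hypothetical.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]


def torus_neighbors(node: Coord, dims: Coord) -> List[Coord]:
    """Return the 6 neighbors of a node: +1 and -1 in each dimension, with wraparound."""
    neighbors = []
    for axis in range(3):
        for step in (1, -1):
            c = list(node)
            c[axis] = (c[axis] + step) % dims[axis]
            neighbors.append(tuple(c))
    return neighbors


def dimension_order_route(src: Coord, dst: Coord, dims: Coord) -> List[Coord]:
    """Deterministic route (assumed scheme): resolve x, then y, then z,
    going the shorter way around each ring."""
    path = [src]
    cur = list(src)
    for axis in range(3):
        size = dims[axis]
        forward = (dst[axis] - cur[axis]) % size
        step = 1 if forward <= size - forward else -1
        while cur[axis] != dst[axis]:
            cur[axis] = (cur[axis] + step) % size
            path.append(tuple(cur))
    return path


dims = (4, 4, 4)
links = torus_neighbors((0, 0, 0), dims)
route = dimension_order_route((0, 0, 0), (3, 1, 0), dims)
print(links)
print(route)
```

Each node has one link per direction in each of the three dimensions, which gives the 6 links. Because the route depends only on source and destination, every packet between the same pair of nodes follows the same path, which is what makes in-order delivery possible; an adaptive scheme would instead pick among several paths depending on load.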

The software architecture is optimized for HPC users and has an OS bypass. The Linux kernel drivers manage the resources. There are low-level API libraries and MPI support.

The network integration is achieved via two full EXTOLL NICs per Booster Node Card (BNC). It is possible to connect to the KNC board via PCIe. EXTOLL is currently available as an FPGA implementation. The transition to an ASIC for the final BNC is expected in the second half of 2012. Ulrich Brüning said this will improve latency by a factor of 2.

The Booster Node Card (BNC) consists of two torus nodes per card with one dimension on the node board.

The network also involves a Booster Interface Card (BIC), a Sandy Bridge generation CPU and chipset, InfiniBand HCA from Mellanox, EXTOLL NIC, and a PCIe switch that enables peer-to-peer communication.

The BNC Subsystem design has a booster interconnect evaluator, including concept validation.

There will be end-to-end signal simulations to validate the design and component choices. The interconnect between the DEEP Cluster and the DEEP Booster will run via the Booster Interface, concluded Ulrich Brüning.

More information is available at the DEEP project website.

Leslie Versweyveld