German HPC experts present Nationales Hochleistungsrechnen programme, Hawk and HoreKa systems, and Bavarian HPC roadmap at HPC Status Conference

19 Oct 2020 Almere - From its virtual studio 4 m below sea level in downtown Almere, Primeur Magazine reported from the 10th German HPC Status Conference, which was held online on 1 October 2020. Normally, this is the place for people in Germany to meet and discuss high-performance computing. We focus on a few conference highlights.

The event started with a keynote by Hans-Joachim Bungartz, who is based at TU Munich. He offered a series of remarks on high-performance computing that were not all connected but were all considered important. Because the remarks were so varied, he dubbed his presentation "Ein Kessel Buntes", meaning a kettle with all kinds of ingredients in it, after a famous German television programme that ran for decades.

First, Hans-Joachim Bungartz remarked that there is simulation on the one hand and Artificial Intelligence (AI) and Machine Learning (ML) on the other. Both need high-performance computing: not only the computing itself but also numerical algorithms and workflows. Some people look at models while data scientists look at data, but of course they need to work together. They work from different scientific methods, but in the end they all need high-performance computing to run their jobs efficiently. Simulation and artificial intelligence go hand in hand: they are complementary and not orthogonal. One example that Hans-Joachim Bungartz mentioned is the call within the AI community to give the field a more mathematical foundation.

Another topic in the kettle was algorithms, which are the essence of everything. Without algorithms, you do not achieve performance. This was illustrated by yet another item that he took out of his kettle, namely co-design, but done in the right way. Currently, many people work in a co-design fashion, taking applications and bringing in hardware people to design new architectures together. Hans-Joachim Bungartz warned not to forget to include algorithms as well, alongside people and systems.

The whole range has to be covered, including algorithms, because algorithms also have to change to fit new architectures. Vice versa, new architectures have to be adapted so that they can run the most efficient algorithms.

The next topic was the SPPEXA programme. This is the German programme for high-performance computing towards exascale, focused on applications and software. The programme has been running for quite some time and has produced a lot of results. It covers a number of research directions, ranging from computational algorithms to software tools, application software, and data management. Hans-Joachim Bungartz showed that it is not limited to just one application area: there are applications and projects in computer science, biology, and physics, to name a few. The exascale software community is quite international, and so are the SPPEXA consortium and its projects. They work together with people from Japan, France, and other countries.

Hans-Joachim Bungartz also looked into the future, which will start soon for Germany when the "Nationales Hochleistungsrechnen" (NHR) programme is adopted and implemented. NHR is a new way of organizing the supercomputing centres in Germany. The decision about what this will look like will be taken in November 2020. Work on the planning started some time ago. Hans-Joachim Bungartz eagerly awaits the moment when the ideas about governance and operation that were submitted during the past years become reality.

After the keynote there were several other presentations. One that we want to highlight was about Hawk, the supercomputer that is currently being built in Stuttgart. Professor Michael Resch explained that the Hawk system is delivered by HPE. It has been installed in recent months and we expect it to be fully ready for production later this month if the plans do not change. Michael Resch said that there are a number of application areas. The supercomputing centre HLRS in Stuttgart focuses mainly on engineering applications to support engineers, and the research at HLRS is focused on this as well. Apart from cybersecurity, of course, high-performance computing is an important topic.

Michael Resch showed several examples of what they do at HLRS. In April-May 2020, they designed a digital twin of a small city called Herrenberg. With a digital twin, you can do a lot of research, such as simulating the traffic flow or visualizing all kinds of other flows and activities, but also mapping social aspects such as the mental state of the people living in the city, measuring how happy they are. The digital twin makes it possible to analyze data in a virtual reality environment, in this case that of a small city. More information about the Herrenberg digital twin is available at the HLRS website.

Michael Resch continued by explaining that the Hawk system is the result of a very long process. As part of that process, HLRS asked the vendors to submit a tender. Among the tender criteria, sustained performance carried a weight of 40 percent. The core benchmarks that measure performance for the TOP500 systems, such as HPL and HPCG, are not that important, according to Michael Resch: they account for only about eight percent of the total criteria. The total cost of ownership counted for 10 percent for Hawk. With the next system this share will probably be higher because of the increasing cost of energy and the growing awareness of energy efficiency.
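To see how little the TOP500-style benchmarks matter under such a scheme, the weighting can be sketched as a simple weighted sum. This is only an illustration, not the HLRS evaluation procedure: the three weights are the ones quoted above, while the "other" bucket (the unspecified remainder), the 0-to-1 scoring scale, and both bids are hypothetical.

```python
# Weights quoted in the article for the Hawk tender.
WEIGHTS = {
    "sustained_performance": 0.40,
    "hpl_hpcg": 0.08,
    "total_cost_of_ownership": 0.10,
}
# Assumption: everything the article does not itemize is lumped into "other".
WEIGHTS["other"] = 1.0 - sum(WEIGHTS.values())


def tender_score(scores: dict[str, float]) -> float:
    """Weighted sum of per-criterion scores, each assumed to lie in [0, 1]."""
    return sum(weight * scores[name] for name, weight in WEIGHTS.items())


# Hypothetical bids: A is strong on sustained performance,
# B is tuned for the HPL/HPCG benchmarks.
bid_a = {"sustained_performance": 0.9, "hpl_hpcg": 0.5,
         "total_cost_of_ownership": 0.7, "other": 0.7}
bid_b = {"sustained_performance": 0.6, "hpl_hpcg": 1.0,
         "total_cost_of_ownership": 0.7, "other": 0.7}

print(f"bid A: {tender_score(bid_a):.3f}")
print(f"bid B: {tender_score(bid_b):.3f}")
```

In this made-up comparison, a perfect benchmark score does not compensate for weaker sustained performance, which is the effect the weighting is meant to have.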

The Hawk system is a 44 million euro machine. It has 720,896 AMD Rome cores. Initially, the idea was to use Intel cores, but this plan was later dropped in favour of AMD Rome cores when they became available. The peak performance is 26 petaflops. More important is that the sustained performance should be over two petaflops. There will be a considerable amount of main memory and disk space.
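As a rough back-of-the-envelope check on these figures, the sketch below derives the peak performance per core and the share of peak that the sustained target represents. It assumes that "26 petaflops" means 26e15 floating-point operations per second and takes the sustained target at its lower bound of two petaflops; the function names are only illustrative.

```python
# Figures quoted in the article for the Hawk system.
CORES = 720_896
PEAK_FLOPS = 26e15       # 26 petaflops peak
SUSTAINED_FLOPS = 2e15   # "over two petaflops": lower bound of the target


def peak_per_core_gflops(peak_flops: float, cores: int) -> float:
    """Peak performance per core in GFLOP/s."""
    return peak_flops / cores / 1e9


def sustained_fraction(sustained_flops: float, peak_flops: float) -> float:
    """Sustained performance as a fraction of peak."""
    return sustained_flops / peak_flops


per_core = peak_per_core_gflops(PEAK_FLOPS, CORES)
fraction = sustained_fraction(SUSTAINED_FLOPS, PEAK_FLOPS)

print(f"Peak per core: {per_core:.1f} GFLOP/s")
print(f"Sustained target as share of peak: {fraction:.1%}")
```

With the quoted numbers this comes to roughly 36 GFLOP/s of peak per core, and a sustained target of just under 8 percent of peak.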

Michael Resch also addressed the difficulties they faced because a big part of the machine was installed during the COVID-19 crisis. The experts from the United States who were supposed to install the machine could not fly over to Germany. Instead, people from HLRS and HPE in Germany had to perform the installation, which proved to be a challenge since guidance could only be given over the Internet. Someone in the conference audience asked why the vendors do not have people in Germany who can take care of the installation. According to Michael Resch this would be too expensive: nobody wants to pay installation experts for a whole year while they wait until they can do their work. That would be a waste of time and money.

A third presentation was given by Jennifer Buchmüller from the Karlsruhe Institute of Technology (KIT). KIT is currently installing the HoreKa supercomputing system. HoreKa will be ready for production in the second part of 2021. It is a 15 million euro system, and KIT expects it to end up among the ten fastest machines in Europe. The system consists of two main parts: one includes GPU accelerators, the other is the main CPU part. The system is built in a modular way, in the sense that there is a section dedicated to high-throughput computing and a section especially for accelerators. There is also a section reserved for future technology, so that other technologies can be introduced and connected to the same hardware.

A fourth presentation was given by Dr. Josef Weidendorfer, who talked about the activities of the Leibniz Rechenzentrum (LRZ) in Munich. About two years ago, the SuperMUC machine was installed, which was just one part of a bigger programme. Bavaria, a state within Germany, has its own HPC programme and has invested hundreds of millions of euros in this programme for future computing. Josef Weidendorfer explained how LRZ is preparing for the next systems. The aim is not to have the fastest systems but to have machines that best support the Bavarian researchers. The plan is to look at applications, isolate their important parts, figure out the best architectures, and match the two. Most of the other organisations have a similar future computing plan. LRZ works together with the research communities in Bavaria on the one hand, and with people in Germany, the European Union, the US, and Japan on the other.

If you want to know more about the 10th HPC Status Conference, you can visit the Gauss Alliance webpage, where the slides of several of the presentations are available.

Ad Emmen