One reasonable approach is to look at low-power CPUs, particularly in the ARM space and at what is coming from the embedded and mobile market. In addition, accelerators are probably the way to go. Both contribute to higher throughput and higher energy efficiency.
The next things the project teams looked into were energy-efficient cooling and high-frequency sensors. Why would you want energy-efficient cooling, Axel Auweter asked. Energy-efficient cooling means you are improving the ability to remove the waste heat that the machines produce from your hot components. If you do so, you can pack things tighter, improve density and improve performance, all of which is good for exascale.
If you improve the heat removal, you can run at higher inlet temperatures. You can then use year-round chiller-less cooling. All the data centres currently have large, energy-hungry chillers that produce cold water to get rid of the heat. If you can get rid of those chillers, you are already saving energy and operational costs, Axel Auweter explained.
Another advantage of having higher temperatures is that you can make better use of the waste heat in winter. There are some experiments where the teams are actually investigating heat reuse for cooling.
Ideally, you can perform single-phase liquid cooling by using water. You can also perform two-phase liquid cooling by using 3M Novec, for instance, as in the bubble baths. In the DEEP-ER project, for three of the big machines (the cluster, the booster and the small energy efficiency evaluator), the team is using aluminum cold plates capable of cooling all the components on the board with water. For the little ASIC evaluator, the team needed a quicker solution because it came quite late in the project. Here, the team uses the 3M Novec technology. This, however, is not suited to a 10-year operational lifetime.
Axel Auweter went on to talk about sensors. He showed how the team performed power monitoring in the MontBlanc project. MontBlanc has a blade architecture with 15 boards. These are interconnected through a midplane. For each of these node cards the team has a fully integrated power monitoring chip that measures the power and supplies the measurement data over a digital I²C bus.
The team aggregates that high-frequency data in an FPGA on the board, which does some averaging. Next, the team forwards one-second average values for each of these nodes to a board management controller. The board management controller is then the source from which this power monitoring data is made available.
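The averaging step can be illustrated in software. The following is a minimal sketch only: the talk describes averaging in an FPGA, and the function name, the sample format and the sample values here are assumptions made for illustration.

```python
from collections import defaultdict


def one_second_averages(samples):
    """Average (timestamp_seconds, watts) samples into one-second buckets.

    Returns a dict mapping each whole second to the mean power measured
    during that second. Timestamps are assumed to be non-negative.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, watts in samples:
        second = int(timestamp)  # bucket index: the whole second
        sums[second] += watts
        counts[second] += 1
    return {second: sums[second] / counts[second] for second in sums}


# Hypothetical high-frequency samples: (time in seconds, power in watts).
samples = [
    (0.00, 4.0), (0.25, 6.0), (0.50, 5.0), (0.75, 5.0),
    (1.00, 8.0), (1.50, 10.0),
]
averages = one_second_averages(samples)
print(averages)
```

Only one value per node and per second leaves the board in this scheme, which keeps the load on the board management controller small while the raw sampling stays fast.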
Axel Auweter then switched over to power consumption monitoring in the DEEP project. He showed a brief overview of the DEEP Booster node card design. The node cards have some sensors of their own. There is a board management controller that is capable of collecting the readings from all the sensors that have been put on these custom-redesigned boards.
All in all, each of these prototypes has well over 1,000 sensors. The team also wanted additional information from sensors outside the system, such as those in the cooling infrastructure around these systems, to be integrated into the global monitoring system. Ideally, you would like these sensors to report at very high frequencies.
Typically, a standard system gives you a reading once per minute because a standard polling model is used: a management server queries one sensor after another. The team needed something at higher frequencies. Ideally, this requires a push model, explained Axel Auweter.
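A small calculation shows why sequential polling limits the reading rate. This is a sketch under assumed numbers: the per-query time of 60 ms is hypothetical and was chosen only so that 1,000 sensors yield roughly one reading per minute.

```python
def polling_interval_ms(num_sensors, ms_per_query):
    """Time for one management server to query every sensor once, in sequence.

    Each sensor is read again only after all others have been queried,
    so this is also the interval between two readings of the same sensor.
    """
    return num_sensors * ms_per_query


# Hypothetical figures: 1,000 sensors, 60 ms per query.
interval_ms = polling_interval_ms(num_sensors=1000, ms_per_query=60)
print(interval_ms / 1000, "seconds between readings of the same sensor")
```

With polling, the interval grows with the number of sensors. In a push model each sensor source sends readings at its own rate, so the interval per sensor no longer depends on how many other sensors exist.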
All this information is aggregated as follows. Both projects have their board management controllers doing the very low-level sensor acquisition. The team also created software connectors for integrating monitoring data that it gets from elsewhere, for example via IPMI, a standard management protocol for servers. SNMP is another standard management protocol the team supports. There is also a software component that aggregates the information obtained from log files.
The team is using a messaging protocol, originating from the Internet of Things space, that is quite useful for its purpose. The team feeds all the information from those various sources at high frequency into a component called the CollectAgent, from where it is stored in a database, explained Axel Auweter.
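The data flow from pushed messages to storage can be sketched in a few lines. This is an in-memory illustration only, not the project's code: the class names, the topic strings and the message layout are hypothetical, and a plain dictionary stands in for the database.

```python
from collections import defaultdict


class InMemoryStore:
    """Stand-in for the database: maps a sensor topic to its readings."""

    def __init__(self):
        self.readings = defaultdict(list)

    def insert(self, topic, timestamp, value):
        self.readings[topic].append((timestamp, value))


class CollectAgentSketch:
    """Receives readings pushed by any source and writes them to the store."""

    def __init__(self, store):
        self.store = store
        self.received = 0

    def on_message(self, topic, timestamp, value):
        # The agent never polls; sources call in whenever they have data.
        self.store.insert(topic, timestamp, value)
        self.received += 1


store = InMemoryStore()
agent = CollectAgentSketch(store)

# Hypothetical sources pushing at their own rates.
agent.on_message("/rack0/node3/power", 0.0, 41.5)   # e.g. a board management controller
agent.on_message("/rack0/node3/power", 0.1, 42.0)
agent.on_message("/infrastructure/cooling/inlet_temp", 0.0, 40.0)  # e.g. an SNMP connector

print(agent.received, "messages stored")
```

Because every source speaks the same message format to the agent, a new kind of sensor only needs a connector that publishes readings; the storage side stays unchanged.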
Scalability is of course a concern. The whole system is designed so that you can duplicate or add database nodes and CollectAgent instances, which enables you to scale the entire set-up linearly. If you are running low on database performance, you can add more database servers for your monitoring.
Since the team uses a distributed key-value store, the data is still visible to the user as if it were one single big database, although it is stored in a distributed fashion. The team can implement a system that stores the monitoring data locally, close to the centres where it is acquired, and nonetheless run global analyses on this data.
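The idea of one logical database spread over several nodes can be sketched as follows. This is a simplified illustration, not the store the team uses: the class name and sensor identifiers are hypothetical, and the placement rule (hash modulo node count) is an assumption. Production key-value stores typically use schemes such as consistent hashing so that adding a node does not move most keys.

```python
import hashlib


class ShardedStore:
    """Several storage nodes presented to the user as a single store."""

    def __init__(self, num_nodes):
        # Each node is modelled as a dict: sensor_id -> list of readings.
        self.nodes = [dict() for _ in range(num_nodes)]

    def _node_for(self, sensor_id):
        # Deterministic placement: hash the key, then pick a node.
        digest = hashlib.sha256(sensor_id.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.nodes)

    def put(self, sensor_id, timestamp, value):
        node = self.nodes[self._node_for(sensor_id)]
        node.setdefault(sensor_id, []).append((timestamp, value))

    def get(self, sensor_id):
        # The caller does not need to know which node holds the data.
        return self.nodes[self._node_for(sensor_id)].get(sensor_id, [])

    def all_sensor_ids(self):
        # A "global" query simply visits every node.
        return sorted(sid for node in self.nodes for sid in node)


store = ShardedStore(num_nodes=3)
store.put("node1/power", 0, 40.0)
store.put("node1/power", 1, 42.0)
store.put("node2/power", 0, 38.5)

print(store.get("node1/power"))
print(store.all_sensor_ids())
```

The user-facing `put` and `get` calls never mention a node, which is what makes the distributed set-up look like a single database.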
Axel Auweter concluded by giving an example from the MontBlanc project: the power trace comparison of the standard MPI version and the tuned OmpSs+OpenCL version of the Himeno benchmark. At first, the application run time was almost 130 seconds. Afterwards, the application finished much faster, and this was achieved with a much shorter development and optimization cycle.
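One way to compare two power traces is to integrate each over time to obtain the energy used per run. The sketch below shows that calculation with the trapezoidal rule. It is an assumed analysis step, not one described in the talk, and the trace values are invented placeholders, not measurements from the Himeno runs.

```python
def energy_joules(trace):
    """Integrate a power trace [(time_s, power_w), ...] with the trapezoidal rule.

    Returns the energy in joules (watts multiplied by seconds).
    """
    total = 0.0
    for (t0, p0), (t1, p1) in zip(trace, trace[1:]):
        total += 0.5 * (p0 + p1) * (t1 - t0)
    return total


# Invented placeholder traces: a longer run at lower power,
# and a shorter run at slightly higher power.
baseline = [(0, 10.0), (1, 10.0), (2, 10.0), (3, 10.0), (4, 10.0)]
tuned = [(0, 12.0), (1, 12.0), (2, 12.0)]

print(energy_joules(baseline), "J for the baseline trace")
print(energy_joules(tuned), "J for the tuned trace")
```

In this made-up case the tuned run draws more power but finishes sooner, so it uses less energy overall. High-frequency power data is what makes such a comparison meaningful for short runs.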
The solution the teams developed for both projects does scale, so when it comes to exascale, more sensors will not pose additional problems.