CRESTA fits rather nicely among the DEEP, DEEP-ER and Mont-Blanc projects. For CRESTA, the exascale problem is both a hardware and a software challenge. Based on current technology roadmaps it will be impossible to build an exascale system that draws less than 50 MW, Michele Weiland explained. GPUs, Xeon Phi and traditional multi-core microprocessors cannot reach the 20 MW target at this point in time. The solution is to rebalance the systems that we have by using simpler processors, simpler memory hierarchies and better communication links. The consequence is a large increase in the amount of parallelism. If you scale up today's leading system, Tianhe-2, to the exascale, you end up with 92 million cores drawing 526 MW. However, if you slow the system down you get better balanced cores, with parallelism at the scale of 500 million to 1 billion threads.
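The 92-million-core, 526 MW figure is a straightforward linear extrapolation from Tianhe-2's published Top500 characteristics (roughly 33.9 Pflop/s Linpack, 3.12 million cores, 17.8 MW). A minimal sketch of that back-of-envelope arithmetic:

```python
# Back-of-envelope linear scaling of Tianhe-2 (approximate Top500
# figures) to 1 Eflop/s, reproducing the numbers quoted in the talk.
tianhe2_pflops = 33.86      # Linpack Rmax in Pflop/s
tianhe2_cores = 3_120_000   # total cores
tianhe2_mw = 17.8           # power draw in MW

exaflop_pflops = 1000.0                  # 1 Eflop/s = 1000 Pflop/s
scale = exaflop_pflops / tianhe2_pflops  # ~29.5x

cores_at_exascale = tianhe2_cores * scale  # ~92 million cores
power_at_exascale = tianhe2_mw * scale     # ~526 MW

print(f"scale factor: {scale:.1f}x")
print(f"cores: {cores_at_exascale / 1e6:.0f} million")
print(f"power: {power_at_exascale:.0f} MW")
```

Naive scaling of course assumes the architecture stays fixed, which is exactly the point of the argument: without rebalancing, the power budget is untenable.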
This means that hardware is leaving software behind: developers do not know how to exploit this degree of parallelism, and the algorithms are not designed for it. Change is very difficult to instigate. We are currently at a tipping point in HPC, Michele Weiland told the audience. We cannot continue simply re-writing the applications; a fundamental change in the algorithms is needed. This is already a problem at the petascale. Software, and how we model and simulate, remain the key challenges for exascale.
CRESTA is a two-strand project, looking at both systemware and co-design applications. The co-design applications provided guidance and feedback to the systemware development process, in a cyclical exchange with the application developers talking to the system developers and vice versa. For exascale, incremental changes are necessary, but so are disruptive solutions. With the incremental approach the team aimed for maximum performance at exascale through optimisations, performance modelling and co-design application feedback. With the disruptive approach it was crucial to understand the maximum achievable performance by considering alternative algorithms before major application redesigns were undertaken. This is particularly true for applications that are at the limit of scaling today. The CRESTA project has also been committed to open source for interfaces, standards and new software; many white papers and case studies are available on the project's website.
However, getting people to think disruptively is very difficult, according to Michele Weiland. The major codes are often developed over long periods of time, and disruptive change at their core is complex, costly and time-consuming. The CRESTA team has shown that in some cases the investment is nevertheless needed. Two examples are IFS and OpenFOAM.
CRESTA has at times been criticised as not being proper co-design, but you do not need to build hardware to use its parameters to inform the co-design process. The CRESTA team worked by assuming massive parallelism of more than 100 million threads, probably heterogeneous (for example a CPU plus an accelerator), complex multi-layer memory and I/O hierarchies, and non-uniform network topologies and performance. This work has also informed hardware design through CRESTA's HPC vendor partner Cray, as well as other vendors who read or heard about CRESTA's work.
With regard to hardware developments, Michele Weiland said that we are now beginning to see how vendors are approaching the exascale. A key period will be 2017-2019, when a range of new technical solutions will hit the market. Some people claim that CRESTA is overestimating thread counts with its figure of 500 million. A key challenge will be turning processor- and node-level developments into systems at scale; there are few vendors today that can build reliable large systems. There is a real need for greater engagement between application, systemware and hardware designers.
At present, in 2015, the fastest machine in Europe delivers 6.3 Petaflop/s on Linpack and the fastest machine in the world 33.9 Petaflop/s. This is way below the target in the EC call text that funded the DEEP and Mont-Blanc projects, which speaks of 100 Petaflop/s in 2014. We will perhaps only reach this at the beginning of 2017. Exascale was predicted by 2020 in the call text, but the goal keeps being pushed ahead of us and has already slipped to 2022-2024. The only country with a published plan for 'close to' exascale targeting 2020 is Japan.
The three challenges for reaching exascale are power, parallelism and planning. The required power of around 50-60 MW will raise the annual electricity bill to about 70 million euro. As for planning, software does not change overnight: it will take a large community effort, with investments in both software and hardware.
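The 70-million-euro figure is consistent with a simple energy calculation; a minimal sketch, where the electricity price of about 0.13 EUR/kWh is an assumption (not stated in the talk) chosen to match the quoted bill:

```python
# Rough annual electricity cost for a ~60 MW exascale system.
# The price per kWh is an assumed industrial rate, not from the talk.
power_mw = 60.0
hours_per_year = 24 * 365     # 8760 hours
eur_per_kwh = 0.13            # assumed electricity price in EUR/kWh

mwh_per_year = power_mw * hours_per_year       # 525,600 MWh per year
cost_eur = mwh_per_year * 1000 * eur_per_kwh   # ~68 million EUR

print(f"annual energy: {mwh_per_year:,.0f} MWh")
print(f"annual cost: {cost_eur / 1e6:.0f} million EUR")
```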
CRESTA used several co-design applications: Elmfire, GROMACS, HemeLB, IFS, Nek5000 and OpenFOAM. Elmfire is a gyrokinetic code for turbulent fusion plasma, used to simulate plasma behaviour in large-scale fusion reactors; the CRESTA team completely rewrote the code for exascale 3D decomposition and visualisation. GROMACS is a popular molecular dynamics package; the team reworked the code for task parallelism and an ensemble engine. HemeLB is a blood flow modelling code used for clinical planning; here, the team changed the physics for exascale. IFS is used for numerical weather prediction; the team worked on PGAS approaches, a cubic grid, and OmpSs experiments. Nek5000 and OpenFOAM are both open source computational fluid dynamics codes. For Nek5000, the team worked on GPGPU engines and adaptive mesh refinement (AMR), as well as an exascale mesh partitioner. For OpenFOAM, the team concluded that there is no realistic prospect of developing it into an exascale code.
CRESTA produced a lot of scaling results, but that was not the real aim of the project. In fact, CRESTA thought hard about how to enable and coordinate co-design within the project, since this was crucial. Generally, work packages only encourage one-dimensional collaboration, with everyone working towards their own goal; the co-design in CRESTA was two-dimensional and focused on specific, well-defined challenges. One example is power measurement with tools on the Cray XC30: the team used the Score-P plug-in interface to read energy and power counters, and Vampir to visualise performance and counters. Another example, shown by Michele Weiland, is debugging. HemeLB crashed on 49,152 cores and the team received an error message saying only "Terminated". University College London (UCL) and Allinea collaborated on finding the problem using DDT. It turned out to be a win-win situation: Allinea was able to exercise DDT on a real problem, and UCL had the opportunity to fix HemeLB so it could go beyond the petascale. A third example CRESTA worked on was the co-location of tasks. The thermal radiation scheme in IFS is very expensive; the default configuration runs the scheme every forecast hour and on a coarser grid than the forecast model. ECMWF developed the radiation-in-parallel scheme and, in close collaboration with Cray, used hyper-threading to co-locate the model and radiation threads on the same core. The approach was validated on the Cray XC30 and XK7 machines.
In the end, CRESTA has shown how software co-design can work, driven by a general understanding of the scale of parallelism that exascale hardware will deliver. The project has identified many challenges, not just with parallelism but also with I/O performance, tools, libraries, software and systemware. The team improved the tools and this has also benefited petascale. The project has given code owners the space to explore the exascale and to plan how to respond to it. A key success has been to create awareness of the challenges in order for management to properly plan and resource. The team has also shown that some codes like OpenFOAM will never run at the exascale in their present form.
The story of OpenFOAM has been a long one, Michele Weiland explained. The CRESTA team spent a lot of time investigating its partitioning, communication and I/O, and discovered that there are too many problems with the way the code is engineered. It is split into 'source' and 'applications' and uses heavily templated C++ with extreme use of operator overloading. In addition, it has 'per process' I/O. Any new code has to follow the same abstractions, which perpetuates the problem.
A lot of new projects have spun off from CRESTA, including EPiGRAM, Grids in Grids, SimPhoNy, DASH, Introducing Thread and Instruction Parallelism into Ludwig, GROMEX, SkaSim, ELP, Score-E, COLOC, ExaFLOW, BioExcel, ComPat, INTERTWinE, and NEXTGenIO. Many of these projects build on CRESTA activities or address areas that CRESTA did not have the opportunity to look at. The recently announced FETHPC projects will take forward what CRESTA has learned about how to implement software co-design. One of the key challenges CRESTA identified was exascale I/O.
The NEXTGenIO project will look into this in detail. Over the next couple of years, new technologies building on high performance NV-RAM will arrive in the data centre. EPCC is leading this 8 million euro project, in which Fujitsu and Intel will develop a new HPC platform using the latest technology. Software strategies for using NV-RAM in HPC systems will be explored and tested against real-world models of data centre I/O. These hardware developments have the potential to profoundly change how HPC is used in the data centre, towards a more data-centric approach. All of this relies on understanding how best to expose the hardware to the software and vice versa.
The planned exascale technologies represent such a profound change in HPC that developers also need to re-make the science case. There is a need to argue for software and application funding as well as hardware funding, because simply funding hardware projects that stick together consumer technologies will not solve the exascale challenge. What is needed are application drivers and new mathematics. In essence, developers are asking for big scientific challenges that cannot be solved on a hundred 10-Petaflop systems.
CRESTA has moved some codes towards the exascale challenge and there should be more projects like that. It is clear that many instances of disruptive innovation will be needed to model and simulate on exascale systems. The mathematics community has a great responsibility in re-thinking many algorithms. In addition, there is a clear need for more software investment in order to build the type of software that the hardware can use, Michele Weiland concluded.