Primeur magazine: If we try to move a little bit more to the Exascale topic. Last year and the year before, there was a lot of excitement about Exascale outside the community, that it would solve the world's problems. Inside the community there was a lot of collaboration, with joint projects and trying to figure out what to do, what is needed, and what to strive for; for instance, the 20 MW power target seems to have taken on a life of its own. But during the past year, you did not hear so much anymore. I assume that this is because people are now working on achieving the goals instead of discussing and arguing?
Thomas Sterling: You say you do not hear much? I hear an enormous amount from the people directly working on it. This has been the most expansive year for exascale in so many ways. And internationally, Japan has a good planning process taking place.
Satoshi Matsuoka: Well,..
Thomas Sterling: From the rest of the world's point of view. Europe: same thing.
Primeur magazine: Well...
Thomas Sterling: But it is true! In the US, we have several programmes in place, being worked on, whose purpose is exascale. And in China, too, they are working towards the exascale goal. I am heavily involved in three separately funded projects, from different agencies, that are targeting exascale. The most important one started just in September: it is the Department of Energy's X-Stack programme, and our project is XPRESS, led by Sandia National Laboratory and involving about 8 institutions. That is just one of several projects under that programme.
So I believe, and I hope Satoshi agrees with me, we are not building towards exascale programmes, we have exascale programmes. They are not funded enough, at least in the US, but they cross applications, system software, and to a lesser extent system architecture and hardware technology, although those are ramping up. So from my viewpoint, watching others and participating, I believe there is global activity towards exascale in a way we never saw towards petaflop/s.
Satoshi Matsuoka: I agree. What I saw this year was pretty exciting. Exascale is now becoming a reality. It is becoming a formidable target that can be achieved with diligence and various technology advances. Last year a lot of issues sprang up: Is it programmable? What about the energy consumption? Will we get money? What is the use? What is the utility of exascale? And all that. It is not that we solved all those problems this year, but we made significant progress in these exascale projects, to the extent that there are exascale progress reports, and progress seems to be well within reasonable deviation from the projected schedule.
Now, the other excitement that I see, at least in Japan, and probably in Europe and in the US, is that there is public support to invest in this future generation of supercomputers. The committee that I am involved in, which makes recommendations for exascale projects, ratified a new report on May 8th. We thought at that time that it was just a small interim report from a small government committee: it would have some significance, but there would be a big battle to approve it. But somehow it leaked. I will not go into details, but what happened was that it made the front page of the Nikkei newspaper: "Japan - we go exascale". In fact the report itself does not really say that; rather it says, in essence, "well, maybe we ought to do exascale because other people are doing it", but the newspaper headline said: "Japan will do exascale in 2020 and reclaim number one status in the world."
Thomas Sterling: Not just in Japan, also in the US.
Satoshi Matsuoka: And in other newspapers too. It became a big media event.
But then the report did come out and there was broad public support for developing these kinds of technologies. So I believe the excitement is there. Getting public support is, of course, very important although there are a lot of uncertainties. So this has been the year when exascale made a definite step forward internationally in being recognised as a working programme heading towards a realistic goal.
Primeur magazine: And of course in Europe you also had Commissioner Kroes, who said that HPC and supercomputing are important, which means it really is important to them; otherwise they do not make statements like that.
Thomas Sterling: In the US, exascale is on the agenda at the highest level of politics: the President's office, and in dealings with Congress. Not all of them necessarily support it, but it is visible to the top people in the OMB, and also in the responses of the Office of Science and Technology Policy - OSTP. So the Department of Energy and the other agencies are all working all the way up the hierarchy. Exascale is upfront, although there are different views.
The second thing being anticipated is that there are going to be two waves in the exascale era. It is anticipated that China and Japan, and possibly Europe, will build an exaflop/s machine first. They will go after this because there is stature in it, and they will push what we have been doing since the petaflop/s era into its final form. The second wave will be a real exascale machine, which does not operate like the first exaflop/s machines: people will recognise that there are different dynamic modalities that have to be supported. One simple example: today's processor cores, whatever they look like - and I expect the first exaflop/s machines will be the same - treat the majority of the other cores in the system as I/O devices. What eventually will happen is that they will treat the rest of the system as part of a single name space. This will be a vast improvement. It will improve efficiencies, but it will also bring new challenges. And it will bring back into high-end computing many applications that are falling off the Moore's law curve, applications for which the curve has become irrelevant.
So we are going to have these two waves. I personally would like to see this adolescent fixation on LINPACK flop/s get out of the worldwide mindset, and to have us think, in the end, about things that can be done only with good exascale machines. Our leaders in the US - I cannot speak for the other countries - are now recognising this. And it sounds almost as if the US is too good for that, which is not true. The US, and especially the NNSA, which is also responsible for keeping us where we can do the best science, no matter how difficult, will probably produce such a machine as well, in the 2019 - 2020 time frame. So the guess - some people disagree - is that the second generation of exascale machines will arrive somewhere around 2023, plus or minus a year.
Primeur magazine: Another topic discussed at ISC'13 was the resilience of these big machines. Because of their size, they will fail every hour, hour after hour, day after day. It seems that you, Satoshi, suggested: "yes, but it is just normal engineering practice to handle that".
Satoshi Matsuoka: It is both engineering and science. This year we had a keynote by Bill Dally at ISC'13 here in Leipzig, who stated (for exascale): these are the projected performance gains; these are the projected energy budgets; and so on. Since you cannot go over your energy budget, you have to design the system to be much smarter, in that we need to work not just on the semiconductor process, but also on circuits and design, as well as on architecture, to regain the speedup momentum we have experienced in the past. It is the same for resilience: there are engineering methodologies we can tap into to gain better resilience.
We can design machines that have fewer components, fewer moving parts, and there are various ways and innovations for doing that. There are also various software layers to make the system much more autonomous in failure detection and correction. The system size will nevertheless grow, so we will also need scientific research: smarter algorithms, better system architecture, and a much better theory of the resilience of these machines, such that we can make additional improvements. With all these combined, the engineering and the science, I do not see tremendous problems for exascale resilience. Extrapolating from the architectures we have today, I do not think the machine will be 100 times bigger. Machines might be several times bigger, perhaps 10 times bigger, but not orders of magnitude bigger. In that way, by using both engineering and science, we can deal with resilience for systems of that size.
Primeur magazine: And from the application point of view? Dally said parallel programming is easy.
Thomas Sterling: I commented on that: parallel programming is easy, good parallel programming is hard. Especially when you go beyond signal processing. If you try some tough adaptive mesh refinement or real - not academic - particle codes, achieving optimal performance is very challenging. Parallel programming in conventional practice is hard, because the programmer takes on almost all the burden of determining almost all of the parallelism, and all the placement and scheduling of it. There are small improvements taking place, but the classic rule of thumb is that it takes 10% of the time to get the right answers in parallel, and then 90% of the time to get to the performance that you think is near what the algorithm can do. Let me take a slightly different view, noting that this is a very controversial subject. The problem has to do with recovery.
Conventional checkpoint restart works fine when you do the arithmetic on how much overhead is involved, how much time is lost, and what the availability is. As long as the numbers come out right, this is fine. But all the trends suggest that, because of the growing memory capacity and the changes in bandwidth in and out, this becomes less and less tractable. I agree completely that the physical size of the systems is not likely to grow substantially, say no more than a factor of four, and the reason is the cost of space and power. But what will improve, and I think you will agree with me, Satoshi, is the density of the systems. Whatever we think of as one part today will contain more and more parts. For instance, stacked dies that will go up to - and I hope I do not sound naive 10 years from now - 16 dies in just one stack. Then we have a lot more parts in what used to be a planar representation: there is this extra half dimension that comes in.
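The checkpoint/restart arithmetic Sterling alludes to can be sketched with Young's classic first-order model: if writing a checkpoint costs C seconds and the system's mean time between failures is M seconds, the overhead-minimizing checkpoint interval is roughly sqrt(2CM). This is an illustration only; the specific numbers below are assumptions, not figures from the interview.

```python
import math

def optimal_checkpoint_interval(checkpoint_cost_s: float, mtbf_s: float) -> float:
    """Young's first-order approximation of the checkpoint interval
    that minimizes total lost time: tau = sqrt(2 * C * MTBF)."""
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)

def overhead_fraction(interval_s: float, checkpoint_cost_s: float, mtbf_s: float) -> float:
    """Approximate fraction of wall-clock time lost: checkpoint I/O
    (C / tau) plus expected recomputation after a failure (tau / 2M)."""
    return checkpoint_cost_s / interval_s + interval_s / (2.0 * mtbf_s)

# Assumed, illustrative numbers: a checkpoint takes 10 minutes to write,
# and the whole system fails on average once per 24 hours.
C = 10 * 60.0
M = 24 * 3600.0
tau = optimal_checkpoint_interval(C, M)
print(f"optimal interval: {tau / 3600:.2f} h, overhead: {overhead_fraction(tau, C, M):.1%}")
# → optimal interval: 2.83 h, overhead: 11.8%
```

The point of the trend Sterling describes falls out of the formula: as memory capacity grows faster than I/O bandwidth, C grows while M shrinks, so the minimum achievable overhead, sqrt(2C/M), climbs towards making global checkpoint/restart untenable.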
The parallelism, the number of actions going on, will be far greater, and so there will be more opportunities for single-point failures when there is no adequate recovery. We do not detect faults; we detect errors, and then we have to diagnose the fault: whether it is a hardware fault, in which case the system has to reconfigure, or a soft fault, in which case we simply have to correct for it. But you do not want to stop the entire billion cores; trying to bring something like that to a halt is almost impossible.
So my conclusion is that the execution model has to absorb the notion that the computation will continue in the presence of faults and in the presence of errors, and will be able to self-correct and go forward. This adds the uncertainty of asynchrony, to which the computation will have to adapt dynamically. Mechanisms at all levels, including application, runtime, operating system, and architecture, will have to work together in concert to provide a fault-tolerant model within the total execution model, as well as in how they are manifest in the implementation of all of those parts. Therefore, I believe that exascale systems will have to take an active, dynamic approach to fault tolerance, through adaptive fault management.
The interview is published in four parts: