Satoshi Matsuoka:For next year there will be a whole new breed of architectures coming. The TOP500 stagnated this year, but because of the new architectures that are arriving, we may actually start seeing some new machines on the list, showing progress, because again there are these technological bumps that will certainly surface. That will be very visible in these metrics, for example. In the long run, as Thomas mentioned, Moore's Law will be ending and there is no alternative. People are working on various solutions, but undoubtedly, we cannot continue on this technology path. There will be occasional bumps here and there, when we devise new technologies, but oftentimes these are one-time bumps. The last time, a one-time bump led to many-core.
Thomas Sterling:In 2004-2005. That was ten years ago.
Satoshi Matsuoka:Actually there were two bumps. One leading to multi-core and one leading to real many-core like GPUs, but again these were bumps. 3D packaging will be another bump, and maybe non-volatile memory. Again, there will be concerns. Of course, we cannot continue on this trajectory. My concern is exascale. There are efforts to make the machines efficient, like I mentioned initially, and of course, software to exploit the trajectory or the perceived trajectory of where the machines are going. The machines might be different, but there is still a certain trajectory, and the assumption is that this will be the right path for the future. In particular there has been ongoing discussion of when Moore's Law ends. We may not be able to compute at an increasing rate. That is, we cannot make faster chips, because the extreme case of Moore's Law ending is that the transistor power will become constant. The reason why we have been able to arrive at Moore's Law is that the power continuously shrank in an exponential fashion. That is how the machine performance increased. If the transistor power becomes constant, no matter what we do, we cannot build something faster by shrinking the transistor. We have to think in a completely different way: new architectures, new technology, different types of governing parameters than transistor counts. When that happens, I have some serious doubts whether the trajectory will be progressing in terms of, for example, increased computational intensity. Increasing computational intensity is based on the assumption that we will keep on increasing the computational capabilities. If that becomes constant, while the other parameters may still grow, that may be the wrong approach. Just as when we made the transition from vectors to cache-oriented processors, there may be some game changers looming along the way.
Some of these changes may be completely revolutionary, and unavoidable, so the investments we are making now may not be the right investments. As researchers, we now really need to think about what happens when Moore's Law ends. What would be the right thing to do at that time, rather than saying, we are going to exascale machines? That is as if we know what the exascale machines will look like. We really need to think about the next phase, for zettascale. Up to exascale we can ride on Moore's Law, but a short time thereafter, Moore's Law ends.
Thomas Sterling:I agree with everything. I do not have to add anything. So let's tweak a little bit, because there is nothing with which I take exception. That was a pretty comprehensive representation.
Looking at a year from now, a year to a year-and-a-half, we can be clear about a few things. It is clear that we will be entering, but not have entered yet, the 100 Petaflops era. We will be preparing for it. There will be test platforms. People will be developing codes for the 100 to 300 Petaflops machines planned to be deployed. We can expect that there will be a change with the Xeon Phi, which I think is a canonical lightweight architecture put into the memory hierarchy. The term self-hosting is one way to describe the expected change as it becomes the processing element. Regular Xeon will only exist for continuity of code. I think this is going to be significant, where we have many more cores suddenly in the memory address space. There are three questions. Current research in Japan and in the US, and to a lesser extent in Europe, is involved in them. The Japanese and the Americans, either together or separately, are working on these. These are: What is the operating system going to be for the biggest machines? Second: Are we going to use runtime systems as the significant management of application parallel workloads? Third: What are - and I say "are" in plural - the programming interfaces going to be like? There is going to be important work over this coming year that will provide useful information in helping us to understand these issues better. I do not think we will know the answer when we are sitting here next year. But we will know a lot more, and we are going to have a much broader foundational experience base there.
I think you are going to continue to see the trend in the near term with Cray taking a somewhat increased part of the total performance, HP increasing somewhat in percentage of the deployed rate, and the long tail of IBM systems will have waned. So it will be an HP and Cray dominated arena. I think the Cray XC30 and XC40 are fairly nicely balanced machines.
Long term, again I think, Satoshi Matsuoka pretty closely nailed it. In particular, let me reinforce his point, which is, when Moore's Law is truly at sunset and the gap with the asymptote vanishes, when it has finally gone to its resting place, then it is architecture which is our only hope. Where we try very hard to pretend we are not going to change the architecture, I think there will be a resurgence. But before that, in preparation, we are going to find a much closer relationship, which we are already seeing, between logic and memory. We will see PIM-like work going forward by Micron and other component vendors.
Satoshi Matsuoka:They will. The question is whether it will be an American company or a Chinese company.
Thomas Sterling:Right now, Micron's innovative work in the area of more tightly merging logic and memory is very important. Intel also has other approaches to reducing these latencies and the power for these accesses. These will already start happening. My final statement would be that we need a better understanding of the relationship between application demands and architecture. That would be co-design in the large. Not just co-design in the small.
Satoshi Matsuoka:But there is no choice. That is the only way. Well, there are several ways, architecturally, to increase performance. One is, as Thomas said, to use memory as the source of speed-up. There will be continuous capacity and bandwidth increase as technology evolves, even beyond Moore's Law, because of the new devices. That is one thing. But then the algorithms have to change to utilize these resources. And then, of course, architecturally, you have to be much, much more specialized, that is, specialized in data, or specialized in workflow.
Thomas Sterling:Also specialized to the control overheads. If I were to wave my magic wand in architecture, it would be, among other things, focused on those overheads needed to manage all aspects of global computation, not merely within the core or within the node.
Satoshi Matsuoka:It is very likely that the cores of these processors will be much more an aggregation of specialized processors, much as with SoCs today, like the video codec on cell phones: hardware specialized for video codecs. Or you have some flexible hardware programming, and this is where FPGAs come in. Again, this will bring about more complexity, because there is so much architectural diversity depending on the data, the workflow, and so on.
Thomas Sterling:But it will be spawned out of necessity. It will be almost a renaissance, which we sorely need, but we always try to find the lowest energy path and think away from significant architecture changes in that path.
Satoshi Matsuoka:There will be no choice. In some sense, as Thomas puts it, a renaissance: more memory-centric computing, a renaissance of architectures, sure innovations, a renaissance in terms of programming models to drive these heterogeneous architectures, and also very different types of optimization points, including trade-offs between the parallel algorithms. In ten years, when Moore's Law starts descending, we expect to see a tremendous resurgence, with HPC already being a leader in terms of innovation and IT. We see this field actually flourishing with a wealth of new ideas, because we cannot have a free ride anymore.
Thomas Sterling:Allow me to make my final comment, and that is looking forward, and driven by Satoshi Matsuoka's notion of necessity. We may not in fact be talking about processors or cores ten years from now, because however many we have and however diverse they are, it still comes down to a decision to optimize the ALUs of these processor cores. We are smarter now, and we recognize the importance of memory. Processor cores are, as I said in my talk yesterday, of least importance or relatively small importance. Control state, on the other hand, and the instruction issue rates are very, very important. We may have a different balance of what we used to think was all inside the core: what we called the micro-architecture may itself come apart, decompose and find a different balance.
Satoshi Matsuoka:So bright a future, so we have to stay healthy.
Thomas Sterling:We have work to do, so we have to stay healthy.
Primeur magazine:Thanks very much for this interview.
This article is part of a longer interview. The complete interview is divided into 6 articles: