Primeur weekly 2016-08-22

Special

ExaCT team shows how Legion S3D code is a tribute to co-design on the way to exascale supercomputing ...

Focus

Sunway TaihuLight's strengths and weaknesses highlighted by Jack Dongarra ...

Exascale supercomputing

Big PanDA tackles Big Data for physics and other future extreme scale scientific applications ...

Computer programming made easier ...

Quantum computing

Cryptographers from the Netherlands win 2016 Internet Defense Prize ...

Focus on Europe

STFC Daresbury Laboratory to host 2016 Hands-on Tutorial on CFD using open-source software Code_Saturne ...

Middleware

Germany joins ELIXIR ...

Columbus Collaboratory announces CognizeR, an Open Source R extension that accelerates data scientists' access to IBM Watson ...

Cycle Computing optimizes NASA tree count and climate impact research ...

GPU-accelerated computing made better with NVIDIA DCGM and PBS Professional ...

Hardware

Mellanox demonstrates accelerated NVMe over Fabrics at Intel Developers Forum ...

Nor-Tech has developed the first affordable supercomputers designed to be used in an office, rather than a data centre ...

NVIDIA CEO delivers world's first AI supercomputer in a box to OpenAI ...

AMD demonstrates breakthrough performance of next-generation Zen processor core ...

CAST and PLDA Group demonstrate x86-compliant high compression ratio GZIP acceleration on FPGA, accessible to non-FPGA experts using the QuickPlay software defined FPGA development tool ...

IBM Research - Almaden celebrates 30 years of innovation in Silicon Valley ...

Wiring reconfiguration saves millions for Trinity supercomputer ...

Cavium completes acquisition of QLogic ...

Applications

Soybean science blooms with supercomputers ...

NOAA launches America's first national water forecast model ...

Computers trounce pathologists in predicting lung cancer type, severity, researchers find ...

Star and planetary scientists get millions of hours on EU supercomputers ...

Bill Gropp named acting director of NCSA ...

Latest NERSC/Intel/Cray dungeon session yields impressive code speed-ups ...

User-friendly language for programming efficient simulations ...

New book presents how deep learning neural networks are designed ...

Liquid light switch could enable more powerful electronics ...

Energy Department to invest $16 million in computer design of materials ...

Pitt engineers receive grant to develop fast computational modelling for 3D printing ...

Environmental datasets help researchers double the number of microbial phyla known to be infected by viruses ...

Teaching machines to direct traffic through deep reinforcement learning ...

Simulations by PPPL physicists suggest that magnetic fields can calm plasma instabilities ...

New material discovery allows study of elusive Weyl fermion ...

New maths to predict dangerous hospital epidemics ...

Kx financial analytics technology tackles Big Data crop research at biotech leader Earlham Institute ...

The Cloud

New hacking technique imperceptibly changes memory of virtual servers ...

Sunway TaihuLight's strengths and weaknesses highlighted by Jack Dongarra


20 Jun 2016 Frankfurt - At the ISC 2016 Conference in Frankfurt, Germany, Primeur Magazine had the opportunity to talk with Jack Dongarra, Professor at the University of Tennessee in Knoxville, who is one of the authors of the TOP500 list. Jack Dongarra also has a position at Oak Ridge National Laboratory, which is very close to the university, and at the University of Manchester. He is a visiting scientist at Moscow State University and at Texas A&M University. So, he wears a number of different hats. Jack Dongarra has been involved in the TOP500 since it began, about 24 years ago. Hans Meuer had a list of the most powerful machines, ranked by peak performance. Jack Dongarra had a benchmark, Linpack, which ranked machines by their measured performance on solving a dense matrix problem. He studied the new TOP500 number one system, the Sunway TaihuLight, in depth.

Twenty-four years ago, it was challenging to really test the machines and show their performance, not only on that benchmark but also matched against real applications. Over the years, the Linpack benchmark has perhaps become less relevant than it was back then, but it is still used. There is a lot of historic data, and the TOP500 is something Jack Dongarra believes will be around for a long time. There should also be other benchmarks that measure other things, but Jack Dongarra did not want to go into that discussion.

At the start of the ISC 2016 Conference, there was a very interesting announcement. The Sunway TaihuLight supercomputer, built in China, was christened as the new number one in the TOP500. Sunway is a unique computer in that it is based on Chinese parts. There is a Chinese processor. The Chinese fabricated the interconnect and put the machine together at the research facility in Wuxi, where the chips were also fabricated. The machine is based on a many-core processor with 260 very lightweight cores. They run at a clock frequency of 1.45 GHz, and each core has a peak performance of about 11 Gigaflop/s, giving the chip a peak of about 3 Teraflop/s. The chips are packed into cabinets, each with a peak performance of about 3 Petaflop/s, and the Chinese put together 40 of these cabinets for a theoretical peak performance of roughly 125 Petaflop/s. They ran the Linpack benchmark using all the cores in the machine, ten million cores in aggregate, and came in at 93 Petaflop/s. That is roughly 74 percent of the theoretical peak performance, a pretty impressive number, Jack Dongarra stated.
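
As a rough cross-check, the sketch below redoes this arithmetic with the rounded numbers quoted in the interview; the 8 flops per cycle per core is an assumption chosen to reproduce the "about 11 Gigaflop/s" per core, and the exact official TaihuLight figures differ slightly.

```python
# Back-of-the-envelope check of the quoted Sunway TaihuLight figures.
# Assumption: 8 double-precision flops per cycle per core (chosen so the
# per-core number matches the "about 11 Gigaflop/s" quoted above).

clock_ghz = 1.45                 # per-core clock frequency
flops_per_cycle = 8              # assumed flops per cycle per core
cores_per_chip = 260

gflops_per_core = clock_ghz * flops_per_cycle               # ~11.6 Gflop/s
tflops_per_chip = gflops_per_core * cores_per_chip / 1e3    # ~3.0 Tflop/s

peak_pflops = 125.0              # theoretical peak of the 40-cabinet system
hpl_pflops = 93.0                # measured Linpack (HPL) result

print(f"per core : {gflops_per_core:.1f} Gflop/s")
print(f"per chip : {tflops_per_chip:.2f} Tflop/s")
print(f"HPL efficiency: {hpl_pflops / peak_pflops:.0%}")    # ~74 %
```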

The interesting thing is that the power consumed while running the benchmark came in at roughly 15 Megawatts. That is brilliant for that benchmark and translates into about 6 Gigaflop/s per watt, a pretty high number and a very impressive efficiency, Jack Dongarra pointed out. Most of the machines in the TOP10 come in at about 2 Gigaflop/s per watt. So this machine is three times more efficient in terms of power for that benchmark, and it is also about three times more powerful than the previous number one machine, actually 2.7 times more powerful. That is quite a combination.
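
The flop/s-per-watt figure follows directly from the two numbers quoted; a minimal sketch, using the rounded values from the interview rather than the official Green500 submission:

```python
# Gflop/s per watt for the HPL run, using the rounded figures quoted above.
hpl_gflops = 93e6        # 93 Pflop/s expressed in Gflop/s
power_watts = 15e6       # roughly 15 MW drawn during the run
print(f"{hpl_gflops / power_watts:.1f} Gflop/s per watt")   # ~6.2
```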

The machine also has some other interesting characteristics. It has been used for real applications, which were written up and submitted to the Supercomputing Conference as potential Gordon Bell contenders. The Chinese submitted five papers, and three of them were chosen as Gordon Bell finalists. To put this in perspective: only six papers are chosen as Gordon Bell finalists at the SC event, so half of the papers in the running for the prize come from this machine. This shows that it is a very powerful machine with a very high impact. It is not just a stunt machine, but a machine that can be used for real applications. Jack Dongarra does not know how much work went into developing those applications, but they represent non-trivial implementations, whereas the benchmark is perhaps considered a trivial implementation.

It is an impressive ecosystem. Still, the machine has some deficiencies, and they come about when one starts to move large amounts of data around. It is a good architecture for very dense matrix computations, but when one starts to move data through the memory hierarchy, one starts to see the weaknesses of the machine. The machine uses slow DDR3 memory, and its network provides a rather poor interconnect overall. The other benchmark that Jack Dongarra maintains is called HPCG, a benchmark that does a lot of data movement. It also solves a system of equations, but this time using an iterative method that operates on a sparse matrix. On this benchmark the machine shows an efficiency of 0.3 percent of peak performance. That is a very low number compared to some of the other machines, Jack Dongarra explained: the other machines come in between 1 and 2 percent, with the highest, the K computer, coming in at a little over 4.5 percent of its theoretical peak.
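
To make the gap concrete, the sketch below converts the quoted percentages into absolute rates; it uses only the rounded numbers from the interview, not the official HPCG submission.

```python
# Rough comparison of the HPL and HPCG rates implied by the figures above.
peak_pflops = 125.0
hpl_pflops = 93.0                       # dense solve, ~74 % of peak
hpcg_pflops = 0.003 * peak_pflops       # sparse iterative solve, ~0.38 Pflop/s

print(f"HPL : {hpl_pflops:.1f} Pflop/s")
print(f"HPCG: {hpcg_pflops:.2f} Pflop/s "
      f"(about {hpl_pflops / hpcg_pflops:.0f}x slower than HPL)")
```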

So, this machine has the potential to do very well on certain kinds of problems, using all the cores in the system very efficiently. For other problems, however, for instance those related to solving three-dimensional partial differential equations, it is going to be very hard to extract that performance from this architecture.

Primeur Magazine wanted to know what answer Europe, the USA and Japan will come up with in the coming years in the TOP500 to counter the Chinese ambitions.

Jack Dongarra said that in the USA three big machines are planned to come online in the 2018 time frame. They will be phased in: one at the beginning, one in the middle, and one towards the end of that period. These machines will roughly equal or exceed the peak performance of this machine, so that is about a year and a half away. Measured in terms of machines with over 100 Petaflop/s of performance, the Chinese certainly have a lead, perhaps 18 months at this point. For Europe, Jack Dongarra is not exactly sure what the situation is in terms of big machines, so he could not comment on that. The Chinese machine has been in operation for a little while, and there is some speculation that they are working on a follow-up machine, which could take them close to half an Exaflop in terms of performance, if not more. To know when a new number one will arrive, one will need to ask the Chinese.

This is one of three projects in China, the Wuxi project. There is a project going on at NUDT, the National University of Defense Technology in Changsha, to upgrade their existing machine, the Tianhe-2. They are planning to replace all the Intel parts with parts of a different design, which would take that machine over 100 Petaflop/s. There is also a rumour about a machine at the Chinese Academy of Sciences, which could again come in with their own processor to be in this race.

Ad Emmen
