The entry level to the list moved up to 1.32 petaflops on the High Performance Linpack (HPL) benchmark, a small increase from the 1.23 petaflops recorded in the June 2020 rankings. In a similar vein, the aggregate performance of all 500 systems grew from 2.22 exaflops in June to just 2.43 exaflops on the latest list. Likewise, average concurrency per system barely increased at all, growing from 145,363 cores six months ago to 145,465 cores in the current list.
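To put those increases in relative terms, the growth rates can be worked out directly from the figures quoted above. The following is a minimal sketch; the helper name `percent_growth` and the dictionary labels are illustrative choices, not part of any TOP500 tooling.

```python
def percent_growth(previous, current):
    """Relative change from the previous list to the current one, in percent."""
    return (current - previous) / previous * 100.0


# (previous list, current list) pairs taken from the paragraph above.
figures = {
    "entry level (petaflops)": (1.23, 1.32),
    "aggregate performance (exaflops)": (2.22, 2.43),
    "average cores per system": (145_363, 145_465),
}

for label, (previous, current) in figures.items():
    print(f"{label}: {percent_growth(previous, current):.2f}%")
```

By this calculation, the entry level and the aggregate performance each grew by somewhat less than 10 percent, while average concurrency grew by well under a tenth of a percent.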
There were, however, a few notable developments in the top 10, including two new systems, as well as a new high-water mark set by the top-ranked Fugaku supercomputer. Thanks to additional hardware, Fugaku grew its HPL performance to 442 petaflops, a modest increase from the 416 petaflops the system achieved when it debuted in June 2020. More significantly, Fugaku increased its performance on the new mixed-precision HPL-AI benchmark to 2.0 exaflops, besting the 1.4 exaflops mark it recorded six months ago. These represent the first benchmark measurements above one exaflop for any precision on any type of hardware.
Here is a brief rundown of current top 10 systems:
Other TOP500 highlights include the following:
A total of 149 systems on the list are using accelerator/co-processor technology, up from 146 six months ago. 136 of these use NVIDIA chips.
Intel continues to dominate in TOP500 processor share with over 90 percent of systems equipped with Xeon or Xeon Phi chips. Despite the recent rise of alternative processor architectures in high performance computing, AMD processors - including the Hygon chip - represent only 21 systems on the current list, along with ten Power-based systems and just five Arm-based systems. However, the number of systems with AMD-based processors doubled from what it was six months ago.
The breakdown in system interconnects is largely unchanged from recent lists, with Ethernet used in about half the systems (254), InfiniBand in about a third of systems (182), OmniPath in about one-tenth of systems (47), and Myrinet in one system; the remainder use custom interconnects (38) and proprietary networks (6). InfiniBand-connected systems continue to dominate in aggregate capacity with more than an exaflop of performance. Since Fugaku uses the proprietary Tofu D interconnect, the aggregate performance of the six proprietary-network systems (472.9 petaflops) is nearly equal to that of the 254 Ethernet-based systems (477.7 petaflops).
China continues to lead in system share with 212 machines on the list, handily beating out the US with 113 systems and Japan with 34. However, despite the smaller number of systems, the US continues to lead the list in aggregate performance with 668.7 petaflops to China's 564.0 petaflops. Thanks mainly to the number one Fugaku system, Japan's aggregate performance of 593.7 petaflops edges out that of China.
The Green500 results include the following:
The most energy-efficient system on the Green500 is the new NVIDIA DGX SuperPOD in the US. It achieved a power efficiency of 26.2 gigaflops/watt during its 2.4 petaflops HPL performance run and is listed at position 172 in the TOP500.
Next on the list is the previous Green500 champ, MN-3. Although it improved its score from 21.1 to 26.0 gigaflops/watt, it slips into the number two position. The system uses the MN-Core chip, an accelerator optimized for matrix arithmetic. It is ranked number 332 in the TOP500.
In the number three spot on the Green500 is the Atos-built JUWELS Booster Module installed at Forschungszentrum Jülich (FZJ) in Germany. It achieves 25.0 gigaflops/watt and is ranked seventh in the TOP500.
In fourth position is Spartan-2, another Atos-built machine. It achieves 24.3 gigaflops/watt on HPL and is ranked at position 148 on the TOP500 list.
The fifth-ranked system on the Green500 is Selene, with an efficiency of 24.0 gigaflops/watt. It also occupies the number five spot on the TOP500.
With the exception of MN-3, all of the top five Green500 systems use the new NVIDIA A100 GPU as an accelerator. All four of these systems use AMD EPYC as their main CPU.
Of the top 40 systems on the Green500, 37 leverage accelerators, two use A64FX vector processors, and one (TaihuLight) uses a Sunway many-core processor.
Extrapolating the NVIDIA DGX SuperPOD's power efficiency of 26.2 gigaflops/watt linearly to an exaflop would result in a power consumption of about 38 MW, ignoring the additional hardware needed for scaling.
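The linear extrapolation in the last item is simple enough to reproduce. The sketch below assumes perfect linear scaling, exactly as the text does, and the function name `power_at_target_mw` is a hypothetical helper introduced for illustration.

```python
def power_at_target_mw(efficiency_gflops_per_watt, target_flops=1e18):
    """Power in megawatts needed to sustain target_flops at a fixed efficiency.

    Assumes efficiency stays constant as the system grows, which ignores
    the extra interconnect, storage, and cooling a larger machine needs.
    """
    flops_per_watt = efficiency_gflops_per_watt * 1e9
    watts = target_flops / flops_per_watt
    return watts / 1e6


# 26.2 gigaflops/watt is the DGX SuperPOD figure quoted above;
# 1e18 flops is one exaflop.
print(f"{power_at_target_mw(26.2):.1f} MW")
```

The same function applied to the other Green500 efficiencies quoted above gives correspondingly higher power estimates, since a lower gigaflops/watt value means more watts per exaflop.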
The HPCG results include the following:
The TOP500 list has incorporated the High-Performance Conjugate Gradient (HPCG) benchmark results, which provide an alternative metric for assessing supercomputer performance and are meant to complement the HPL measurement.
The list-leading Fugaku expanded its HPCG result with a record 16.0 HPCG-petaflops. The two US Department of Energy systems, Summit at ORNL and Sierra at LLNL, are second and third, respectively, on the HPCG benchmark. Summit achieved 2.93 HPCG-petaflops and Sierra 1.80 HPCG-petaflops. The only other systems to break the petaflops barrier on HPCG are the upgraded Selene system at 1.62 petaflops and the new JUWELS Booster Module at 1.28 petaflops.
The HPL-AI results include the following:
The HPL-AI benchmark seeks to highlight the convergence of HPC and artificial intelligence (AI) workloads based on machine learning and deep learning by solving a system of linear equations using novel, mixed-precision algorithms that exploit modern hardware.
The top-ranked system for this benchmark is RIKEN's Fugaku system, which achieved 2.0 exaflops of mixed-precision computation. At number two is ORNL's Summit supercomputer, which achieved 0.55 exaflops, followed by NVIDIA's Selene, which turned in an HPL-AI result of 0.25 exaflops.
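For readers unfamiliar with the idea behind mixed-precision linear solvers, the sketch below shows classical iterative refinement in NumPy: the expensive solve runs in single precision, and the residual is corrected in double precision. This is only an illustration of the general technique, not the HPL-AI reference implementation, whose details are not described here. The function name `solve_mixed_precision` is hypothetical, and for brevity the sketch re-solves the system on each pass rather than reusing a stored factorization.

```python
import numpy as np


def solve_mixed_precision(a, b, iterations=5):
    """Solve a @ x = b with float32 solves and float64 residual correction."""
    a64 = np.asarray(a, dtype=np.float64)
    b64 = np.asarray(b, dtype=np.float64)
    a32 = a64.astype(np.float32)  # low-precision copy used for the solves

    # Initial low-precision solution, promoted to float64 for refinement.
    x = np.linalg.solve(a32, b64.astype(np.float32)).astype(np.float64)

    for _ in range(iterations):
        residual = b64 - a64 @ x  # computed in float64
        correction = np.linalg.solve(a32, residual.astype(np.float32))
        x = x + correction.astype(np.float64)
    return x


# Small, well-conditioned test system with a known solution of all ones.
rng = np.random.default_rng(0)
a = rng.standard_normal((50, 50)) + 50.0 * np.eye(50)
x_true = np.ones(50)
b = a @ x_true

x_single = np.linalg.solve(a.astype(np.float32), b.astype(np.float32))
x_refined = solve_mixed_precision(a, b)
print("float32-only max error:", np.max(np.abs(x_single - x_true)))
print("refined max error:     ", np.max(np.abs(x_refined - x_true)))
```

On a well-conditioned system like this one, the refined solution should recover accuracy close to that of a full double-precision solve, even though every solve ran in single precision.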