Back to Table of contents

Primeur weekly 2017-07-24

Focus

20th Birthday celebrating OpenMP still welcoming new members ...

Fujitsu's processor roadmap is hitting new targets for Deep Learning ...

Focus on Europe

New FPGA programming method delivers five times more computing power ...

Gazprom Neft to utilise capacity at the St Petersburg Polytechnic University supercomputer ...

Tenfold connectivity increase between Ukraine and European research and education network ...

Newly improved Brain Simulation Platform now online ...

Joeri van Leeuwen received a grant and in-kind expertise from the eScience Center for his astronomy project AA-ALERT ...

Middleware

Bright Computing and Brazil-based AMT sign partnership agreement ...

Hardware

7th International Women in HPC workshop ...

IBM scientists observe elusive gravitational effect in solid-state physics ...

New Supermicro Rack Scale Design (RSD) supports high-density, high-performance pooled NVMe storage ...

Inspur announced the new M5 series servers to get businesses ready for the new era of intelligent computing ...

Women, leadership and Flash - Panel and networking event ...

Applications

3D models help scientists gauge flood impact ...

Pulses of electrons manipulate nanomagnets and store information ...

Flashes of light on the dark matter ...

Titan simulations show importance of close 2-way coupling between human and Earth systems ...

A firefly's flash inspires new nanolaser light ...

Simulation reveals universal signature of chaos in ultracold reactions ...

Massive simulation shows HIV capsid interacting with its environment ...

ANSYS, Saudi Aramco and KAUST shatter supercomputing record ...

Scientists use "Piz Daint" simulations to track heavy summer precipitation from the Mediterranean ...

Fernanda Foertter elected SIG HPC Education Vice Chair ...

The Cloud

IBM expands global Cloud data centre presence with four new facilities ...

The Cloud comes to you: AT&T to power self-driving cars, AR/VR and other future 5G applications through edge computing ...

Teradata acquires San Diego-based start-up StackIQ to strengthen Teradata Everywhere and IntelliCloud capabilities ...

IBM mainframe ushers in new era of data protection ...

Solarflare lets server racks match connectivity of a human neuron ...

New high speed interface connects IBM Z to IBM storage systems ...

Oracle significantly expands Cloud at Customer with PaaS and SaaS services to help customers in their journey to the Cloud ...

Fujitsu's processor roadmap is hitting new targets for Deep Learning


20 Jul 2017 Frankfurt - In the session on processor technologies for HPC and Artificial Intelligence at ISC'17 in Frankfurt, Germany, chaired by Satoshi Matsuoka, Takumi Maruyama from Fujitsu told the audience more about the K computer, Fujitsu's latest processors for HPC and UNIX, and the future Fujitsu processors under development for the Post-K computer and for Artificial Intelligence.

The K computer has a performance of 10.51 PFlops, delivered by an 8-core high-performance processor. The machine has liquid cooling, a Torus network, and high-density racks, Takumi Maruyama explained.

Fujitsu has been developing processors continuously for over 60 years. The company has built processors for mainframe, UNIX, HPC and Artificial Intelligence.

Takumi Maruyama expanded on the SPARC64 XIfx chip for HPC. This chip has 32 computing cores and 2 assistant cores. It implements HPC-ACE2, Fujitsu's ISA enhancements for HPC. It also has a sector cache, which means that the cache can be controlled by software.

The SPARC64 XII chip has been developed for UNIX, with 12 cores x 8 threads and Software on Chip. It has a 32 MB L3 cache and embedded MAC and IOC, and is built in 20nm CMOS.

Takumi Maruyama also described Japan's Post-K computer development project. RIKEN and Fujitsu are currently developing the Post-K computer, which aims to be the most advanced general-purpose supercomputer in the world. The goals of the project are to provide application performance, low power consumption, user convenience, and the ability to produce ground-breaking results.

The Fujitsu processor being developed for the Post-K computer adopts the ARM ISA and an enhanced Tofu interconnect. This processor inherits and enhances the K computer's innovative features.

The Post-K processor supports FP16, according to Takumi Maruyama. This offers optimized precision for a wide range of applications with superior performance, while reducing the required bandwidth and power consumption. The target applications include existing numerical applications as well as brand-new applications such as Deep Learning.
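The bandwidth argument for FP16 can be illustrated with a minimal sketch. It uses only the Python standard library's `struct` module, whose `e` and `f` format codes correspond to IEEE 754 half and single precision. The helper names and the sample values are hypothetical and do not come from the talk; the sketch says nothing about how the Post-K processor itself implements FP16.

```python
import struct

def packed_size(values, fmt):
    """Return the number of bytes needed to store the values in a struct format."""
    return len(struct.pack(f"<{len(values)}{fmt}", *values))

def round_trip(value, fmt):
    """Store one value in the given format and read it back as a Python float."""
    return struct.unpack(f"<{fmt}", struct.pack(f"<{fmt}", value))[0]

# Hypothetical sample values standing in for network weights.
weights = [0.1, -0.25, 0.5, 1.0 / 3.0]

bytes_fp32 = packed_size(weights, "f")  # IEEE 754 single precision, 4 bytes each
bytes_fp16 = packed_size(weights, "e")  # IEEE 754 half precision, 2 bytes each

print(f"FP32: {bytes_fp32} bytes, FP16: {bytes_fp16} bytes")
for w in weights:
    # The difference between w and its FP16 round trip is the precision given up.
    print(w, round_trip(w, "e"))
```

Half the bytes per value means half the memory traffic for the same number of values, at the cost of a small rounding error for values that are not exactly representable in half precision.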

The upcoming AI processor developed by Fujitsu is called DLU, which stands for Deep Learning Unit. The architecture is designed for Deep Learning, with a low-power design and optimized precision. The goal is to reach ten times the performance per watt of the competitors, Takumi Maruyama announced. The DLU will have a scalable design with Tofu interconnect technology, which enables it to handle large-scale neural networks.

The DLU design target is a high Deep Learning performance per watt. However, high performance and low power are not easy to achieve at the same time, Takumi Maruyama warned the audience. In fact, these are conflicting demands. High performance calls for more transistors running at a higher frequency, whereas low power calls for fewer transistors running at a lower frequency.
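The conflict described above can be made concrete with the standard textbook approximation for dynamic power in CMOS logic. This is a general rule of thumb, not a formula presented in the talk. The symbols are assumptions introduced here: $N$ is the number of cores, $\alpha$ the switching activity, $C$ the switched capacitance per core, $V$ the supply voltage and $f$ the clock frequency. Performance is assumed to scale ideally with $N f$.

```latex
\begin{align}
  P_{\text{dyn}} &\approx N \,\alpha\, C\, V^{2} f, \\
  \text{Perf} &\propto N f, \\
  \frac{\text{Perf}}{P_{\text{dyn}}} &\propto \frac{1}{\alpha\, C\, V^{2}}.
\end{align}
```

Under these assumptions, performance per watt improves when each core switches less capacitance and runs at a lower voltage. Since a higher clock frequency generally requires a higher supply voltage, many simple cores at a modest frequency tend to be more efficient than a few complex cores at a high frequency.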

This means that a new architecture is required for the DLU to achieve the target. The architecture is domain-specific, massively parallel and uses optimal precision, Takumi Maruyama explained. It marks an evolution from high precision to optimal precision, from sequential to massively parallel, and from general to domain-specific. This requires many cores with an on-chip network.

The domain-specific cores will use a newly designed ISA with a simplified µ-architecture that is fully software visible and controllable, using heterogeneous cores, DPEs and a large register file (RF), according to Takumi Maruyama.

The combination of a few large cores and many small execution cores results in more performance with less power consumption, compared to a conventional homogeneous structure. The DPUs execute Deep Learning operations under the master core's control.

The DPU consists of 16 DPEs connected by an on-chip network. Each DPE includes a large RF and wide SIMD execution units to realize an efficient Deep Learning engine. Unlike a cache, the RF is fully software controllable, which makes it possible to extract the full hardware potential, Takumi Maruyama explained.

Fujitsu's Deep Learning Integer achieves the accuracy needed for Deep Learning with a data size of only 16 or 8 bits. In fact, the Deep Learning Integer has shown accuracy similar to FP32 for Deep Learning.
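The article does not describe how the Deep Learning Integer format works internally. As a rough illustration of why 8- or 16-bit integers can stand in for floating-point values, the following is a minimal sketch of generic symmetric integer quantization with one shared scale factor. It is a swapped-in textbook technique, not Fujitsu's format, and all function names and sample values are hypothetical.

```python
def quantize(values, bits):
    """Map floats to signed integers of the given width using one shared scale."""
    qmax = 2 ** (bits - 1) - 1              # 127 for 8 bits, 32767 for 16 bits
    peak = max(abs(v) for v in values)
    scale = peak / qmax if peak else 1.0    # avoid dividing by zero for all-zero input
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float values from the integers and the shared scale."""
    return [q * scale for q in quantized]

def max_error(values, bits):
    """Largest absolute difference between the inputs and their quantized round trip."""
    quantized, scale = quantize(values, bits)
    restored = dequantize(quantized, scale)
    return max(abs(a - b) for a, b in zip(values, restored))

# Hypothetical sample values standing in for layer activations.
activations = [0.02, -0.75, 0.31, 1.5, -1.2]

err8 = max_error(activations, 8)
err16 = max_error(activations, 16)
print(f"max error with 8 bits:  {err8:.6f}")
print(f"max error with 16 bits: {err16:.6f}")
```

In this scheme the rounding error per value is bounded by half the scale factor, so it shrinks as the bit width grows. Whether a given network tolerates that error is an empirical question that depends on the model and the data.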

Takumi Maruyama said there will be multiple generations of DLUs over time, as is currently the case for the HPC, UNIX and mainframe processors.

The Fujitsu processor design style combines a standard ISA with Fujitsu enhancements and newly developed ISAs. The design is shared and simple, with a software-visible micro-architecture. It uses the latest semiconductor technology and relies on a shared design infrastructure for the circuit and methodology, involving a team of people.

Takumi Maruyama explained that the Fujitsu processor direction is general purpose as well as domain specific. There will be a wider variety of processors in the future to meet different requirements. He promised that Fujitsu will continue to develop cutting-edge processors to meet the needs of a new era.

Leslie Versweyveld
