With the spread of deep learning in recent years, demand has grown for algorithms that can execute machine learning workloads at high speed, and deep learning training speeds have accelerated roughly 30-fold over the past two years. ResNet-50, a deep neural network for image recognition, is commonly used as a benchmark for deep learning processing speed: training times are compared using image data from the ImageNet Large Scale Visual Recognition Challenge 2012 (ILSVRC2012), a contest of image recognition accuracy.
Building on the technology Fujitsu Laboratories has cultivated through its HPC development, the company has now developed a technology that expands the computation volume per GPU without compromising training accuracy. Highly efficient distributed parallel processing is achieved by appropriately adjusting the learning rate in accordance with the progress of training. When Fujitsu Laboratories applied this newly developed technology to open-source deep learning software running on 2,048 GPUs of the ABCI system and measured it against this benchmark, it completed the training in 74.7 seconds, beating the previous record by more than 30 seconds and achieving the world's highest speed.
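The release does not disclose the exact schedule, but adjusting the learning rate according to training progress is commonly realized in large-batch training as a linear warmup followed by a gradual decay. The sketch below is a minimal illustration of that general pattern; the function name and all hyperparameter values (`base_lr`, `warmup_steps`, `total_steps`) are assumptions for demonstration, not Fujitsu's actual settings.

```python
def lr_schedule(step, base_lr=0.1, warmup_steps=500, total_steps=5000):
    """Illustrative learning-rate schedule for large-batch training:
    linear warmup, then polynomial decay. Hyperparameters are
    hypothetical, not taken from the Fujitsu result."""
    if step < warmup_steps:
        # Ramp the rate up linearly at the start of training, where
        # very large effective batch sizes are otherwise unstable.
        return base_lr * (step + 1) / warmup_steps
    # After warmup, decay toward zero as training progresses.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * (1.0 - progress) ** 2
```

In practice such a schedule is driven once per optimizer step, e.g. `for step in range(total_steps): lr = lr_schedule(step)`, with the value fed to the optimizer before each update.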
Fujitsu Laboratories will endeavor to further increase the speed of deep learning, aiming to implement practical applications of this newly developed technology for Fujitsu's servers and supercomputers.