"The powerful trends of Cloud computing and AI are driving a tectonic shift in data centre designs so that what was once a sea of CPU-only servers is now GPU-accelerated computing", stated Jensen Huang, founder and CEO of NVIDIA. "NVIDIA A100 GPU is a 20x AI performance leap and an end-to-end machine learning accelerator - from data analytics to training to inference. For the first time, scale-up and scale-out workloads can be accelerated on one platform. NVIDIA A100 will simultaneously boost throughput and drive down the cost of data centres."
New elastic computing technologies built into A100 make it possible to bring right-sized computing power to every job. A multi-instance GPU capability allows each A100 GPU to be partitioned into as many as seven independent instances for inferencing tasks, while third-generation NVIDIA NVLink interconnect technology allows multiple A100 GPUs to operate as one giant GPU for ever larger training tasks.
The world's leading Cloud service providers and systems builders that expect to incorporate A100 GPUs into their offerings include: Alibaba Cloud, Amazon Web Services (AWS), Atos, Baidu Cloud, Cisco, Dell Technologies, Fujitsu, GIGABYTE, Google Cloud, H3C, Hewlett Packard Enterprise (HPE), Inspur, Lenovo, Microsoft Azure, Oracle, Quanta/QCT, Supermicro and Tencent Cloud.
Among the first to tap into the power of NVIDIA A100 GPUs is Microsoft, which will take advantage of their performance and scalability.
"Microsoft trained Turing NLG, the largest language model in the world, using the current generation of NVIDIA GPUs", stated Mikhail Parakhin, corporate vice president at Microsoft. "We will train dramatically bigger AI models using thousands of NVIDIA's new generation of A100 GPUs in Azure at scale to push the state of the art on language, speech, vision and multi-modality."
DoorDash, an on-demand food platform serving as a lifeline to restaurants during the pandemic, notes the importance of having a flexible AI infrastructure.
"Modern and complex AI training and inference workloads that require a large amount of data can benefit from state-of-the-art technology like NVIDIA A100 GPUs, which help reduce model training time and speed up the machine learning development process", stated Gary Ren, machine learning engineer at DoorDash. "In addition, using Cloud-based GPU clusters gives us newfound flexibility to scale up or down as needed, helping to improve efficiency, simplify our operations and save costs."
Other early adopters include national laboratories and some of the world's leading higher education and research institutions, each using A100 to power their next-generation supercomputers.
The NVIDIA A100 GPU is a technical design breakthrough fuelled by five key innovations.
Together, these new features make the NVIDIA A100 ideal for diverse, demanding workloads, including AI training and inference as well as scientific simulation, conversational AI, recommender systems, genomics, high-performance data analytics, seismic modelling and financial forecasting.
The NVIDIA DGX A100 system, also announced today, features eight NVIDIA A100 GPUs interconnected with NVIDIA NVLink. It is available immediately from NVIDIA and approved partners.
Alibaba Cloud, AWS, Baidu Cloud, Google Cloud, Microsoft Azure, Oracle and Tencent Cloud are planning to offer A100-based services.
Additionally, a wide range of A100-based servers are expected from the world's leading systems manufacturers, including Atos, Cisco, Dell Technologies, Fujitsu, GIGABYTE, H3C, HPE, Inspur, Lenovo, Quanta/QCT and Supermicro.
To help accelerate development of servers from its partners, NVIDIA has created HGX A100 - a server building block in the form of integrated baseboards in multiple GPU configurations.
The four-GPU HGX A100 offers full interconnection between GPUs with NVLink, while the eight-GPU configuration offers full GPU-to-GPU bandwidth through NVIDIA NVSwitch. HGX A100, with the new MIG technology, can be configured as 56 small GPUs, each faster than NVIDIA T4, all the way up to a giant eight-GPU server with 10 petaflops of AI performance.
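The figure of 56 small GPUs follows from the partitioning limit quoted earlier: eight A100 GPUs, each split into as many as seven instances. A minimal Python sketch of that arithmetic is given below. The function name `max_mig_instances` and the constant are hypothetical, chosen for illustration only, and the 28-instance result for the four-GPU board is derived here by the same multiplication rather than stated in the announcement.

```python
# Illustrative arithmetic only; this does not query or configure real hardware.

# From the announcement: each A100 can be partitioned into as many as seven instances.
MAX_INSTANCES_PER_A100 = 7


def max_mig_instances(gpu_count: int, per_gpu: int = MAX_INSTANCES_PER_A100) -> int:
    """Return the upper bound on GPU instances for a board with gpu_count A100s."""
    if gpu_count < 0 or per_gpu < 0:
        raise ValueError("counts must be non-negative")
    return gpu_count * per_gpu


if __name__ == "__main__":
    # The two HGX A100 configurations named in the text.
    for gpus in (4, 8):
        print(f"{gpus}-GPU HGX A100: up to {max_mig_instances(gpus)} instances")
```

These values are upper bounds; the number of instances actually available depends on how each GPU is partitioned for a given job.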
NVIDIA also announced several updates to its software stack, enabling application developers to take advantage of the A100 GPU's innovations. They include new versions of more than 50 CUDA-X libraries used to accelerate graphics, simulation and AI; CUDA 11; NVIDIA Jarvis, a multimodal, conversational AI services framework; NVIDIA Merlin, a deep recommender application framework; and the NVIDIA HPC SDK, which includes compilers, libraries and tools that help HPC developers debug and optimize their code for A100.