The NVIDIA T4 GPU is being used to accelerate AI inference and training across a broad range of fields, including health care, finance and retail, which are key segments of the global high-performance computing market for enterprise and hyperscale.
This follows NVIDIA's announcement at the recent SC18 supercomputing show that, just two months after its introduction, T4 is featured in 57 separate server designs from the world's leading computer makers. Additionally, Google Cloud announced T4 availability to its Google Cloud Platform customers.
Among previously announced server companies featuring the NVIDIA T4 are Dell EMC, Hewlett Packard Enterprise, IBM, Lenovo and Supermicro.
"The continued rapid adoption of T4 makes complete sense, given its unprecedented capabilities," said Ian Buck, vice president of Accelerated Computing at NVIDIA. "Never before have we introduced a GPU that gives public and private clouds the combined performance and energy efficiency they need to more economically run their compute-intensive workloads at scale. And in markets where scale really counts, we expect T4 to be extremely popular."
Based on the new NVIDIA Turing architecture, the T4 GPU features multi-precision Turing Tensor Cores and new RT Cores, which, when combined with accelerated containerized software stacks, deliver unprecedented performance at scale.
Among the Chinese server companies featuring T4 GPUs are Inspur, Huawei, Lenovo, Sugon, Inspur Power System and H3C. Their new systems include:
Systems are expected to begin shipping before the end of the year.
Designed to meet the unique needs of scale-out public and enterprise cloud environments, NVIDIA T4 maximizes throughput, utilization and user concurrency, helping customers efficiently address exploding user and data growth.
Roughly the size of a candy bar, the low-profile, 70-watt T4 GPU has the flexibility to fit into a standard server or any Open Compute Project hyperscale server design. Server designs can range from a single T4 GPU all the way up to 20 GPUs in a single node.
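To put the 70-watt figure in context, the short Python sketch below estimates the combined GPU power for node sizes between the single-GPU and 20-GPU configurations mentioned above. It is only illustrative arithmetic: the function and constant names are hypothetical, and the result covers GPU board power alone, not CPUs, memory, storage, or cooling.

```python
# Figures taken from the text: 70 W per T4, up to 20 GPUs per node.
T4_BOARD_POWER_W = 70
MAX_GPUS_PER_NODE = 20


def gpu_power_budget_w(gpu_count: int) -> int:
    """Return the combined T4 board power, in watts, for one node.

    Hypothetical helper: counts GPU board power only and ignores the
    rest of the server.
    """
    if not 1 <= gpu_count <= MAX_GPUS_PER_NODE:
        raise ValueError(f"gpu_count must be between 1 and {MAX_GPUS_PER_NODE}")
    return gpu_count * T4_BOARD_POWER_W


if __name__ == "__main__":
    for count in (1, 4, 8, 20):
        print(f"{count:>2} x T4: {gpu_power_budget_w(count)} W of GPU board power")
```

By this estimate, a fully populated 20-GPU node would need about 1,400 W for its GPUs alone.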
The T4 GPU's multi-precision capabilities power breakthrough AI performance for a wide range of AI workloads at four different levels of precision, offering 8.1 TFLOPS at FP32, 65 TFLOPS at FP16 as well as 130 TOPS of INT8 and 260 TOPS of INT4. For AI inference workloads, a server with two T4 GPUs can replace up to 54 CPU-only servers. For AI training, a server with two T4 GPUs can replace nine dual-socket, CPU-only servers.
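The figures quoted above can be related to one another with simple arithmetic. The following Python sketch, using hypothetical names, computes the ratio of each peak rate to the FP32 rate, the peak rate per watt at the stated 70-watt board power, and the number of CPU-only servers a given number of two-GPU T4 servers could stand in for. These are peak, vendor-stated numbers; the ratios loosely compare floating-point operations (TFLOPS) with integer operations (TOPS), and the inference figure is an upper bound ("up to 54"), so real workloads may see different results.

```python
# Peak rates from the text, in TFLOPS (FP32, FP16) or TOPS (INT8, INT4).
PEAK_RATE = {"FP32": 8.1, "FP16": 65.0, "INT8": 130.0, "INT4": 260.0}

# Board power from the text.
T4_BOARD_POWER_W = 70

# CPU-only servers replaced by one server with two T4 GPUs, per the text.
# The inference figure is stated as "up to" 54, so it is an upper bound.
CPU_SERVERS_PER_T4_SERVER = {"inference": 54, "training": 9}


def ratio_to_fp32(precision: str) -> float:
    """Peak rate at the given precision divided by the FP32 peak rate."""
    return PEAK_RATE[precision] / PEAK_RATE["FP32"]


def peak_rate_per_watt(precision: str) -> float:
    """Peak tera-operations per second per watt of board power."""
    return PEAK_RATE[precision] / T4_BOARD_POWER_W


def cpu_servers_replaced(t4_servers: int, workload: str) -> int:
    """CPU-only servers that t4_servers two-GPU T4 servers could replace."""
    return t4_servers * CPU_SERVERS_PER_T4_SERVER[workload]


if __name__ == "__main__":
    for precision in PEAK_RATE:
        print(
            f"{precision}: {ratio_to_fp32(precision):.1f}x the FP32 peak, "
            f"{peak_rate_per_watt(precision):.2f} tera-ops/s per watt"
        )
    print("10 T4 servers, inference:", cpu_servers_replaced(10, "inference"))
    print("10 T4 servers, training:", cpu_servers_replaced(10, "training"))
```

Each step down in precision from FP16 doubles the quoted peak rate, which is why lower-precision inference accounts for the largest consolidation figure.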