Deep learning is used across industries and the research community to help solve big data problems in areas such as natural language processing, speech recognition, computer vision, healthcare, life sciences, and financial services. Mellanox is bringing these industries into a new era of performance and scalability with the data-centric offload architecture employed by the world's most advanced machine learning platforms.
TensorFlow is an open-source software library originally developed by researchers and engineers within Google's Machine Intelligence research group. With RDMA technology in place of traditional TCP, TensorFlow data exchange between nodes was accelerated by 2x, enabling faster image processing.
Baidu's PaddlePaddle (Parallel Distributed Deep Learning) is a flexible and scalable deep learning platform. PaddlePaddle supports a wide range of neural network architectures and optimization algorithms, and can leverage many CPUs and GPUs to accelerate training. PaddlePaddle uses RDMA to achieve high throughput and performance, and takes advantage of the more advanced acceleration capabilities of the combined NVIDIA and Mellanox architectures to accelerate deep learning training by 2x.
"Advanced deep neural networks depend upon the capabilities of smart interconnect to scale to multiple nodes, and move data as fast as possible, which speeds up algorithms and reduces training time", stated Gilad Shainer, vice president of marketing at Mellanox Technologies. "By leveraging Mellanox technology and solutions, clusters of machines are now able to learn at a speed, accuracy and scale that push the boundaries of the most demanding cognitive computing applications."
"Developers of deep learning applications can take advantage of optimized frameworks and NVIDIA's upcoming NCCL 2.0 library which implements native support for InfiniBand verbs and automatically selects GPUDirect RDMA for multi-node or NVIDIA NVLink when available for intra-node communications", stated Duncan Poole, Director of Platform Alliances at NVIDIA. "NVIDIA NVLink is available in Pascal-based Tesla P100 systems, including the NVIDIA DGX-1 AI supercomputer which has four Mellanox ConnectX-4 100 Gb/s adapters. This allows developers to focus on creating new algorithms and software capabilities, rather than performance tuning low-level communication collectives."