Primeur weekly 2018-04-03

Quantum computing

Putting quantum scientists in the driver's seat ...

Focus on Europe

Tuesday and Wednesday keynotes announced for ISC 2018 ...

PRACE SHAPE 7th Call for Applications opens from 3 April to 1 June 2018 ...

Cray commissioned to deliver FPGA-accelerated supercomputer to Paderborn University ...

Developing the technology for future smart cities and autonomous cars ...

Philips Research-led big data consortium receives EU funding to improve healthcare outcomes ...

Middleware

Bright partners with Bechtle to offer infrastructure management solutions to French customer base ...

The Linux Foundation and open source community members launch LF Deep Learning to drive open source growth in AI ...

Hardware

NVIDIA boosts world's leading deep learning computing platform, bringing 10x performance gain in six months ...

NVIDIA expands its deep learning inference capabilities for hyperscale data centres ...

DDN Storage announces groundbreaking 33GB/s performance to NVIDIA DGX servers to accelerate machine learning and AI initiatives ...

DDN and SQream partner to deliver the world's fastest Big Data analytics and enterprise business intelligence acceleration at massive scale ...

DDN Storage helps Standard Cognition revolutionize the consumer shopping experience with fully autonomous check-out ...

Supermicro's new scale-up artificial intelligence and machine learning systems with 8 NVIDIA Tesla V100 with NVLink GPUs deliver superior performance and system density ...

NVIDIA reinvents the workstation with real-time ray tracing ...

Penguin Computing receives Americas 2017 NVIDIA Partner Network High Performance Computing Partner of the Year Award ...

Molecular basis of neural memory - reviewing 'neuro-mimetic' technologies ...

Asperitas and Boston announce Immersed Computing partnership ...

NVIDIA and Arm partner to bring deep learning to billions of IoT devices ...

Applications

ANSYS to acquire optical simulation leader OPTIS ...

NCSA's Donna Cox wins 2018 Innovation Transfer Award ...

The future of photonics using quantum dots ...

Chemical synthesis with artificial intelligence: Researchers develop new computer method ...

Hong Kong Polytechnic University and Australian partners jointly launch impactful research on blockchain technologies ...

New Cray artificial intelligence offerings designed to accelerate customers' AI from pilot to production ...

Overcoming a battery's fatal flaw ...

The Cloud

DDN named Data Centre Platform partner of the year at Intel Technology Partner Awards, recognizing its market leadership at scale ...

Oracle redefines the Cloud database category with world's first autonomous database ...

The Linux Foundation announces expanded industry commitment to Akraino Edge Stack ...

OpenContrail is now "Tungsten Fabric" and completes move to The Linux Foundation ...

NVIDIA expands its deep learning inference capabilities for hyperscale data centres


27 Mar 2018 Silicon Valley - NVIDIA has introduced a series of new technologies and partnerships that expand its potential inference market to 30 million hyperscale servers worldwide, while dramatically lowering the cost of delivering deep learning-powered services.

Speaking at the opening keynote of GTC 2018, NVIDIA founder and CEO Jensen Huang described how GPU acceleration for deep learning inference is gaining traction, with new support for capabilities such as speech recognition, natural language processing, recommender systems, and image recognition - in data centres and automotive applications, as well as in embedded devices like robots and drones.

NVIDIA announced a new version of its TensorRT inference software, and the integration of TensorRT into Google's popular TensorFlow framework. NVIDIA also announced that Kaldi, the most popular framework for speech recognition, is now optimized for GPUs. NVIDIA's close collaboration with partners such as Amazon, Facebook and Microsoft makes it easier for developers to take advantage of GPU acceleration using ONNX and WinML.

"GPU acceleration for production deep learning inference enables even the largest neural networks to be run in real time and at the lowest cost", stated Ian Buck, vice president and general manager of Accelerated Computing at NVIDIA. "With rapidly expanding support for more intelligent applications and frameworks, we can now improve the quality of deep learning and help reduce the cost for 30 million hyperscale servers."

NVIDIA unveiled TensorRT 4 software to accelerate deep learning inference across a broad range of applications. TensorRT offers highly accurate INT8 and FP16 network execution, which can cut data centre costs by up to 70 percent.
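The cost savings from INT8 execution come from representing tensors with 8-bit integers plus a scale factor instead of 32-bit floats. The NumPy sketch below illustrates the general idea of symmetric per-tensor INT8 quantization; it is a simplification for illustration only, not TensorRT's actual implementation (TensorRT picks scales via a calibration step over representative data):

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor INT8 quantization: map [-amax, amax] to [-127, 127]."""
    amax = np.abs(x).max()
    scale = amax / 127.0 if amax > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    """Recover an FP32 approximation of the original tensor."""
    return q.astype(np.float32) * scale

x = np.random.randn(4, 8).astype(np.float32)
q, scale = quantize_int8(x)
x_hat = dequantize_int8(q, scale)

# With a symmetric scale and no saturation, the round-trip error per element
# is bounded by half a quantization step (scale / 2).
max_err = np.abs(x - x_hat).max()
```

Each value is stored in a quarter of the memory of FP32, and integer arithmetic is cheaper on supporting hardware, which is where the data-centre cost reduction comes from; the price is the bounded rounding error shown above.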

TensorRT 4 can be used to rapidly optimize, validate and deploy trained neural networks in hyperscale data centres, embedded and automotive GPU platforms. The software delivers up to 190x faster deep learning inference compared with CPUs for common applications such as computer vision, neural machine translation, automatic speech recognition, speech synthesis and recommendation systems.

To further streamline development, NVIDIA and Google engineers have integrated TensorRT into TensorFlow 1.7, making it easier to run deep learning inference applications on GPUs.

Rajat Monga, engineering director at Google, stated: "The TensorFlow team is collaborating very closely with NVIDIA to bring the best performance possible on NVIDIA GPUs to the deep learning community. TensorFlow's integration with NVIDIA TensorRT now delivers up to 8x higher inference throughput (compared to regular GPU execution within a low-latency target) on NVIDIA deep learning platforms with Volta Tensor Core technology, enabling the highest performance for GPU inference within TensorFlow."

NVIDIA has optimized the world's leading speech framework, Kaldi, to achieve faster performance running on GPUs. GPU speech acceleration will mean more accurate and useful virtual assistants for consumers, and lower deployment costs for data centre operators.

Developers at a wide spectrum of companies around the world are using TensorRT to discover new insights from data and to deploy intelligent services to businesses and consumers.

NVIDIA engineers have worked closely with Amazon, Facebook and Microsoft to ensure developers using ONNX-compatible frameworks such as Caffe2, Chainer, CNTK, MXNet and PyTorch can now easily deploy to NVIDIA deep learning platforms.

Markus Noga, head of Machine Learning at SAP, stated: "In our evaluation of TensorRT running our deep learning-based recommendation application on NVIDIA Tesla V100 GPUs, we experienced a 45x increase in inference speed and throughput compared with a CPU-based platform. We believe TensorRT could dramatically improve productivity for our enterprise customers."

Nicolas Koumchatzky, head of Twitter Cortex, stated: "Using GPUs made it possible to enable media understanding on our platform, not just by drastically reducing media deep learning models training time, but also by allowing us to derive real-time understanding of live videos at inference time."

Microsoft also recently announced AI support for Windows 10 applications. NVIDIA partnered with Microsoft to build GPU-accelerated tools to help developers incorporate more intelligent features in Windows applications.

NVIDIA also announced GPU acceleration for Kubernetes to facilitate enterprise inference deployment on multi-Cloud GPU clusters. NVIDIA is contributing GPU enhancements to the open-source community to support the Kubernetes ecosystem.
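In Kubernetes, GPUs are exposed to the scheduler through a device plugin and requested by pods as an extended resource. A minimal pod spec along those lines might look like the following (the pod name and container image are illustrative, not from the announcement):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: inference-pod        # illustrative name
spec:
  containers:
  - name: inference
    image: example.com/tensorrt-app:latest   # illustrative image
    resources:
      limits:
        nvidia.com/gpu: 1    # request one GPU via the NVIDIA device plugin
```

The scheduler then places the pod only on nodes advertising an available `nvidia.com/gpu` resource, which is what makes multi-Cloud GPU cluster deployment practical.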

In addition, MathWorks, makers of MATLAB software, today announced TensorRT integration with MATLAB. Engineers and scientists can now automatically generate high-performance inference engines from MATLAB for the NVIDIA DRIVE, Jetson, and Tesla platforms.

Data centre managers constantly balance performance and efficiency to keep their server fleets at maximum productivity. NVIDIA Tesla GPU-accelerated servers can replace several racks of CPU servers for deep learning inference applications and services, freeing up precious rack space and reducing energy and cooling requirements.

TensorRT can also be deployed on NVIDIA DRIVE autonomous vehicles and NVIDIA Jetson embedded platforms. Deep neural networks on every framework can be trained on NVIDIA DGX systems in the data centre, and then deployed into all types of devices - from robots to autonomous vehicles - for real-time inferencing at the edge.

With TensorRT, developers can focus on developing novel deep learning-powered applications rather than performance tuning for inference deployment. Developers can use TensorRT to deliver lightning-fast inference using INT8 or FP16 precision that significantly reduces latency, which is vital for capabilities like object detection and path planning on embedded and automotive platforms.
Source: NVIDIA