Back to Table of contents

Primeur live 2017-11-14

Exascale

Mellanox deployment collaboration with Lenovo will power Canada's largest supercomputer centre with leading performance, scalability for High Performance Computing applications ...

Middleware

Scalable clusters make HPC R&D easy as Raspberry Pi ...

NVIDIA chosen by every major computer maker and every major Cloud ...

WekaIO announces native support for Mellanox InfiniBand and Ethernet intelligent interconnect solutions ...

Bright Computing announces new product to help get enterprise data scientists up and running quickly with Deep Learning ...

OpenMP Architecture Review Board releases Technical Report that addresses top user requests ...

Diagnosing supercomputer problems ...

Hardware

Oak Ridge National Laboratory acquires Atos Quantum Learning Machine to support US Department of Energy research ...

Cavium and partners to showcase ThunderX2 Arm-based server platforms and FastLinQ Ethernet adapters for High Performance Computing at SC17 ...

HPE helps businesses capitalize on High Performance Computing and Artificial Intelligence applications with new high-density compute and storage ...

SciNet relies on Excelero for high-performance, peta-scale storage at new supercomputing facility ...

CoolIT Systems announces liquid cooled Intel Buchanan Pass server ...

Applications

Supercomputing speeds up Deep Learning training ...

INCITE grants of 5.95 billion hours awarded to 55 computational research projects ...

The Cloud

Penguin Computing announces Intel Xeon Scalable processor availability for Penguin Computing On-Demand HPC Cloud ...

Company news

Nallatech showcases next generation FPGA accelerators at Supercomputing 2017 ...

Cray supercomputer to assist Samsung's research on Artificial Intelligence and Deep Learning ...

Lenovo accelerates Artificial Intelligence initiatives to solve humanity's greatest challenges ...

DDN strengthens its HPC storage leadership with new solutions and next generation monitoring tools ...

New Dell EMC solutions bring machine and deep learning to mainstream enterprises ...

SciNet relies on Excelero for high-performance, peta-scale storage at new supercomputing facility

13 Nov 2017 Denver - At the SC17 Conference, Excelero, a disruptor in software-defined block storage, announced that its customer SciNet has deployed Excelero's NVMesh server SAN as the highly efficient, cost-effective storage behind a new supercomputer at the University of Toronto. By using NVMesh as a burst buffer, a storage architecture that helps ensure high availability and high ROI, SciNet created a unified pool of distributed high-performance NVMe flash that retains the speed and latency of directly attached storage media, while meeting the demanding service level agreements (SLAs) for the new supercomputer.

"For SciNet, NVMesh is an extremely cost-effective method of achieving unheard-of burst buffer bandwidth", stated Dr. Daniel Gruner, chief technical officer, SciNet High Performance Computing Consortium. "By adding commodity flash drives and NVMesh software to compute nodes, and to a low-latency network fabric that was already provided for the supercomputer itself, NVMesh provides redundancy without impacting target CPUs. This enables standard servers to go beyond their usual role in acting as block targets - the servers now can also act as file servers."

Based in Toronto, SciNet, Canada's largest supercomputer centre, serves thousands of researchers in biomedical, aerospace, climate sciences, and more. Their large-scale modeling, simulation, analysis and visualization applications sometimes run for weeks, and an interruption can destroy the result of an entire job. To limit the cost of such interruptions, SciNet implemented a burst buffer - a fast intermediate layer between the non-persistent memory of the compute nodes and the storage - to enable fast checkpointing, so that computing jobs can be easily restarted. SciNet had deployed the Spectrum Scale (GPFS) shared parallel file system on their spinning disk system, but at scale, as individual jobs become larger, checkpointing may take too long to complete, making the calculation difficult or even impossible to carry out.
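The checkpoint-and-restart pattern described above can be illustrated with a minimal, generic sketch. It is not SciNet's or Excelero's implementation: the function names, the JSON state format and the checkpoint interval are all hypothetical, and a real HPC job would write far larger state to the burst buffer through its parallel file system.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically: temp file in the same directory, then rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())  # make sure the data reaches storage
    os.replace(tmp_path, path)  # a crash never leaves a half-written checkpoint


def load_checkpoint(path):
    """Return the last saved state, or a fresh state if none exists."""
    if not os.path.exists(path):
        return {"step": 0, "total": 0}
    with open(path) as f:
        return json.load(f)


def run(path, steps, every=10, fail_at=None):
    """Toy 'simulation' that sums step indices, checkpointing every `every` steps.

    `fail_at` is a hypothetical knob that simulates an interruption.
    """
    state = load_checkpoint(path)
    for step in range(state["step"], steps):
        if fail_at is not None and step == fail_at:
            raise RuntimeError("simulated interruption")
        state["total"] += step
        state["step"] = step + 1
        if state["step"] % every == 0:
            save_checkpoint(path, state)
    return state["total"]
```

After an interruption, calling `run` again with the same path resumes from the last completed checkpoint, so only the work done since that checkpoint is lost. The faster a checkpoint can be written, the more often it can be taken.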

Using Excelero's NVMesh in a burst buffer implementation, SciNet created a peta-scale storage system that leverages the full performance of NVMe SSDs at scale, over the network - easily meeting SLA requirements for completing checkpoints in 15 minutes, without needing costly proprietary arrays. With NVMesh, SciNet created a unified, distributed pool of NVMe flash storage comprising 80 NVMe devices in just 10 NSD protocol-supporting servers. This provided approximately 148 GB/s of write burst (device limited) and 230 GB/s of read throughput (network limited) - in addition to well over 20M random 4K IOPS.
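To put the quoted figures in perspective, the short calculation below derives per-device and per-server rates and an upper bound on checkpoint size. It is a back-of-the-envelope sketch only: it assumes decimal units (1 TB = 1000 GB) and, optimistically, that the quoted write burst rate could be sustained for the whole 15-minute window, which the article does not state.

```python
# Figures quoted in the article.
NVME_DEVICES = 80
SERVERS = 10
WRITE_GBPS = 148.0             # approximate aggregate write burst, device limited
READ_GBPS = 230.0              # aggregate read throughput, network limited
CHECKPOINT_WINDOW_S = 15 * 60  # SLA: checkpoint completes within 15 minutes

# Average write rate per NVMe device (GB/s).
write_per_device = WRITE_GBPS / NVME_DEVICES

# Average read rate per server (GB/s).
read_per_server = READ_GBPS / SERVERS

# Upper bound on data written in one window (TB), assuming the
# burst rate is sustained throughout (an assumption, not a claim).
max_checkpoint_tb = WRITE_GBPS * CHECKPOINT_WINDOW_S / 1000

print(f"write per device:  {write_per_device:.2f} GB/s")
print(f"read per server:   {read_per_server:.1f} GB/s")
print(f"max checkpoint:    {max_checkpoint_tb:.1f} TB in 15 minutes")
```

Under these assumptions each device averages about 1.85 GB/s of writes, each server about 23 GB/s of reads, and roughly 133 TB could be checkpointed within the 15-minute window.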

Emulating the "shared nothing" architectures of the tech giants, SciNet's NVMesh deployment allows it to use hardware from any storage, server and networking vendor, eliminating vendor lock-in. Integration with SciNet's parallel file system is straightforward, and the system enables SciNet to scale both capacity and performance linearly as its research load grows.

"Mellanox interconnect solutions include smart and scalable NVMe accelerations that enable users to maximize their storage performance and efficiency", stated Gilad Shainer, vice president of marketing at Mellanox Technologies. "Leveraging the advantages of InfiniBand, Excelero delivers world leading NVMe platforms, accelerating the next generations of supercomputers."

"In supercomputing any unavailability wastes time, reduces the availability score of the system and impedes the progress of scientific exploration. We're delighted to provide SciNet and its researchers with important storage functionality that achieves the highest performance available in the industry at a significantly reduced price - while assuring vital scientific research can progress swiftly", stated Lior Gal, CEO and co-founder at Excelero.
Source: Excelero
