
Primeur weekly 2020-01-06

Focus

The LUMI supercomputer is not just a very fast supercomputer, it is first of all a competence development platform - Interview with Kimmo Koski, CSC, Finland ...

Quantum computing

ORNL researchers advance performance benchmark for quantum computers ...

In leap for quantum computing, silicon quantum bits establish a long-distance relationship ...

The Quantum Information Edge launches to accelerate quantum computing R&D ...

Focus on Europe

The coolest LEGO in the universe ...

Middleware

BP looks to ORNL and ADIOS to help rein in data ...

Hardware

New year brings new directory structure for OLCF's high-performance storage system ...

GIGABYTE brings AI, Cloud solutions and smart applications to CES 2020 to enable future today ...

During its final hours of operation, the Titan supercomputer simulated the birth of supernovae ...

Big iron afterlife: How ORNL's Titan supercomputer was recycled ...

Applications

Stanford researchers build a particle accelerator that fits on a chip ...

Brain-like functions emerging in a metallic nanowire network ...

Award-winning engineer helps keep US nuclear deterrent safe from radiation ...

New algorithm could mean more efficient, accurate equipment for Army ...

Paul Ginsparg named winner of the 2020 AIP Karl Compton Medal ...

'Super' simulations offer fresh insight into serotonin receptors ...

Researchers accelerate plasma turbulence simulations on Oak Ridge supercomputers to improve fusion design models ...

BP looks to ORNL and ADIOS to help rein in data


2 Jan 2020 Oak Ridge - Researchers across the scientific spectrum crave data, as it is essential to understanding the natural world and, by extension, accelerating scientific progress. Lately, however, the tools of scientific endeavor have become so powerful that the amount of data obtained from experiments and observations is often unwieldy.

In other words, it is possible to have too much of a good thing.

Making sense of today's ballooning datasets has become a major scientific challenge in its own right, forcing researchers to tackle not only their domain science problems but also the management and processing of their ever-growing datasets. Just ask researchers at BP, who are tasked with finding natural gas and oil in the ground and figuring out how best to extract them.

"New technologies in the field allow us to collect more data than we ever dreamed of", stated BP HPC Computational Scientist Vladimir Bashkardin, referencing the properties of subsurface fluid and rocks obtained via energy responses to the company's probing. "We need to scale our ability to access large seismic datasets, which can measure half a petabyte at times."

To assist them in this monumental effort, Vladimir Bashkardin and his colleagues turned to the Department of Energy's Oak Ridge National Laboratory (ORNL), home to Summit, the world's most powerful and "smartest" computer, and a wealth of expertise on how to manage and process today's large and complex scientific datasets.

Summit's debut marked the third time the laboratory has stood up the world's fastest supercomputer. These systems have been used to tackle some of the most pressing scientific challenges of our time, including fusion energy, drug delivery, and the design of novel materials - efforts that have also made ORNL a world leader in the increasingly important arena of Big Data.

BP researchers turned to ORNL Scientific Data Group Leader Scott Klasky and ORNL Scientific Data Management Team Lead Norbert Podhorszki, principal investigators behind the Adaptable I/O System (ADIOS), an I/O middleware that has helped researchers achieve scientific breakthroughs by providing a simple, flexible way to describe data in their code that may need to be written, read, or processed outside of the running simulation.
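
As a rough illustration of that programming model - a minimal sketch against the current ADIOS 2 API, not BP's or ORNL's actual code - the write path looks roughly as follows. The variable name, sizes and file name are placeholders, and a real parallel seismic code would pass an MPI communicator and per-rank offsets.

    #include <adios2.h>
    #include <vector>

    int main()
    {
        adios2::ADIOS adios;                    // serial here; a parallel code would pass an MPI communicator
        adios2::IO io = adios.DeclareIO("SeismicOutput");

        const std::size_t n = 1000;             // placeholder size
        std::vector<double> traces(n, 0.0);     // data produced by the simulation or processing step

        // Describe the data once: global shape, this writer's start offset, and local count.
        adios2::Variable<double> var =
            io.DefineVariable<double>("traces", {n}, {0}, {n});

        // The same Put/EndStep calls work whether the data ends up in a file,
        // a staging area, or another running application.
        adios2::Engine writer = io.Open("traces.bp", adios2::Mode::Write);
        writer.BeginStep();
        writer.Put(var, traces.data());
        writer.EndStep();
        writer.Close();
        return 0;
    }

A reader opens the same name in adios2::Mode::Read and issues matching Get calls, so downstream analysis code does not need to know how or where the data was produced.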

BP invited Scott Klasky and Norbert Podhorszki to its Houston offices to give the company's high-performance computing team a tutorial on ADIOS and demonstrate how it could accelerate their science by helping them tackle their large, unique seismic datasets.

"The workshop was awesome", stated BP HPC Technology Analyst Bosen Du. "It was a great introduction to ADIOS, and we definitely saw plenty of possible opportunities to apply it to our specific challenges. Even better, Scott and Norbert asked specific questions to personalize the tutorial to BP."

Scott Klasky shared Bosen Du's enthusiasm. "This was one of the more enjoyable tutorials we have given due to the level of interest from everyone in the room", he stated, adding that BP's interest led to what is likely the longest tutorial the team has ever given.

Scott Klasky and Norbert Podhorszki's trip was the result of a growing relationship between ORNL and BP.

BP's Director of HPC, Keith Gray, was already familiar with ORNL's Oak Ridge Leadership Computing Facility (OLCF), the DOE Office of Science User Facility that is home to Summit, through the positive testimonials of colleagues who had participated in its Industrial Partnership Programme ACCEL - Accelerating Competitiveness through Computational ExceLlence.

Keith Gray even visited ORNL two years ago to give a guest lecture on how BP's data centre needs are smaller than, but similar to, those of a centre like the OLCF, and on the importance of a reliable data centre in supporting BP's commitment to staying at the forefront of supercomputing technology.

That relationship, along with ADIOS's unique capabilities, made the choice an easy one. "We started doing research and ADIOS was always at the top of the list", stated Keith Gray, adding: "By collaborating, BP's world-class expertise in applying HPC to solve complex scientific problems could help the ADIOS team understand different workflows as they help us manage our data."

Managing that data is critical from a business perspective. In one recent project the BP team faced a 500-terabyte dataset. And that's before seismic processing, after which the dataset can grow ten-fold.

"Having something that can scale, do massively parallel I/O, and support compression would be a major advantage in helping us overcome our current data issues", stated Vladimir Bashkardin. MGARD, a technique developed jointly by ORNL and Brown University that is used for lossy compression of scientific data and which mathematically guarantees error bounds, seemed a particularly good fit for BP's compression issues, said Scott Klasky.

He added that recent changes in ADIOS, made possible by the Exascale Computing Project, have helped the SPECFEM3D-Globe seismology code used by Princeton's Jeroen Tromp achieve a speed of more than 2 terabytes per second while writing data to Summit's General Parallel File System (GPFS). Such a speed could lead to further collaboration with Jeroen Tromp's team, which utilizes ADIOS as the I/O backend, and help strengthen the data processing capability for a large part of the seismology community.

Overcoming issues such as I/O bottlenecks means a reduction in data analysis turnaround time, which would allow the company to explore different ideas, identify and address bottlenecks, and achieve a better understanding of the subsurface. Taken together, these capabilities can create huge breakthroughs for BP's research programme.

But a successful implementation of ADIOS into BP's current I/O code, dubbed the Data Dictionary System, would be beneficial in the short run as well. For instance, it would give their team valuable insight into whether they are pursuing the correct technologies and strategies to succeed.

"It may help us consider building additional file systems to deliver more bandwidth than our current clusters", stated Keith Gray, adding that "you don't need new file systems if your I/O is at peak, and we currently don't have all of the necessary I/O metrics". Researchers from the ORNL team have agreed to provide some support in helping BP to assess its data strategy.

Added Vladimir Bashkardin: "We struggle with extracting I/O bandwidth out of our Lustre file system due to a number of factors. There's lots to be gained in these terms. Even doubling the performance with a single dataset would be an enormous improvement."

In theory, ADIOS could cut some jobs from days to hours, fundamentally altering the workflows of BP's seismic researchers. And, according to BP HPC Computational Specialist Qingquing Liao, the middleware's built-in visualization capability is an excellent tool for pinpointing problematic areas of researchers' codes and models, helping them understand how best to alter their algorithms. Scott Klasky credits his colleagues Lipeng Wan and William Godoy for this capability, which allows users to instantly transition from file-based code coupling - e.g. asynchronously coupling a code to visualization - to in-memory coupling without changing their code.
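
Concretely, that transition comes down to an engine choice in ADIOS 2: the same Put/Get calls can write BP files for later, file-based visualization or stream data in memory to a concurrently running reader. A minimal sketch, again with illustrative names and an assumed toggle flag:

    // Hypothetical toggle: the same writer couples through files or through memory
    // staging depending only on the engine selected for the IO object.
    // "BP4" writes files; "SST" streams steps directly to a live reader.
    const bool useInMemoryCoupling = true;
    io.SetEngine(useInMemoryCoupling ? "SST" : "BP4");

    adios2::Engine writer = io.Open("traces.bp", adios2::Mode::Write);
    // BeginStep/Put/EndStep as before; the reading side (for example a
    // visualization service) opens the same name in adios2::Mode::Read and
    // never needs to know which transport delivered the data.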

But before ADIOS can be implemented, the BP team will need to specify which features it wants in its I/O backend and create a new API layer with a specific set of API goals.

"Being able to leverage ORNL's ADIOS and working together to improve it will extend BP's expertise in using Big Data to solve critical energy problems", stated Keith Gray.

The team's research has been funded by the DOE's Advanced Scientific Computing Research programme, the Oak Ridge Leadership Computing Facility, and the Exascale Computing Project (ECP).
Source: Oak Ridge National Laboratory - ORNL
