Back to Table of contents

Primeur weekly 2015-06-01

The Cloud

Registration opens for ISC Cloud & Big Data ...

EMC to acquire Virtustream ...

EuroFlash

Snapwallet: a safe photo service for smart phone ...

Dutch Government plans 135 million euro funding for supercomputers ...

Tackling the fastest and most powerful computing systems on the planet ...

Understanding and controlling the propagation of waves ...

USFlash

Lawrence Livermore breaks ground on unclassified supercomputing facility ...

Silicon Mechanics announces recipients of 4th Annual Research Cluster Grant ...

RAPTOR turbulent combustion code selected for next-gen supercomputer readiness project ...

Cavium announces collaboration with Pegatron on server platforms based on Cavium's ThunderX workload optimized processor family ...

Ohio State University researchers prove magnetism can control heat and sound ...

OLCF names CAAR projects at IDC HPC User Forum ...

Physicists solve quantum tunneling mystery ...

Using the forest to see the trees ...

Tsinghua University crowned champions of 2015 ASC Student Supercomputer Challenge ...

OLCF shares Lustre knowledge at International Workshop ...

National Science Foundation extends the Kraken project ...

Woodside, Australia's largest independent oil and gas company, uses IBM Watson to enhance decision making and increase efficiencies ...

OLCF outreach projects garner awards ...

The first round of 2015 hackathons gets underway ...

Premier announces $21.6 million funding for Pawsey Supercomputing Centre ...

Fujitsu supports King Abdulaziz University research capabilities with new supercomputing system ...

Doctor Evidence brings valuable health data to IBM Watson ecosystem ...

SRC Computers launches Saturn 1 Server, the first reconfigurable hyperscale server ...

BG Brasil and Senai CIMATEC launch Latin America's fastest supercomputer ...

OLCF shares Lustre knowledge at International Workshop


21 May 2015 Oak Ridge - High-performance supercomputers need high-performance file systems to manage the movement and storage of large amounts of data. For many of the fastest supercomputers in the world, including Titan at the US Department of Energy's (DOE's) Oak Ridge National Laboratory (ORNL), the Lustre parallel file system fills that need.

Because of its open source licensing, ability to reduce I/O constraints, and scalability, Lustre has been adopted widely by high-performance computing (HPC) users worldwide. But as the needs of HPC users evolve, so too must Lustre.

To that end, the Oak Ridge Leadership Computing Facility (OLCF), a DOE Office of Science User Facility, played a significant role in an ORNL event to share knowledge and discuss the future development of the parallel file system. The International Workshop on the Lustre Ecosystem: Challenges and Opportunities, which took place March 3 and 4 in Annapolis, Maryland, brought together Lustre users from academia, industry, and government to explore improvements in the parallel file system's performance and flexibility. OLCF staff members gave talks and technical presentations on both days of the workshop, sharing knowledge related to managing and optimizing the Lustre environment that could benefit other users.

The event was organized by the US Department of Defense (DOD)-HPC Research Programme at ORNL, a collaboration between DOD and ORNL. The programme has interests and competencies in extreme-scale HPC, particularly advanced architectures, metrics, benchmarks, system evaluations, programming environments, fully distributed data centres, and parallel file systems. Neena Imam, Mike Brim, and Sarp Oral of ORNL's Computing and Computational Sciences Directorate were the workshop co-chairs.

"Historically, the OLCF has been a leader in deploying the largest known Lustre production file system", stated Mike Brim, a research associate in ORNL's Computer Science and Mathematics Division. "Because of this, we oftentimes run into problems before anyone else. This workshop gave us an opportunity to share the challenges we've overcome and make our solutions available to a wider audience who may be following the same path."

The first day of the programme featured a keynote presentation by Eric Barton, lead architect of the High Performance Data Division at Intel and a long-time proponent of Lustre. On day two, presentations covered technical topics, including burst buffer systems, dynamic file striping, and monitoring toolkits for Lustre.

Jason Hill, the OLCF's HPC Operations storage team leader and tutorial chair for the workshop, led sessions covering networking and the OLCF's efforts to minimize the effects of file system hardware and software failures.

"Lustre has a lot of flexibility in the way you can configure it", Jason Hill stated. "That's one of its great powers, but that's also one of its downfalls. You either have to be an expert in all the areas of the ecosystem that you create or obtain that support from a vendor. The hope is that other members of the Lustre community can benefit from our experience."

A major focus of the workshop was adapting Lustre to handle diverse, non-scientific workloads efficiently, such as those produced by Big Data-type applications. ORNL is currently spearheading this initiative.

"Lustre was designed with scientific simulation in mind, which means it's good at sequential read and write I/O workloads", stated Sarp Oral, file and storage systems team lead for the OLCF Technology Integration Group. "Big Data workloads are different, requiring lots of small data reads and randomized access. Lustre is not well suited for these read-heavy, random I/O workloads today. Much of the discussion focused on what could be done to improve Lustre's capabilities in this area."

The first step in diversifying Lustre's I/O workload capabilities is to create tools that measure how the parallel file system currently handles big data workloads, Mike Brim said. "After we've characterized those workloads, we can start talking about what changes are necessary to make Lustre a more general purpose, high-performance parallel file system."
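To make the contrast between the two access patterns concrete, here is a minimal sketch of how one might time large sequential reads against many small reads at random offsets on a single file. It is not one of the tools discussed at the workshop and uses no Lustre-specific interface. The function names, block sizes, and read counts are assumptions chosen for illustration, and on a small local file the operating system's page cache will hide most of the difference that a real parallel file system would show.

```python
import os
import random
import tempfile
import time


def sequential_read(path, block_size=1 << 20):
    """Read the whole file front to back in large blocks.

    Returns (bytes_read, elapsed_seconds).
    """
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


def random_read(path, n_reads=1000, read_size=4096, seed=0):
    """Issue n_reads small reads at random offsets within the file.

    Returns (bytes_read, elapsed_seconds). The seed is fixed so that
    repeated runs touch the same offsets.
    """
    size = os.path.getsize(path)
    rng = random.Random(seed)
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        for _ in range(n_reads):
            # Pick an offset that leaves room for a full read_size read.
            offset = rng.randrange(0, max(1, size - read_size + 1))
            f.seek(offset)
            total += len(f.read(read_size))
    return total, time.perf_counter() - start


def make_test_file(size_bytes):
    """Create a temporary file of random bytes (hypothetical helper)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size_bytes))
    return path


if __name__ == "__main__":
    path = make_test_file(8 << 20)  # 8 MiB; an arbitrary illustrative size
    try:
        seq_bytes, seq_secs = sequential_read(path)
        rnd_bytes, rnd_secs = random_read(path)
        print(f"sequential: {seq_bytes} bytes in {seq_secs:.4f} s")
        print(f"random:     {rnd_bytes} bytes in {rnd_secs:.4f} s")
    finally:
        os.remove(path)
```

Pointing `path` at a file on the file system under study, and making that file much larger than available memory, would give a rough first comparison; a real characterization would also need to account for caching, striping, and concurrent clients.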

Enhanced workload capability could help expand Lustre's user base, historically a niche market, to include organisations and businesses in a growing number of sectors that are leveraging data mining and analytics tools. Increased capability also could benefit long-time Lustre adherents. For example, a more robust Lustre could give computational scientists improved data analysis capabilities, such as real-time data visualization.

"If we can improve the productivity of analysis workloads on Lustre, we can improve the productivity of scientists by giving them insights more quickly", Mike Brim stated.
Source: Oak Ridge Leadership Computing Facility - OLCF
