As input to the report, NERSC used feedback from the Exascale Requirement Reviews, a set of workshops held from 2015 to 2017 in collaboration with the Oak Ridge and Argonne Leadership Computing Facilities and ESnet. In these workshops, scientists from each Office of Science programme office were asked to describe their scientific grand challenges and their computational and data requirements. The workshops identified massive increases in data rates from various detectors and sensors, and a need for analysis, management, archiving and curation capabilities beyond what is common today. The reports that came out of these reviews also emphasized the growing complexity of scientific workflows from experimental facilities and the need to accommodate them on high-performance computing (HPC) systems.
Storage systems play a critical role in supporting NERSC's mission by enabling the retention and dissemination of science data used and produced at the centre. Over the past 10 years, the total volume of data stored at NERSC has increased from 3.5 PiB to 146 PiB, growing at an annual rate of 30%, driven by a 1000x increase in system performance and a 100x increase in system memory. In addition, there has been dramatic growth in experimental and observational data, and experimental facilities are increasingly turning to NERSC to meet their data analysis and storage requirements.
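As a rough check on the growth figures quoted above, the short Python sketch below computes the compound annual growth rate implied by the two endpoint volumes and, conversely, the volume that a steady 30% rate would produce. It assumes exactly ten years between the 3.5 PiB and 146 PiB figures and smooth year-on-year compounding; the function names are illustrative and do not come from the report. Under those assumptions the endpoints imply a rate of roughly 45% per year, so the quoted 30% may describe a different period or a different measure of growth.

```python
def implied_annual_growth(start, end, years):
    """Compound annual growth rate implied by two endpoint values."""
    return (end / start) ** (1.0 / years) - 1.0


def projected_volume(start, rate, years):
    """Volume reached after compounding a fixed annual rate for `years` years."""
    return start * (1.0 + rate) ** years


START_PIB = 3.5     # stored volume at the start of the period (from the article)
END_PIB = 146.0     # stored volume at the end of the period (from the article)
YEARS = 10          # assumption: exactly ten years between the two figures
QUOTED_RATE = 0.30  # annual growth rate quoted in the article

implied = implied_annual_growth(START_PIB, END_PIB, YEARS)
projected = projected_volume(START_PIB, QUOTED_RATE, YEARS)

print(f"Rate implied by the endpoints: {implied:.1%} per year")
print(f"Volume after {YEARS} years at {QUOTED_RATE:.0%}: {projected:.1f} PiB")
```

The same two helpers can be reused for capacity planning by substituting a different starting volume, rate or horizon.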
At the same time, the technologies underpinning traditional storage in HPC are rapidly changing. Solid-state drives are being integrated into HPC systems as a new tier of high-performance storage, shifting the role of magnetic disk media away from performance, and tape revenues are slowly declining. Economic drivers coming from cloud and hyperscale data centre providers are altering the mass storage ecosystem as well, rapidly advancing the state of the art in object-based storage systems over POSIX-based parallel file systems. In addition, non-volatile storage-class memory is emerging as a high-performance, low-latency storage medium. The combination of these factors broadens the design space of future storage systems, creating new opportunities for innovation but also introducing new uncertainties.
"The future of storage in HPC is getting complicated, and NERSC has published a vision for how the new and emerging elements can be most effectively utilized in the next 10 years", stated Damian Hazen, group lead of the Storage Systems Group at NERSC. "Our goal was to provide a NERSC roadmap for storage through 2025 that will ensure users can make optimal use of future storage technologies and that those storage technologies will continue to meet the needs of the DOE Office of Science user community."
Using the requirements reviews and a detailed workload analysis, NERSC identified four data storage categories required by the user community: temporary, campaign, forever and community. In parallel, NERSC conducted storage technology deep dives and held discussions and presentations with staff at other HPC facilities to determine how these four categories can map to physical storage systems. The roadmap targets a storage hierarchy of three tiers by 2020 and two tiers by 2025, ultimately combining different types of storage media to simplify data management for users, noted Glenn Lockwood, an HPC performance engineer at NERSC and a contributing author of the report. According to the report, the performance and scalability requirements of future systems will drive the industry toward object stores by 2025, and HPC centres such as NERSC will rely on middleware to provide familiar interfaces like POSIX and HDF5 for users who aren't ready to change the way they perform I/O.
"With this roadmap and long-term strategy, we identify areas where NERSC is positioned to provide leadership in storage in the coming decade to ensure our users are able to make the most productive use of all relevant storage technologies", stated NERSC Division Director Sudip Dosanjh.
Because of the diversity of NERSC user workloads across scientific domains, this analysis and the reference storage architecture should be relevant to HPC storage planning outside of NERSC and the DOE, the report notes.