But back then, users' computing needs, and the hardware and software available to meet them, were quite different too. For example, managing data required staffers to move around behind the scenes. When a user filed a request, an operator would retrieve the tape from a rack and load it, then notify the user that the data were available. In some cases, the tapes were stored in a separate building at LLNL and were often picked up by a staffer riding a bicycle. Fortunately, delivery of the Automated Tape Library in 1979 changed this practice by allowing hands-off access.
The software used to manage the data back then offered its own set of challenges, according to Keith Fitzgerald, who first worked at the centre as a contractor helping to maintain the first supercomputers at NERSC - then NMFECC - and went on to lead NERSC's Mass Storage Group.
"When NERSC was first installed, there were no off-the-shelf storage systems, not even back-up systems like are common today", Keith Fitzgerald stated. "Jean Schuler, who still works at LLNL, wrote the original storage system, called PackRat. It was a primitive system, more like a utility that would archive something and be able to give it back, but it had online disks and tapes and back-up."
PackRat quickly evolved into a homegrown system called FILEM that was used for storing codes, data or any information the user wanted saved longer than 24 hours. FILEM allowed either permanent or temporary storage, privacy, the ability to share files with other users and the ability to group files under directories.
Initially all files were stored on disk. A user would make a "write" request from the Control Data Corp. 7600 supercomputer and the FILEM programme would acquire disk space to store the file in, move the data, verify it moved correctly and update the directory so the file could be retrieved later. Files were migrated to tapes from disk as increased user demands led to insufficient disk space.
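The four steps of a "write" request described above (acquire space, move the data, verify, update the directory) can be illustrated with a minimal Python sketch. This is not FILEM's actual design: the class `ToyFileStore`, its in-memory "disk", and the checksum-based verification are all assumptions made for illustration.

```python
import hashlib


class ToyFileStore:
    """Hypothetical stand-in for a FILEM-like store (not the real FILEM)."""

    def __init__(self, disk_capacity):
        self.disk_capacity = disk_capacity  # bytes of disk space available
        self.disk = {}        # block id -> stored bytes
        self.directory = {}   # file name -> block id
        self.next_block = 0

    def used(self):
        """Total bytes currently held on the toy disk."""
        return sum(len(data) for data in self.disk.values())

    def write(self, name, data):
        # 1. Acquire disk space to store the file in.
        if self.used() + len(data) > self.disk_capacity:
            raise OSError("insufficient disk space")
        block = self.next_block
        self.next_block += 1

        # 2. Move the data.
        self.disk[block] = bytes(data)

        # 3. Verify it moved correctly (assumed here: compare checksums).
        stored_sum = hashlib.sha256(self.disk[block]).digest()
        if stored_sum != hashlib.sha256(data).digest():
            del self.disk[block]
            raise OSError("verification failed")

        # 4. Update the directory so the file can be retrieved later,
        #    releasing the old block if the name was already in use.
        old_block = self.directory.get(name)
        self.directory[name] = block
        if old_block is not None:
            del self.disk[old_block]

    def read(self, name):
        return self.disk[self.directory[name]]
```

In this sketch the directory is updated only after verification succeeds, so a failed write never leaves a directory entry pointing at bad data. When the toy disk is full, `write` simply fails; FILEM, as described above, eased that pressure by migrating files from disk to tape.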
"One of the first things we did at LLNL was the allocation system", Keith Fitzgerald stated. "Users got allocations for both computing and storage, and they could move the allocation from one to the other as they decided whether they wanted to compute or store results. It was their choice, not management's, and they had to make a conscious decision every time they asked for an allocation."
While FILEM offered some challenges in terms of stability, NERSC relied on it for about a decade, according to Keith Fitzgerald.
"Eventually we looked for a more off-the-shelf solution and converted to the common file system (CFS) storage system that Los Alamos National Laboratory was using", he stated. "It was IBM-based, cost effective, powerful and supported by somebody else."
In the early 1990s, as the amount of data being processed and archived at NERSC continued to grow, NERSC and LLNL were part of the DOE's National Storage Laboratory (NSL) testbed project. NSL was based on a product called UniTree, according to Keith Fitzgerald, and as the project progressed it began running alongside CFS. The NSL system eventually evolved into the High Performance Storage System (HPSS) and migrated with NERSC when the centre moved to Berkeley Lab in 1996. Today HPSS continues to serve as the centre's largest data repository.
HPSS is a hierarchical storage management (HSM) software system that enables all user data to be ingested onto high performance disk arrays and automatically migrated to a very large enterprise tape subsystem for long-term retention. The disk cache in HPSS is designed to retain five days' worth of new data, while the tape subsystem is designed to provide the most cost-effective, long-term, scalable data storage available.
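The disk-then-tape flow can be sketched in a few lines of Python. This is a simplified illustration, not HPSS code: the names `StoredFile`, `ingest` and `migrate` are hypothetical, and the rule "move anything older than five days" is an assumption standing in for whatever policy keeps about five days of new data in the cache.

```python
from dataclasses import dataclass

# From the article: the disk cache is sized for about five days of new data.
CACHE_RETENTION_DAYS = 5


@dataclass
class StoredFile:
    name: str
    ingest_day: int      # day number on which the file arrived
    tier: str = "disk"   # all new data lands on disk first


def ingest(catalog, name, today):
    """Record a newly arrived file; it always starts on the disk tier."""
    catalog[name] = StoredFile(name, today)


def migrate(catalog, today):
    """Move files that have outlived the disk cache window to tape.

    Returns the names of the files moved on this pass.
    """
    moved = []
    for f in catalog.values():
        if f.tier == "disk" and today - f.ingest_day >= CACHE_RETENTION_DAYS:
            f.tier = "tape"
            moved.append(f.name)
    return moved
```

A production HSM would typically copy data to tape before freeing disk space and would recall files from tape on request; both are left out here to keep the tiering idea visible.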
At present, HPSS on tape - including back-up and archive systems - totals over 68 petabytes of data and is growing at about 60 percent annually, according to Jason Hick, who leads NERSC's Storage Systems Group. Other storage resources at NERSC include the NERSC Global Filesystem (NGF), which totals over 13 petabytes on disk and is growing at about 40 percent annually; and local scratch, which totals over 9 petabytes on disk but doesn't grow because it is regularly purged.
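To get a feel for what these growth rates imply, the figures can be compounded forward. The sketch below assumes the quoted annual rates stay constant, which the article does not claim; the function name `project` is hypothetical.

```python
def project(size_pb, annual_growth, years):
    """Compound a starting size (in petabytes) at a fixed annual growth rate."""
    return size_pb * (1 + annual_growth) ** years


# Starting figures and growth rates quoted in the article.
hpss_in_5_years = project(68, 0.60, 5)  # HPSS on tape
ngf_in_5_years = project(13, 0.40, 5)   # NERSC Global Filesystem

print(f"HPSS after 5 years: {hpss_in_5_years:.0f} PB")
print(f"NGF after 5 years:  {ngf_in_5_years:.0f} PB")
```

Under that constant-rate assumption, HPSS would grow from 68 petabytes to roughly 713 petabytes in five years, and NGF from 13 petabytes to roughly 70.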
Jason Hick spends much of his time thinking about how NERSC will continue to stay abreast of users' data storage needs over the next five to 10 years.
"I think we have the hardware, bandwidth and capacity part of this nailed", Jason Hick stated. "We can project the hardware demands really well. But software is the weakest link. It's not keeping up with user needs and demands, and it never has. And users are pressing us on usability. Looking ahead, that is our biggest challenge: usability of storage."
For most of its first 40 years, NERSC was an exporter of data as scientists ran large-scale simulations and then moved that data to other sites for analysis. But with the growth of experimental data coming from facilities all over the U.S. and other countries, NERSC has now become a net data importer, taking in a petabyte of data each month for storage, analysis and sharing in fields ranging from bioscience and environmental studies to cosmology and high-energy physics (HEP).
While HEP research has historically accounted for the majority of storage needs at NERSC, the last few years have seen a shift toward other research fields, according to Jason Hick.
"In the last five years, the climate community has in some cases surpassed HEP in terms of storage need", he stated. "The climate guys suddenly started taking 50 percent of our storage and this wasn't forecasted well. We were like 'wow, these folks really do have a need'."
That need included helping NERSC's users in the climate community overcome some unique work flow challenges, Jason Hick added.
"With HEP, it was mostly a hardware challenge; they wrote their own programmes and we just needed to provide the hardware", Jason Hick stated. "But the climate guys had a work flow problem. They are a worldwide community and immediately presented the challenge of sharing data across continents and facilities that don't normally like talking to each other because of security and firewalls. Large amounts of data were being exchanged as a matter of routine. That wasn't the case with HEP - they were more about local bandwidth and scale and capacity."
The influx of new experimental facilities - such as the Advanced Light Source and Joint Genome Institute at Berkeley Lab and the Linac Coherent Light Source at SLAC - is also driving this "data deluge".
"With the experimental facilities, not only do they want to ingest data and move it from one place to another, they want to do it in real time", Jason Hick stated. "They are no longer just analyzing a data set, they are using that data in real time to adjust their experiments as they go."
This is where NERSC's expertise comes into play, he added. NERSC provides some of the largest open computing and storage systems available to the global scientific community and continually evolves its systems to ensure that users are never presented with an entirely new system at any one time.
"Instead, we provide constant stewardship and expansion of our systems", Jason Hick stated.
In addition, NERSC has long had a data management policy that helps ensure it is well prepared to meet users' anticipated needs as they evolve. That policy is unique among DOE supercomputing centres.
"We recognize that to delete data or cause a user to remove that data is disruptive", Jason Hick stated. "We are a user facility, so if anyone should be trying to keep up with demand it ought to be us. It's complicated because it drives what we do year to year - how we design our system, spend our budget, make budget recommendations, provisions for storage, the number of devices, the quality of the devices, all kinds of things. And we try to design business practices around the policy to keep up."
For example, NERSC has developed a sponsored storage model that allows people to take advantage of the existing solution and buy into it in order to gain access to larger amounts of storage space. It will be introduced in the upcoming fiscal year. NERSC is also working on new storage allocation options, he added.
"A storage resource is what we are after", he stated. "The idea is to make it 'fungible', to offer flexibility in the types of storage users can have. A user might want disk storage, tape storage, flash storage, scratch storage or some combination thereof. And we are going to allow them to make trade-offs. We will give them a price list and say ok, you've got x amount of 'NERSC dollars' to spend on storage, how do you want to spend it? It's a more capitalistic approach that better helps us design the systems and scale them to what our users need."