Conducted by DDN for the fourth consecutive year, the survey polled a cross-section of 143 end users managing data-intensive infrastructures worldwide and representing hundreds of petabytes of storage investment. Respondents included individuals responsible for high-performance computing, networking, and storage systems at financial services, government, higher education, life sciences, manufacturing, national laboratory, and oil and gas organisations. The volume of data under management in each of these organisations is staggering and increases steadily each year. Of organisations surveyed:
Data and data storage remain the most strategic part of the HPC data centre, according to an overwhelming majority of survey respondents (77 percent), as end users seek to solve data access, workflow and analytics challenges to accelerate time to results.
Survey respondents revealed the rising use of private and hybrid Clouds within HPC data centres. Respondents planning to leverage a Cloud for at least part of their data in 2017 rose to 37 percent, up almost 10 percentage points year-over-year. Of those, more than 80 percent are choosing private or hybrid Clouds versus a public Cloud option. "These responses are consistent with the trends DDN observes in customer conversations, the most prevalent of which is organizations rebounding from public Cloud due to cost, poor latency and sheer data immobility issues", stated Laura Shepard, senior director of marketing, DDN.
Use of flash in HPC data centres has intensified, with more than 90 percent of respondents using flash storage at some level within their data centres today. Perhaps surprisingly, while all-flash arrays are perceived by many to be the fastest storage available in the market, only 10 percent of surveyed users from these most data-intense environments are using an all-flash array. The vast majority of respondents (80 percent) are using hybrid flash arrays as an extension to storage-level cache, to accelerate metadata, or to accelerate data sets associated with key or problematic I/O applications.
A diverse set of applications and an upsurge in site-wide file systems, data lakes and active archives are driving fast-paced data growth in large-scale environments and analytical workflows, placing rigorous demands on storage infrastructures and creating unique challenges for HPC users. Performance ranks as the number one storage and Big Data challenge for a strong majority (76 percent) of those polled, and mixed I/O performance was cited as the biggest concern by 61 percent of respondents, an eight-percentage-point increase over last year's survey results.
An even higher portion of respondents (68 percent) identify storage I/O as the main bottleneck specifically in analytics workflows. As these responses demonstrate, performance issues have escalated as Big Data environments contend with a proliferation of diverse applications creating mixed I/O patterns and stifling the performance of their storage infrastructure.
Only a small and diminishing percentage of respondents believe today's file systems and data management technologies will be able to scale to Exascale levels, while almost 75 percent of respondents believe new innovation will be required. This belief is illustrated in respondents' views on addressing performance issues:
As an increasing number of HPC sites move to real-world implementation of multi-site HPC collaboration, concerns about security, privacy, and data sharing have intensified significantly. A strong majority of those surveyed (70 percent) view security and data-sharing complexity as the biggest impediments to multi-site collaborations.
With storage performance a critical requirement for today's large-scale and petascale-level data centres, site-wide file systems continue to be a significant infrastructure trend in HPC environments, according to 72 percent of HPC customers polled. Site-wide file systems allow architects to consolidate data storage from multiple compute clusters on the same storage platform and/or to provide the flexibility to upgrade storage and servers independently as needed. In addition to some of the largest supercomputing sites, like those at Oak Ridge National Laboratory (ORNL), the National Energy Research Scientific Computing Center (NERSC) and the Texas Advanced Computing Center (TACC), site-wide file systems are expanding into more mid-sized data centres with fewer or smaller compute clusters.
"The results of DDN's annual HPC Trends Survey reflect very accurately what HPC end users tell us and what we are seeing in their data centre infrastructures. The use of private and hybrid Clouds continues to grow although most HPC organisations are not storing as large a percentage of their data in public Clouds as they anticipated even a year ago. Performance remains the top challenge, especially when handling mixed I/O workloads and resolving I/O bottlenecks. Given this, it's not surprising that 90 percent of those surveyed are using flash within their data centres, but what is notable is that the more storage experience a site has, the more likely they are to use flash to accelerate multiple tiers of storage rather than putting it all in one tier for one part of the workflow", added Laura Shepard. "Survey respondents also reaffirmed that data storage is the most strategic part of the data centre."