Scientists and engineers running application software on DOE supercomputers face formidable challenges getting their codes to run at top performance across a variety of systems and architectures. The Performance Portability forum was launched in 2016 by Berkeley, Lawrence Livermore, Los Alamos, Sandia, Argonne and Oak Ridge national laboratories to develop strategies for achieving performance without having to individually tune each application for each specific target platform. The conversation involves developers of applications, software libraries, and frameworks as well as computer language and compiler experts and staff from supercomputer facilities.
What started as a meeting among facility staff in 2016 has become much broader in its scope and goals, noted Doug Doerfler, an HPC architecture and performance engineer in the Advanced Technologies Group at NERSC who co-chaired the meeting and led the programme committee. For example, this year's meeting included a number of talks from invited speakers, and attendees came from a more extensive vendor and international base than in previous years.
"We intentionally expanded our outreach to vendors who aren't necessarily just vendors for our machines", Doug Doerfler stated. "In addition, we had representation from Pacific Northwest and National Renewable Energy national laboratories, the Atomic Weapons Establishment in the UK, and leading researchers from academia in the U.S. and Europe."
The speaker presentations covered a broader swath of topics as well, featuring experts in areas including compilers, programming frameworks, and programming environments. "We had 12 invited speakers, all experts in their respective fields", Doug Doerfler stated. "These are the people who sit on the standards committees and define what's going on."
Several NERSC staff also participated in this year's meeting; Jack Deslippe and Rebecca Hartman-Baker were on the programme committee and each chaired a session, as did Thorsten Kurth. Brian Austin, Chris Daley, Rahul Gayatri, and Kevin Gott gave presentations on topics ranging from "Evaluation of OpenMP performance on GPUs" and "AMReX: Enabling applications on GPUs" to "Does domain-specific hardware spell the end for performance portability?"
"The importance of this meeting is really coming from the fact that, within the DOE, among the large HPC facilities, we are seeing a number of different types of computer architectures being deployed", Jack Deslippe stated. "And it's not uncommon that our users would have accounts on many of these systems, systems even outside the DOE. As a result, our users have consistently told us that portability is important for them - they don't want to have to write a separate version of their code for each of these systems."
Equally important is longevity, he added; users don't want to have to rewrite their code every few years when something new comes out. "This plays into NERSC's transition to exascale architectures", Jack Deslippe continued. "As we go from Edison - a fairly traditional system, to Cori - a manycore Intel architecture, to Perlmutter - a GPU-accelerated system, to eventually whatever NERSC-10 is going to be, it has been really important for us to be able to provide guidance around a portable transition strategy for our users so that they are making improvements to their codes that will pay off for many years to come."
Attendees at the Performance Portability meeting come away with a plethora of best-practice recommendations that help guide their development efforts and influence the direction of future architectures and applications.
"For application developers, one of the things they get out of this is recommendations for the best way to portably target different architectures", Jack Deslippe stated. "But then there's another aspect to it, where the people who are developing programming models and libraries and frameworks are learning from each other and pushing these frameworks toward production-quality tools that the community can rely on - things like getting these concepts into the standards and increasing the maturity of the available tools and frameworks that code teams can take advantage of."
"It's an opportunity for them to see how their tools are being used and issues that are coming up", Doug Doerfler added. "It's a venue for the customers and the tool developers within industry and within the DOE to get together and have face-to-face interaction."
Rebecca Hartman-Baker, who leads NERSC's User Engagement Group and has been participating in the performance portability meeting since its inception, noted that her group is especially focused on the usability of HPC systems. "Without this important effort focused on portability, usability across HPC systems could be drastically impaired", she stated. "I enjoy the discussions across vendors and their customer base that enable us all to share perspectives and adopt new ideas."
Participating in these collaborative performance portability efforts has an additional positive effect when it comes to the NERSC Exascale Science Applications Program, Jack Deslippe noted. "We are learning what teams at other facilities are doing; we are learning some of the new features that are available in these tools, libraries, and frameworks that our community can take advantage of; and hopefully we are also influencing the computer scientists who are putting together these tools that the domain scientists are using", he stated.
For NERSC staff who attended, this year's meeting also served as a way to prepare not just for the Perlmutter/NERSC-9 system due in 2020 but for NERSC-10 as well. "While Perlmutter will have its GPUs and CPUs, with NERSC-10 we anticipate even more heterogeneity in our architecture", Doug Doerfler stated. "So we need to learn as much as we can about all this now, to be ready."