This spring, more than 50 researchers from across the country visited the Argonne Leadership Computing Facility (ALCF), a U.S. Department of Energy (DOE) Office of Science User Facility, for the ALCF Computational Performance Workshop. The annual training event is designed to help attendees boost code performance and prepare for future ALCF projects through allocation programs, such as the ALCF Data Science Program (ADSP) and DOE's Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program.
"Our goal is to connect attendees with the experts who know our systems best and to introduce them to the tools and services that can propel their research forward," said Yasaman Ghadar, an ALCF assistant computational scientist, who co-organized this year's workshop with Ray Loy, ALCF lead for training, debuggers, and math libraries.
The three-day, on-site workshop provided a venue for users to advance their research by working directly with ALCF computational scientists, performance engineers, data scientists, and visualization experts, as well as invited guests from Intel, ParaTools Inc. (TAU), and Rice University (HPCToolkit). Dedicated access to the ALCF's Theta and Cooley supercomputers allowed the workshop participants to test, debug, and optimize their applications in real time.
"Getting face time with the ALCF team allowed me to more efficiently troubleshoot technical challenges in our research," said Trevor Rhone, a postdoctoral researcher at Harvard University and co-principal investigator of an ADSP project along with Harvard professor Efthimios Kaxiras.
Rhone attended the workshop to learn about tools that could help manage the complex workflows and large amounts of data involved in his work. His research focuses on studying layered materials and predicting their magnetic and thermodynamic properties to accelerate the discovery of new materials for targeted applications.
Rhone collaborated with ALCF Data Science team members Murat Keceli and Misha Salim to advance his project's use of Balsam, an ALCF-developed workflow service, for managing and automating its high-throughput density functional theory (DFT) calculations. Their work enabled Rhone to tackle a new set of magnetic materials that required a complicated multi-step DFT simulation workflow, in which Balsam automatically fixed and restarted any steps that timed out or failed along the way.
"Using the workshop reservations on Theta, the Balsam setup allowed me to begin and complete 75 percent of the calculations needed for the second phase of my study," Rhone said.
In addition to hands-on and breakout tutorial sessions, the event featured talks on a wide range of topics, including debugging and performance profiling tools, I/O optimization, and data and learning frameworks. Attendees also had the opportunity to present a summary of their challenging scientific problems and progress towards scaling their applications and workflows on ALCF resources.
"Being able to hear from and talk with developers and experts was extremely useful and provided me with a faster learning rate and insights into using these tools that I likely wouldn't have learned otherwise," said Gary Nicholson, a researcher from Missouri University of Science and Technology.
For Nicholson, the workshop presented an opportunity to improve code performance for an INCITE project, led by Missouri S&T professor Lian Duan, that is studying the impact of turbulence on the swept wings of transport aircraft.
Working with Sameer Shende, president of ParaTools Inc., and Argonne computational scientist Marta García Martínez, Nicholson and his colleagues were able to port their code to Theta and use the TAU profiling tool to analyze and optimize its performance.
"I'll be able to take what I learned and use it to track down a bug I've been having difficulty with," Nicholson said. "The profiling tools will help us improve the efficiency of our code, so hopefully we can simulate bigger cases or more cases than we normally would be capable of."
Pinaki Pal, a research engineer in Argonne's Energy Systems division, has been using machine learning techniques on ALCF computing resources through a Director's Discretionary allocation to gain insights into complex combustion phenomena, with the aim of optimizing and accelerating engine design. He attended the workshop to bolster his ongoing research and to prepare for a future ADSP proposal.
"I intend to use some of the techniques I learned at the workshop to develop and streamline deep learning models on Theta in relation to my current projects," Pal said. "The workshop also gave me an opportunity to network with a number of data science experts from ALCF and industry, which will be crucial for future collaborations."
The ALCF will hold another hands-on training event, the annual Simulation, Data, and Learning Workshop, in October. For details on this workshop and other training opportunities, check the ALCF training webpage.
The ALCF is supported by the DOE Office of Science.