Andre Merzky is a researcher in the Research in Advanced DIstributed Cyberinfrastructure and Applications Laboratory (RADICAL) group at Rutgers University. The group builds workflow and workload execution systems that ensure fine- and medium-grained workloads can run effectively and reliably on a variety of HPC systems.
Like many, Andre Merzky and his team have recently been pulled into a large-scale collaborative effort to investigate drug design for COVID-19.
"Our specific task is to scan very large numbers of chemical compounds for their behavior toward identified COVID-19 receptors", he wrote. "That work is performed in multiple stages: the first stage runs a very quick and rough scan through all 'interesting' compounds, identifying those which show promising binding properties, and subsequent stages perform increasingly fine-grained and thorough analysis of the binding behavior for the ever smaller subsets of compounds."
Andre Merzky's Texascale run focused on the first stage of that pipeline. The team recently received a new database of 120 million compounds that need to be scanned against 50 receptors. A single docking attempt takes between two seconds and five minutes on a single core. Despite being 'trivially parallel', these runs involve a great deal of data management and coordination, a task at which RADICAL's tools excel.
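The pattern Merzky describes is a classic "bag of tasks": millions of independent docking attempts that mainly need to be dispatched, load-balanced, and collected. Below is a minimal sketch of that pattern using Python's standard process pool rather than the group's own tools; `dock`, the compound list, and the score threshold are placeholders invented for illustration.

```python
# Schematic "bag of tasks" docking scan (illustration only -- the group's
# production runs use RADICAL's own workload management tools).
from concurrent.futures import ProcessPoolExecutor, as_completed
import random
import time

def dock(compound, receptor):
    """Stand-in for a single docking attempt (2 s - 5 min per core in real runs)."""
    time.sleep(random.uniform(0.01, 0.05))   # simulate variable task length
    return compound, random.random()         # pretend binding score

COMPOUNDS = [f"compound-{i}" for i in range(200)]   # real runs: ~120 million
RECEPTOR = "receptor-A"

if __name__ == "__main__":
    promising = []
    with ProcessPoolExecutor(max_workers=8) as pool:   # real runs use thousands of nodes
        futures = [pool.submit(dock, c, RECEPTOR) for c in COMPOUNDS]
        for fut in as_completed(futures):
            compound, score = fut.result()
            if score > 0.95:                 # keep only promising binders for stage two
                promising.append(compound)
    print(f"{len(promising)} compounds passed the first-stage screen")
```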
Their usual production runs on Frontera explore about six million compounds, use 128 nodes per receptor, and finish in 24 to 60 hours.
"While we did a number of experiments to ensure that we scale to about 1700 nodes, it's still challenging to approach the large data set", Andre Merzky wrote. "Well, that is what we did during our Texascale run. We used just shy of 4000 nodes to scan those 120 million compounds against two receptors."
They managed a complete scan of the database for each receptor in about seven hours. "We will now have to churn through the remaining receptors more slowly - but having one complete scan allows our biochemist to proceed with some analysis much quicker than they could have done otherwise."
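The quoted figures hang together on a back-of-the-envelope check, assuming Frontera's 56 cores per node and an even split of the allocation between the two receptors (both assumptions, not stated in the article):

```python
# Rough consistency check for the Texascale run (assumptions noted above).
nodes, cores_per_node = 4000, 56
compounds, receptors = 120_000_000, 2
wall_clock_s = 7 * 3600

cores_per_receptor = nodes * cores_per_node / receptors    # ~112,000
tasks_per_core = compounds / cores_per_receptor            # ~1,070
seconds_per_docking = wall_clock_s / tasks_per_core        # ~24 s

print(f"~{seconds_per_docking:.0f} s per docking on average")
# ~24 s falls comfortably inside the quoted 2 s - 5 min range per docking attempt.
```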
Not only that: the large-scale run also confirmed that their software stack scales as expected. "For the COVID collaborations, it's the first complete scan of that database, and the results are needed to gauge the algorithms of all stages, and to judge the viability of that compound database."
Mahmoud Moradi, a computational chemist at the University of Arkansas, is another researcher who used his Texascale Days compute time to study COVID-19 dynamics.
Mahmoud Moradi has been working on COVID research continuously since April, studying the spike proteins of SARS Coronavirus 1 and 2, the viruses behind the 2003 SARS epidemic and the current COVID-19 pandemic, respectively.
The massive simulation his group ran during Texascale Days used 4000 virtual copies of the coronavirus spike protein to explore the activation pathway of the protein. The copies exchange a minimal but very useful amount of information with one another using a statistical mechanics-based scheme, informing each other of their positions along the activation pathway.
Each copy of the system runs on one node, so the 4000 copies occupy 4000 nodes altogether; the communication scheme provides a highly scalable and efficient way of eventually characterizing the activation pathway of the spike protein.
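The article does not spell out the exchange scheme, but window- or replica-exchange methods are a common way for such copies to "inform each other of their positions along the activation pathway." The toy sketch below illustrates that idea on a one-dimensional coordinate with a harmonic bias and a Metropolis swap criterion; it is a conceptual example, not the group's actual spike-protein protocol.

```python
# Toy sketch of window exchange along a 1-D "activation coordinate".
import math
import random

n_replicas = 8                              # real run: 4000 copies, one per node
centers = [i / (n_replicas - 1) for i in range(n_replicas)]   # window centers 0..1
positions = centers[:]                      # each replica starts in its own window
k, beta, dt = 50.0, 1.0, 0.001              # bias strength, inverse temperature, time step

def bias(x, c):
    """Harmonic restraint keeping a replica near window center c."""
    return 0.5 * k * (x - c) ** 2

for step in range(2000):
    # crude overdamped dynamics: each replica fluctuates around its window center
    for r in range(n_replicas):
        force = -k * (positions[r] - centers[r])
        positions[r] += force * dt + random.gauss(0.0, math.sqrt(2 * dt / beta))
    # the "information exchange": neighboring replicas attempt to swap windows
    i = random.randrange(n_replicas - 1)
    j = i + 1
    delta = (bias(positions[i], centers[j]) + bias(positions[j], centers[i])
             - bias(positions[i], centers[i]) - bias(positions[j], centers[j]))
    if delta <= 0 or random.random() < math.exp(-beta * delta):
        centers[i], centers[j] = centers[j], centers[i]

print("final window assignment per replica:", [round(c, 2) for c in centers])
```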
"The simulation was definitely very helpful in finalizing our work on coronavirus spike proteins", he wrote. "We were using a smaller number of copies and nodes before, which would mean a longer time to reach convergence", he explained.
Mahmoud Moradi's results show a meaningful difference between the mechanisms the SARS Coronavirus 1 and 2 spike proteins use, which may at least partially explain the dramatically different patterns of the two viruses' spread.
Efforts by Gabor Toth, Simeon Bird, and Paul Woodward used Frontera to study space - for Toth, space weather; for Bird, the behavior of black holes; and for Woodward, the death of stars.
"We performed weak scaling study for our brand-new particle-in-cell code FLexible Exascale Kinetic Simulator (FLEKS)", wrote Gabor Toth, a research professor in Climate and Space Sciences and Engineering at the University of Michigan. He found that the code scales well up to at least 28.000 CPU cores with typical workload, and can run on up to 230.000 cores that is half of Frontera.
His team also performed a production simulation on 57,000 cores to study asymmetric magnetic reconnection, the most important physical process controlling the interaction between the Earth's magnetosphere and the solar wind. The simulation revealed turbulence-like electron flows.
"Without the access to thousands of the CPU nodes, it would take days to run such a large simulation", Gabor Toth stated.
Yueying Ni, a graduate student at Carnegie Mellon University, Simeon Bird, a professor of physics and astronomy at the University of California, Riverside, and colleagues are using Frontera to run an extremely large simulation of the Universe. It will go from the beginning of time until the era during which star formation in the universe peaked, and it will contain more particles than any other simulation of its type.
"Frontera enabled us to run higher resolution simulations, which let us study physical processes at higher fidelity and make better predictions for future gravitational wave experiments such as the LISA satellite", Simeon Bird wrote. "We used Texascale Days to perform the first major segment of our big run, which will take a few months to complete."
University of Minnesota astrophysicist Paul Woodward's team was given access to the entire Frontera machine (7900 nodes) for 36 hours during Texascale Days. They used this opportunity to perform special, highly resolved simulations of rotating massive main sequence stars. These simulations allow them to compare their results with asteroseismology observations of such stars as a means of checking that their models and simulation technology are accurate.
"The very high grid resolution that Frontera's enormous computing power makes possible enables us for the first time to seriously address questions that concern effects of the convection and rotation on the flow that one might consider to be in some sense second-order small, but are very important over the lifetime of the star", he wrote.
Earth-bound, but still computationally intensive, were tornado-producing thunderstorm simulations by Leigh Orf, an atmospheric scientist with the Space Science and Engineering Center at the University of Wisconsin-Madison.
Leigh Orf used his time to benchmark CM1, the cloud model used in his research, at extreme scale. The experiment allowed him to get a much more accurate sense of how much compute time he would need to complete some of his most ambitious simulations of high-intensity storms and the process by which tornadoes are spawned.
"I was very pleased that CM1 showed excellent strong and weak scaling from 512 to 2048 nodes, and with comparable performance up to 3900 nodes", he recalled. "Now I have timings and scaling information that I can use for the next round of proposals."
P.K. Yeung, whose work focuses on the fundamentals of turbulence, used Frontera to study extreme fluctuations in local concentration gradients.
"Large-scale simulations are important to understand the behavior of extreme fluctuations in the scalar gradients, but often limited by their cost", P.K. Yeung wrote.
Using large portions of Frontera during the Texascale Days event allowed Yeung's group to make faster progress than possible otherwise. "We have developed a new approach where a long simulation at modest small-scale resolution will provide optimal initial conditions for multiple short simulations at very high resolution."
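One common way to realize the "coarse run seeds fine runs" step, at least conceptually, is to interpolate a periodic snapshot onto a finer grid spectrally. The sketch below does this in one dimension by zero-padding the Fourier spectrum; it is an illustration of the idea under those assumptions, not Yeung's production code.

```python
# Spectral (Fourier zero-padding) interpolation of a coarse periodic snapshot
# onto a finer grid, to serve as an initial condition for a short high-resolution run.
import numpy as np

def spectral_upsample(field, fine_n):
    """Interpolate a 1-D periodic field onto a finer grid by zero-padding its spectrum.
    (Exact for band-limited fields; Nyquist-mode bookkeeping omitted for brevity.)"""
    coarse_n = field.size
    spectrum = np.fft.rfft(field)
    padded = np.zeros(fine_n // 2 + 1, dtype=complex)
    padded[:spectrum.size] = spectrum
    return np.fft.irfft(padded, n=fine_n) * (fine_n / coarse_n)

x = np.linspace(0.0, 2 * np.pi, 64, endpoint=False)
coarse = np.sin(3 * x) + 0.2 * np.cos(7 * x)    # toy "scalar field" snapshot
fine = spectral_upsample(coarse, 256)           # initial condition for the fine run
print(coarse.shape, "->", fine.shape)
```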
Hari Subramoni, working with D.K. Panda and his team at Ohio State University, took advantage of Texascale Days to optimize and tune the MVAPICH2 MPI library for Frontera. D.K. Panda, a co-principal investigator on the system, leads the MVAPICH2 project, an MPI - or Message Passing Interface - implementation that he hopes will be the fastest way for parallel computing systems to exchange data and compute.
They carried out large-scale experiments, including full-scale system runs, to optimize and tune multiple features of the MVAPICH2 library.
"These optimizations and tuning will be available in the next release of the MVAPICH2 library", stated Hari Subramoni. "This will help Frontera application users extract higher performance and scalability with the MVAPICH2 library."
"I'm very excited by the results of our third Texascale Days event", stated John Cazes, TACC's director of HPC and organizer of the event. "Several of the groups that computed had never run their codes at this scale before. The event showed them what works, what doesn't, what can be learned by running larger problems, and pushed the frontiers of parallelism for many codes. It's a great chance for us and our users to take their codes to extreme scales."