In nature, the resilient lignin polymer helps provide the scaffolding for plants, reinforcing slender cellulosic fibers - the primary raw ingredient of cellulosic ethanol - and serving as a protective barrier against disease and predators. Lignin's protective characteristics persist during biofuel processing, where it's a big hindrance, surviving expensive pre-treatments designed to remove it and blocking enzymes from breaking down cellulose into simple sugars for fermentation into bio-ethanol.
To better understand exactly how lignin persists, researchers at the US Department of Energy's (DOE's) Oak Ridge National Laboratory (ORNL) created one of the largest biomolecular simulations to date - a 23.7-million-atom system representing pre-treated biomass (cellulose and lignin) in the presence of enzymes. The size of the simulation required Titan, the flagship supercomputer at the Oak Ridge Leadership Computing Facility (OLCF), a DOE Office of Science User Facility, to track and analyze the interaction of millions of atoms.
The research, led by Jeremy Smith, a Governor's Chair at the University of Tennessee (UT) and director of the UT-ORNL Center for Molecular Biophysics, revealed in atomistic detail why lignin is such a problem: Not only does it bind to cellulose in the preferred locations sought by enzymes, but lignin also attracts and occupies the cellulose-binding domain of the enzymes themselves.
"That impedes the mechanism the enzyme has to anchor to cellulose. Thus lignin binds exactly where it is least desired for industrial purposes", stated ORNL staff scientist Loukas Petridis. "This detailed knowledge of lignin behaviour can guide genetic engineering of enzymes that bind less to lignin and therefore produce bio-ethanol more efficiently."
Beyond the scientific knowledge obtained from the simulation, the team's biomass system advances computational biophysics' shift toward complex, multi-component systems, a move enabled by leadership-class supercomputers.
During pre-treatment, acid, water, and heat work to remove non-cellulosic biomass from plant material. Lignin, however, sticks around, clustering into aggregates around the cellulose and impeding enzymes from reaching cellulose.
To accurately model this crowded environment, Jeremy Smith's team used experimental data to create a representative sample of pre-treated biomass and enzymes. The model took into account details such as the ratio of cellulose to lignin, type of lignin, and relative amount of enzymes. In total, the simulation tracked nine cellulose fibers, 468 lignin molecules, and 54 enzyme molecules in a rectangular water box.
The team built the model using a molecular dynamics code called GROMACS under an allocation awarded through the Innovative and Novel Computational Impact on Theory and Experiment, or INCITE, programme. With a complete model, the team turned to the Cray XK7 Titan, America's fastest supercomputer, to supply the necessary computing power to observe the system in action.
During its largest runs, the biomass simulation scaled to nearly 4,000 of Titan's 18,688 nodes, producing roughly 45 nanoseconds of simulation time in one day. Over the course of a year, the team amassed 1.3 microseconds of simulation time, a significant length of time in the world of computational biophysics.
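To put those two figures side by side, here is a back-of-the-envelope sketch in Python. It assumes, purely for illustration, that every run proceeded at the reported peak rate of 45 nanoseconds per day; in practice the work was spread over a year, so the real wall-clock total was presumably larger. The function name is hypothetical.

```python
# Reported figures from the article
PEAK_NS_PER_DAY = 45.0   # simulated nanoseconds per wall-clock day (largest runs)
TOTAL_SIM_US = 1.3       # total simulated time, in microseconds

def days_at_peak(total_us, ns_per_day):
    """Wall-clock days needed if every run hit the peak rate (an assumption)."""
    total_ns = total_us * 1000.0  # 1 microsecond = 1,000 nanoseconds
    return total_ns / ns_per_day

days = days_at_peak(TOTAL_SIM_US, PEAK_NS_PER_DAY)
print(f"About {days:.1f} days of continuous peak-rate running")
```

Even under that optimistic assumption, the 1.3 microseconds correspond to roughly four weeks of uninterrupted computing on nearly 4,000 nodes.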
"There's nowhere else in the world where we could have run this simulation", Loukas Petridis stated.
In addition to lending insight to the challenges of next-generation biofuels, the team's simulation pointed toward potential pathways that could help mitigate lignin's impact. Specifically, the simulation demonstrated that lignin does not bind as much to less-ordered, or amorphous, cellulose fibers, meaning it competes less with the enzymes there.
"Industrialists knew amorphous cellulose is more easily broken down by enzymes, but what we show is that it's not only the inherent properties of amorphous cellulose that makes it easier for the enzymes but also that lignin is less of a pest", Loukas Petridis stated.
To maximize their time on the OLCF's flagship supercomputer, Jeremy Smith's team tweaked GROMACS to streamline communication across thousands of Titan's CPU cores. Additionally, the team doubled the time interval GROMACS used to calculate the motion of the biomass system. By implementing a more computationally efficient method to track long-range interactions between atoms, the team was able to increase its timestep from 2 femtoseconds to 4 femtoseconds, or 4 quadrillionths of a second, without losing accuracy.
The resulting data was transferred to the OLCF's High-Performance Storage System until it could be analyzed. Typically, analysis is carried out in serial, or one event at a time, but growth in computing power and simulation size has created an analysis bottleneck - it just takes too much time.
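The effect of doubling the timestep can be illustrated with a short calculation. The sketch below counts how many integration steps are needed to cover 1.3 microseconds at each timestep. It assumes, hypothetically, that the whole trajectory was run at a single timestep; the helper name is illustrative and not part of GROMACS.

```python
FS_PER_NS = 1_000_000  # 1 nanosecond = 10^6 femtoseconds

def steps_needed(sim_time_ns, timestep_fs):
    """Integration steps required to cover sim_time_ns at the given timestep."""
    return sim_time_ns * FS_PER_NS // timestep_fs

total_ns = 1_300  # 1.3 microseconds expressed in nanoseconds

steps_2fs = steps_needed(total_ns, 2)
steps_4fs = steps_needed(total_ns, 4)

print(f"2 fs timestep: {steps_2fs:,} steps")
print(f"4 fs timestep: {steps_4fs:,} steps")
print(f"Reduction factor: {steps_2fs / steps_4fs:.1f}x")
```

Halving the number of steps only halves the cost if each step takes about as long as before, which is why the larger timestep pays off when accuracy is preserved.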
To get around this constraint, Jeremy Smith's team worked to equip GROMACS with the capability to conduct analysis in parallel, meaning thousands of Titan's processors could work in tandem to carry out analysis tasks. For example, running parallel analyses on 2,000 CPU cores, the researchers could obtain results 2,000 times faster than conventional methods. In collaboration with the ORNL team, Josh Vermaas, a graduate student at the University of Illinois at Urbana-Champaign, contributed significantly to this effort as a DOE Computational Science Graduate Fellow at ORNL.
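The general idea of parallel trajectory analysis can be sketched in a few lines of Python. This is a toy illustration, not the team's GROMACS implementation: the frames are split into contiguous chunks, each worker analyzes its own chunk, and the partial results are combined at the end. The data, the `count_contacts` analysis, and the cutoff are all hypothetical, and a thread pool stands in for the thousands of processors a real run would use (Python threads will not actually speed up this pure-Python loop).

```python
from concurrent.futures import ThreadPoolExecutor

def split_frames(n_frames, n_workers):
    """Return contiguous (start, stop) frame ranges, one per worker."""
    base, extra = divmod(n_frames, n_workers)
    ranges, start = [], 0
    for worker in range(n_workers):
        stop = start + base + (1 if worker < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges

def count_contacts(frame_range, distances, cutoff=4):
    """Hypothetical analysis: count frames whose distance is below the cutoff."""
    start, stop = frame_range
    return sum(1 for d in distances[start:stop] if d < cutoff)

def parallel_contacts(distances, n_workers):
    """Analyze each chunk independently, then sum the partial counts."""
    ranges = split_frames(len(distances), n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = pool.map(lambda r: count_contacts(r, distances), ranges)
        return sum(partial_counts)

# Synthetic stand-in for one distance value per trajectory frame
distances = [i % 10 for i in range(1000)]

serial = count_contacts((0, len(distances)), distances)
parallel = parallel_contacts(distances, n_workers=8)
print(serial, parallel)
```

Because each frame is analyzed independently, the serial and parallel results agree, and the work divides almost evenly among workers. That independence is what makes a near-linear speedup possible for this kind of analysis.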
The new capability not only helped the team reduce its time to solution, but it also paves the way for analyzing similar large-scale simulations in the future. "Analysis was one of the stumbling blocks for simulations at this scale", stated team member Roland Schulz, a UT postdoctoral researcher. "With parallel analysis, it's now more feasible and will make leadership-class simulations easier."
As supercomputers allow for larger and more realistic systems, the ambitions of researchers and the realism of their biological systems continue to rise. Summit, the OLCF's next leadership-class supercomputer, will offer at least five times the computing power of Titan. For Jeremy Smith's team, that means its biomass models have room to grow in complexity to further probe biofuel's challenges.
"We're trying to reach the complexity that is found in nature and industrial conditions", Loukas Petridis stated. "Eventually, we would like to construct a simple model of a plant cell wall that we could process in silico, or via computer simulation, and see how it changes during pre-treatment."
The research was supported by DOE's Office of Science. Josh V. Vermaas, Loukas Petridis, Xianghong Qi, Roland Schulz, Benjamin Lindner, and Jeremy C. Smith are the authors of the paper titled "Mechanism of lignin inhibition of enzymatic biomass deconstruction". The paper was published in Biotechnology for Biofuels 8, no. 1 (2015): 1.