Primeur weekly 2015-12-07

Quantum computing

A quantum spin on molecular computers ...

Researchers confirm 'realistic' answer to quantum network puzzle ...

Focus on Europe

ISC 2016 introduces PhD Forum and extends deadline for research papers ...

Quantum computer made of standard semiconductor materials ...

Future direction of global research and e-infrastructures landscape in e-IRG/ESFRI organized network session at ICT2015 in Lisbon ...

EGI-UberCloud partnership is bridging research and innovation ...

Finnish ministry issues policy on future development of scientific computing ...

HPC Advisory Council and the ISC High Performance Conference announce university teams for HPCAC-ISC 2016 Student Cluster Competition ...

GEANT agrees new Board, community programme and cost sharing model ...

Spanish Council of Ministers supports the acquisition of MareNostrum 4 ...

NEC's SX-ACE vector supercomputer contributes to research at three German institutes - University of Kiel, Alfred Wegener Institute and the High Performance Computing Center Stuttgart ...

Middleware

TACC's HPC team works to increase technical diversity in supercomputing ...

Hardware

Mellanox promotes Freddy Gabbay to Vice President of Software ...

SanDisk and Supermicro form strategic alliance to deliver software defined all-flash storage solutions ...

Seagate and Newisys demonstrate flash storage architecture capable Of 1TB/s ...

Hewlett Packard Enterprise introduces new class of system to power next era in hybrid infrastructure ...

Applications

Sequoia supercomputer enables Gordon Bell Prize-winning simulation on Earth's mantle ...

Unveiling the turbulent times of a dying star ...

Missing link found between turbulence in collapsing star and hypernova, gamma-ray burst ...

New insights into the creation of heavy elements ...

Manipal Hospitals adopts Watson for Oncology to help physicians identify options for individualized, evidence-based cancer care across India ...

TOP500

Liquid immersion cooling from Green Revolution Cooling helps Tokyo Institute of Technology achieve top honours at Green500 three years in a row ...

The Cloud

Univa announces support for Microsoft Azure ...

Hewlett Packard Enterprise and Microsoft announce plans to deliver integrated hybrid IT infrastructure ...

IBM advances hybrid capabilities to China and unveils Bluemix Local ...

TACC's HPC team works to increase technical diversity in supercomputing


Pictured is the TACC software tools team. L to R: Si Liu, Doug James, James (Jim) Browne, Carlos Rosales, Antonio Gomez, Robert McLay, and Todd Evans.
3 Dec 2015 Austin - Today, the landscape of high performance computing (HPC) is very different from that of 15 years ago. In addition to scientists who use advanced computing systems, there has been a recent influx of researchers, including students, from previously under-represented disciplines as varied as the humanities, economics, and social sciences, who are learning to take advantage of HPC for their own research needs.

Those engaged in computational research often reach the point where they outgrow their personal desktops or department computers: programmes run slowly or run out of memory, or researchers lack the sophisticated software needed to run simulations. They then turn to advanced computing resources like those at the Texas Advanced Computing Center (TACC). But those new to supercomputers must learn to make the sometimes difficult transition from desktop to supercomputer and begin sharing resources with hundreds or even thousands of other researchers.

Doug James, HPC research associate at TACC, compares this transition to his experience as a cyclist.

"When I started riding, I bought a Sears-Roebuck bike and it met my needs for a while, but I eventually got to a point where I knew I was outgrowing the bike. My skills, my needs, my fitness level, had reached the point where it was pretty obvious I needed a more sophisticated bike", he stated.

"The problem we're trying to solve is in this transition from desktop to supercomputer. Those who support computational research are working hard to put mechanisms in place to make it possible for thousands and thousands of people to share these resources and get their work done", Doug James continued.

To ease this transition, TACC's HPC research team act as detectives to understand the issues that users face and help them gain control of their computing environment. By developing software tools, they make the process of using supercomputing more efficient for users, consultants and administrators.

In 2009, Robert McLay, a TACC research associate and manager of HPC software tools, noticed a recurring issue that always landed on his desk. For researchers to successfully run calculations on supercomputers, they must use the appropriate package of applications, file libraries, and compilers, which is challenging without deep knowledge of these concepts. So Robert McLay developed Lmod, a software tool that takes into account the unique environment that each researcher requires. It helps researchers select the correct combination among thousands of possible options and prevents them from loading incompatible software.

Lmod, one of TACC's more established tools, has seen its use grow steadily over the years and is now used by supercomputing centers around the world. Lmod also spurred a new generation of tools to better address the particular needs of every user. The XALT tool is a collaborative effort between Mark Fahey, an HPC researcher at the University of Chicago, and Robert McLay. XALT helps consultants by generating detailed information on the software that researchers use and how successful they are in using that software.

"XALT is designed to be a census taker, so we know what programs get used and at what rate. We also know all the other kinds of applications we do and what libraries get used so we can better manage our system", Robert McLay stated.

XALT is complementary to TACC Stats, a tool that helps consultants respond to user questions about jobs, that is, runs of executable codes or programmes on supercomputers. "We look at individual jobs. I can see things like who ran it, when they ran it, what queues they ran it in, if it completed successfully, how many nodes, and I can see what hosts or what nodes it ran on", Todd Evans, a research associate on the HPC team, stated.

TACC Stats generates performance metrics for individual jobs and also helps the HPC team address user needs by analyzing general trends in jobs. The tool generates performance snapshots by sampling system statistics and hardware performance counter data at the beginning of a job, every 10 minutes while it runs, and at the end.
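The sampling schedule described above can be sketched in a few lines of Python. This is only an illustration of the timing pattern, not TACC Stats code: the function name `sample_times` and the choice to express times as whole seconds since job start are assumptions made for the example.

```python
def sample_times(job_duration_s, interval_s=600):
    """Return offsets (whole seconds since job start) at which a snapshot is taken.

    Hypothetical helper: one sample at the start, one every `interval_s`
    seconds while the job runs, and one at the end.
    """
    if job_duration_s < 0 or interval_s <= 0:
        raise ValueError("duration must be >= 0 and interval must be > 0")
    # Start sample plus the periodic samples taken before the job ends.
    times = list(range(0, job_duration_s, interval_s))
    if not times:
        # Zero-length job: range() is empty, but the start is still recorded.
        times.append(0)
    if times[-1] != job_duration_s:
        # Final sample at job end, unless it coincides with the last one.
        times.append(job_duration_s)
    return times


if __name__ == "__main__":
    # A 25-minute job sampled every 10 minutes.
    print(sample_times(25 * 60))
```

Under these assumptions, a 25-minute job yields four snapshots: at the start, at 10 and 20 minutes, and at the end.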

One tool that empowers users to better understand why a job might fail is the Sanitytool, developed by Robert McLay and Si Liu, a research associate on the HPC team. Error messages can be ambiguous and confusing. The Sanitytool is written in Python. By typing a simple command, a user can invoke the tool, which runs a series of customized tests to determine exactly what went wrong.

Si Liu cited one example where a TACC consultant worked with a user and spent several hours attempting to figure out why a job did not run correctly. With the Sanitytool, however, it took only a few minutes to diagnose the issue and resolve it.

Designed by James Browne, Professor Emeritus of computer science at the University of Texas at Austin, the PerfExpert tool makes performance optimization at the compute node level as simple as possible. The tool is designed to help researchers easily detect the main issues in their codes and to improve the performance of their programs without requiring computing expertise. PerfExpert automates instrumentation and profiling, analysis of bottlenecks, and recommendation of optimizations.

Remora is TACC's newest tool. It stemmed from a common request from users to understand how much memory a job is using, since jobs can crash if they run out of memory. The tool also allows consultants to visualize, through graphs, how memory usage evolves during the execution of a job.
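As a rough illustration of the kind of question such memory monitoring answers, the Python sketch below summarizes a series of memory readings for one job. It is not Remora code: the function `summarize_memory`, the sample format, and the readings themselves are all hypothetical and made up for the example.

```python
def summarize_memory(samples, node_memory_gb):
    """Summarize (elapsed_seconds, used_gb) samples for one job.

    Hypothetical helper: returns the peak usage, when it occurred, and the
    fraction of the node's memory that the peak represents.
    """
    if not samples:
        raise ValueError("need at least one sample")
    # Pick the sample with the largest memory reading.
    peak_time, peak_gb = max(samples, key=lambda s: s[1])
    return {
        "peak_gb": peak_gb,
        "peak_time_s": peak_time,
        "fraction_of_node": peak_gb / node_memory_gb,
    }


if __name__ == "__main__":
    # Made-up readings for illustration only.
    readings = [(0, 2.0), (60, 8.0), (120, 24.0), (180, 16.0)]
    print(summarize_memory(readings, node_memory_gb=32.0))
```

A peak close to the node's total memory would suggest that the job is at risk of crashing for the reason described above.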

"One user ran a job that was creating 20,000 requests per second in the file system so the application ran very slowly and could potentially create issues on the file system", Antonio Gomez, another research associate in the HPC group, stated. "Using Remora, we showed him the issue and he changed his programme so that he was only accessing files 600 times a second. The performance of the code improved by 25% to 30% thanks to the insights provided by this tool."

TACC's HPC team is dedicated to educating researchers and HPC professionals on the importance of software tools to broaden the scope of the community and lower the barrier to HPC access. Each of these performance monitoring tools is open source and available on TACC's GitHub page. Recently, the team released their paper, "Tales from the Trenches: Can User Support Tools Make a Difference", and presented their findings at the second annual workshop on HPC user support tools.

"Software tools benefit the humanities researchers, they benefit the people trying to use supercomputers who are coming from colleges that may not have done much computational research in the past", Doug James stated. "There's a sense in which our passion is helping the nontraditional and underrepresented disciplines, demographics, universities and colleges make this transition."

Source: University of Texas at Austin - Texas Advanced Computing Center - TACC
