Primeur weekly 2014-02-24

Special

H2020: the long road to an integrated open and accessible European e-Infrastructure ...

PRACE, HPC applications and technological development: three ingredients for a top European strategy ...

Yannick Legré is the new director of EGI.eu ...

The Cloud

Red Hat Enterprise Linux OpenStack Platform leveraged by Alcatel-Lucent's CloudBand as part of its Network Functions Virtualization (NFV) Platform ...

AT&T and IBM join forces to deliver new innovations for the Internet of Things ...

Mellanox introduces CloudX Platform to enable companies to build the most efficient public, private and hybrid Clouds ...

EuroFlash

Powerful supercomputer to offer a glimpse of the early universe ...

From a distance: New technique for repair work ...

ECRIN-ERIC to host inauguration ceremony ...

Karlsruhe Institute of Technology to develop ultra-small and ultra-fast electro-optic modulator ...

SURFsara to host Data & Computing Infrastructure Event on 12-13 March 2014 ...

USFlash

SDSC team develops multi-scale simulation software for chemistry research ...

SDSC/UC San Diego researchers home in on Alzheimer's disease ...

Intel advances next phase of Big Data intelligence: real time analytics ...

Supercomputer dramatically accelerates rapid genome analysis ...

Using computers to speed up drug discovery ...

Better cache management could improve chip performance and cut energy use ...

A step closer to a photonic future ...

HP delivers record-breaking performance and dramatic efficiencies with HP ProLiant servers ...

Researchers propose a better way to make sense of 'Big Data' ...

Mega-bucks from Russia seed development of 'Big Data' tools ...

A new laser for a faster Internet ...

C-DAC to organize Accelerating Biology 2014: Computing Life ...

NetApp introduces unified scale-out storage systems and virtualization software for the unbound Cloud era ...

Supermicro shipping 96 DIMM 4U 4-Way SuperServer featuring new Intel Xeon processor E7-8800/4800 v2 ...

Mega-bucks from Russia seed development of 'Big Data' tools


Brookhaven National Lab
20 Feb 2014 Upton - The Russian Ministry of Education and Science has awarded a $3.4 million "mega-grant" to Alexei Klimentov, Physics Applications Software Group Leader at the U.S. Department of Energy's Brookhaven National Laboratory, to develop new "Big Data" computing tools for the advancement of science. The project builds on the success of a workload and data management system built by Alexei Klimentov and collaborators to process huge volumes of data from the ATLAS experiment at Europe's Large Hadron Collider (LHC), where the famed Higgs boson - the source of mass for fundamental particles - was discovered. Brookhaven is the lead U.S. laboratory for the ATLAS experiment, and hosts the Tier 1 computing centre for data processing, storage and archiving.

"The increasing capabilities to collect, process, analyze, and extract knowledge from large datasets are pushing the boundaries of many areas of modern science and technology", Alexei Klimentov stated. "This grant recognizes how the computing tools we developed to explore the mysteries of fundamental particles like the Higgs boson can find widespread application in many other fields in and beyond physics. For example, research in nuclear physics, astrophysics, molecular biology, and sociology generates extremely large volumes of data that needs to be accessed by collaborators around the world. Sophisticated computing software can greatly enhance progress in these fields by managing the distribution and processing of such data."

The project will be carried out at Russia's National Research Center Kurchatov Institute (NRC-KI) in Moscow, the lead Russian organization involved in research at the LHC, in collaboration with scientists from ATLAS, other LHC experiments, and other data-intensive research projects in Europe and the U.S. It will make use of computational infrastructure provided by NRC-KI to develop, code, and implement software for a novel "Big Data" management system that has no current analog in science or industry.

Though nothing of this scope currently exists, the new tools will be complementary to a system developed by Brookhaven physicist Torre Wenaus and University of Texas at Arlington physicist Kaushik De for processing ATLAS data. That system, called PanDA (for Production and Distributed Analysis), is used by thousands of physicists around the world in the LHC's ATLAS collaboration.

PanDA links the computing hardware associated with ATLAS - located at 130 computing centres around the world that manage more than 140 petabytes, or 140 million gigabytes, of data - allowing scientists to efficiently analyze the tens of millions of particle collisions taking place at the LHC each day the collider is running. "That data volume is comparable to Google's entire archive on the World Wide Web", Alexei Klimentov stated.

In September 2012, the U.S. Department of Energy's Office of Science awarded $1.7 million to Alexei Klimentov/Brookhaven and UT Arlington to develop a version to expand access to scientists in fields beyond high-energy physics and the Worldwide LHC Computing Grid.

For the DOE-sponsored effort, Alexei Klimentov's team is working together with physicists from Argonne National Laboratory, UT Arlington, University of Tennessee at Knoxville, and the Oak Ridge Leadership Computing Facility (OLCF) at Oak Ridge National Laboratory. Part of the team has already set up and tailored PanDA software at the OLCF, pioneering a connection of OLCF supercomputers to ATLAS and the LHC Grid facilities.

"We are now exploring how PanDA might be used for managing computing jobs that run on OLCF's Titan supercomputer to make highly efficient use of Titan's enormous capacity. Using PanDA's ability to intelligently adapt jobs to the resources available could conceivably 'generate' 300 million hours of supercomputing time for PanDA users in 2014 and 2015", he stated.

The main idea is to reuse, as much as possible, existing components of the PanDA system that are already deployed on the LHC Grid for analysis of physics data. "Clearly, the architecture of a specific computing platform will affect incorporation into the system, but it is beneficial to preserve most of the current system components and logic", Alexei Klimentov stated. In one specific instance, PanDA was installed on Amazon's Elastic Compute Cloud, a web-based computing service, and is being used to give Large Synoptic Survey Telescope scientists access to computing resources at Brookhaven Lab and later across the country.
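One key reusable component is PanDA's pilot-based, late-binding design: a lightweight pilot process starts on whatever resource becomes available - a Grid site, a cloud instance such as EC2, or an HPC node - and only then pulls a payload matched to that resource from the central server. The sketch below shows the pattern in miniature; the job fields and the canned fetch_job() are hypothetical stand-ins, not the real PanDA protocol:

    # A minimal sketch of the pilot-job pattern: a small process lands on a
    # worker, pulls a matched payload from the central server, runs it, and
    # reports back. Field names and the canned job are hypothetical.
    import json
    import subprocess

    def fetch_job() -> dict | None:
        # A real pilot would make an HTTPS request to the PanDA server here;
        # a canned job keeps the sketch self-contained and runnable.
        return {"jobID": 42, "command": ["echo", "simulating one ATLAS event"]}

    def report_status(job_id: int, returncode: int) -> None:
        # A real pilot would POST this status back to the server.
        print(json.dumps({"jobID": job_id, "status": returncode}))

    def run_pilot() -> None:
        job = fetch_job()
        if job is None:
            return  # no work matched to this resource; the pilot simply exits
        # Late binding: the payload is chosen only after the resource is known.
        result = subprocess.run(job["command"])
        report_status(job["jobID"], result.returncode)

    if __name__ == "__main__":
        run_pilot()

Because the payload is bound to the resource only at the last moment, the same core server components can serve Grid sites, commercial clouds, and supercomputers alike - the portability the paragraph above describes.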

The new system being developed for LHC and nuclear physics experiments, called megaPanDA, will be complementary to PanDA. While PanDA handles data processing, the new system will add support for large-scale data handling.
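In that division of labour, the workload layer decides where and when a job runs, while the data layer resolves where the input lives and moves it into place first. The following minimal sketch illustrates the idea with a toy replica catalogue and a stand-in transfer; none of these names are actual megaPanDA interfaces:

    # A minimal sketch of a data layer complementing a workload manager:
    # resolve which site holds a replica of the input dataset, stage it to
    # the worker, then hand off to processing. All names are hypothetical.
    from pathlib import Path

    # Toy replica catalogue: dataset name -> sites holding a copy.
    REPLICA_CATALOGUE = {"atlas.data.2014": ["BNL", "CERN", "NRC-KI"]}

    def nearest_replica(dataset: str, site: str) -> str:
        """Prefer a replica already at the execution site, else the first copy."""
        replicas = REPLICA_CATALOGUE[dataset]
        return site if site in replicas else replicas[0]

    def stage_in(dataset: str, source_site: str, workdir: Path) -> Path:
        """Copy the input to the worker; a real system would use a Grid transfer tool."""
        workdir.mkdir(exist_ok=True)
        local = workdir / f"{dataset}.root"
        local.write_text(f"data staged from {source_site}\n")  # stand-in transfer
        return local

    def run_job(dataset: str, site: str) -> None:
        source = nearest_replica(dataset, site)      # data layer: where is the input?
        local = stage_in(dataset, source, Path("scratch"))
        print(f"processing {local} at {site} (staged from {source})")  # workload layer

    if __name__ == "__main__":
        run_job("atlas.data.2014", "NRC-KI")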

"The challenges posed by 'Big Data' are not limited by the size of the scientific data sets", Alexei Klimentov stated. "Data storage and data management certainly pose serious technical and logistical problems. Arguably, data access poses an equal challenge. Requirements for rapid, near real-time data processing and rapid analysis cycles at globally distributed, heterogeneous data centres place a premium on the efficient use of available computational resources. Our new workload and data management system, mega-PanDA, will efficiently handle both the distribution and processing of data to help address these challenges."

PanDA was created with funding from the Department of Energy's Office of Science and the National Science Foundation.
Source: DOE/Brookhaven National Laboratory
