As to the definition of a capable exascale computer, Paul Messina said that it can be described as a supercomputer that can solve science problems fifty times faster than today's 20 PF systems within a power envelope of 20-30 MW. The machine should be sufficiently resilient that user intervention due to hardware or system faults is needed only on the order of once a week.
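As a rough illustration of what these figures imply, the short Python sketch below works through the arithmetic. It assumes, purely for illustration, that the fifty-fold speed-up over a 20 PF system can be read as a raw floating-point rate; the definition itself is stated in terms of time to solve science problems, not peak flops. The variable and function names are hypothetical.

```python
# Back-of-envelope arithmetic only. Equating the 50x application speed-up
# with a raw floating-point rate is an assumption made for illustration;
# the quoted definition is about solving science problems faster.
PETA = 1e15
EXA = 1e18

baseline_flops = 20 * PETA      # "20 PF systems of today"
speedup = 50                    # "fifty times faster"
power_envelope_mw = (20, 30)    # "20-30 MW"

target_flops = baseline_flops * speedup


def gflops_per_watt(flops, megawatts):
    """Energy efficiency implied by a flop rate and a power draw in MW."""
    watts = megawatts * 1e6
    return flops / watts / 1e9


efficiencies = [gflops_per_watt(target_flops, mw) for mw in power_envelope_mw]

print(f"nominal target: {target_flops / EXA:.1f} EF")
for mw, eff in zip(power_envelope_mw, efficiencies):
    print(f"{mw} MW envelope -> {eff:.1f} GFLOPS/W")
```

Under that assumption the target comes to a nominal 1 EF, and the power envelope implies an efficiency of roughly 33 to 50 GFLOPS per watt.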
As to the DOE role in NSCI, Paul Messina explained that the Exascale Computing Project (ECP) was initiated this year as a DOE-SC/NNSA-ASC partnership using DOE's formal project management processes. The ECP is a ten-year project led by DOE laboratories and executed in collaboration with academia and industry. The ECP leadership team has staff from six US DOE labs, and staff from most of the 17 DOE national laboratories will take part in the project. The ECP will collaborate with the facilities that operate DOE's most powerful computers.
One goal is to develop a broad set of modelling and simulation applications that meet the requirements of the scientific, engineering, and nuclear security programmes of the DOE and the National Nuclear Security Administration (NNSA). Another major goal is to develop a productive exascale capability in the US by 2023, including the required software and hardware technologies.
Paul Messina stressed that integration and co-design are key. Capable exascale computing requires close coupling and coordination of key development and technology R&D areas. Application development, software technology, hardware technology, and exascale systems are the four components.
An ECP Laboratory Team has been established to achieve this broad exascale application impact, and 200 white papers have been submitted. Application development teams will be funded to carry out application development activities, each targeting capability and specific challenge problems and following software engineering practices. They will be tasked with providing software and hardware requirements and executing milestones jointly with the software activities. The intention is to establish co-design centres for commonly used methods, e.g. adaptive mesh refinement, particle-in-cell, and others. Developer training will also be provided.
Paul Messina also insisted on the central role of software. A conceptual ECP software stack will be developed that will involve correctness, visualization, data analysis, applications and co-design. Programming models will be developed in a development environment that will take into account runtime, math libraries and frameworks, and tools. The programme will also involve system software, resource management, threading, scheduling, monitoring and control. Important elements are the Node OS, low-level runtime, memory and Burst Buffer, data management, I/O, and the file system.
Hardware technology activities will take place under the PathForward initiative, which supports the DOE-vendor collaborative R&D activities required to develop exascale systems with at least two diverse architectural features, as Paul Messina quoted from the RFP.
The ECP will roll out in three phases. From 2016 to 2019, applications will be developed, R&D&D will be conducted on software technologies, and vendors will carry out R&D on node and system designs that are better suited for HPC applications. In 2019, the ECP insights will be used in formulating the RFP for exascale systems. From 2020 to 2023, implementation will take place.
Detailed plans have been developed and proposals have already been solicited, concluded Paul Messina.