In the coming years, the European Commission will invest several hundred million euros in setting up the European Open Science Cloud (EOSC), a Cloud infrastructure for the easy exchange of scientific data across disciplines and countries. This will enhance European cooperation in science and provide about 1.7 million scientists in Europe with better conditions and IT services for transforming data into knowledge. More than 75 research partners are cooperating for this purpose.
KIT's Steinbuch Centre for Computing (SCC) has long-standing experience in managing big scientific data, gained from operating GridKa for the world's largest particle accelerator, the Large Hadron Collider (LHC) at CERN in Geneva, and from coordinating the Helmholtz Data Federation (HDF). Within the HDF, research data of the Helmholtz Association are already being stored in a way similar to what the EOSC is planned to do for all of Europe. "As a reliable partner, we will contribute this experience to the set-up of the EOSC and to the EU projects EOSC-hub and EOSCpilot", Professor Achim Streit, Director of the SCC, stated. In particular, KIT's work will focus on security aspects, such as authentication and authorization in the service infrastructure of the EOSC. "In a federated research Cloud for all of Europe, i.e. a Cloud that will bring together many different, already existing service infrastructures and their users, it must be guaranteed that access to services and data is given only to those persons and institutions that are supposed to have it", Achim Streit stated.
It is these different existing infrastructures that make the envisaged uniform solution a challenge. The various science disciplines have different cultures, which have to be brought together. To guarantee searchability of data, for instance, the metadata of the stored datasets have to be available in standardized form in all storage systems (repositories).
To support the central data storage and exchange services, an infrastructure will be developed with solutions for transferring files containing Big Data volumes or for connecting to supercomputers for direct data analysis. In this area, KIT is responsible for a work package on IT service management. "We have certified experts and more than 15 years of expertise in the development, setup, and operation of federated IT infrastructures and services. We are happy and very proud that we were asked to coordinate this important work package in the EOSC-hub project", Achim Streit pointed out. This package will also include the establishment of an EOSC-wide help desk and ticketing system based on Global Grid User Support (GGUS) as a central point to which user inquiries can be addressed, similar to what KIT has been offering for more than a decade for worldwide LHC computing.
Presently, KIT is contributing its expertise in Big Data management to several infrastructure projects: The Smart Data Innovation Lab (SDIL) offers a research platform with state-of-the-art analysis functions for companies throughout Germany. The Smart Data Solution Center Baden-Württemberg (SDSC) supports small and medium-sized enterprises based in Baden-Württemberg in accessing smart data technologies. The GridKa data centre is part of the worldwide distributed network for the European particle accelerator centre CERN. With the Large-Scale Data Facility (LSDF) for science in Baden-Württemberg and the Large-Scale Data Management and Analysis (LSDMA) initiative of the Helmholtz Association, KIT has already established a basis for its role as coordinator of the Helmholtz Data Federation. In addition, KIT informatics institutes study data-intensive computing, algorithm engineering for big data, and data security.