IRIS will link the IRIS UMETRICS dataset, which contains transaction-level administrative data on sponsored research projects from dozens of the USA's leading higher education institutions, to data from XSEDE allocations. The linked data will offer a new way to examine how access to supercomputers influences the way researchers collaborate with colleagues and the productivity of individuals and research teams.
According to a preliminary analysis, more than 1,100 XSEDE allocation awards can be matched to principal investigators on federal grants in the UMETRICS dataset. More than 6,300 people are employed on those projects; they are broadly representative of the wide range of fields of study undertaken by projects using XSEDE.
Once linkages between the datasets are established, a comparative analysis will be done of federally funded research teams that do and do not use XSEDE resources and services. This analysis will examine the effects of XSEDE allocations on the scale and composition of teams, structure of collaboration networks, profile of topics and funders, and research productivity and early impact.
These findings will not only provide a new understanding of the impacts of XSEDE but also inform the evaluation and improvement of cyberinfrastructure programmes nationwide. The resulting data, code and measures will also be included in future IRIS UMETRICS data releases starting this year.
"This project provides a new method to understand, explain and improve large-scale investments such as cyberinfrastructure that enable research across a wide range of academic domains", stated IRIS Executive Director and University of Michigan Professor of Sociology Jason Owen-Smith.
"This study is part of an ongoing effort by XSEDE to find means to articulate the value brought to the community and the impact on the advancement of science across the country through investments such as that made by NSF in XSEDE", stated John Towns, principal investigator of the XSEDE project and Executive Associate Director of Engagement at the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign.
The 2019 release of the IRIS UMETRICS dataset includes transaction-level information on nearly 400,000 sponsored projects that represent more than $83 billion in direct cost expenditures and employ more than 643,000 people at 31 universities. These data are linked to information on scientific outcomes at multiple levels of analysis. Linkages to federal employment and earnings data are available through the Federal Statistical Research Data Center (FSRDC) system administered by the U.S. Census Bureau. Researchers interested in accessing the data can find more information at the IRIS website. Data are deidentified before being made available for researcher access via the IRIS virtual data enclave.