Effective Data Practices: new recommendations to support an open research ecosystem

The Association of Research Libraries (ARL) is pleased to announce the release of a new report written with our partners at the California Digital Library (CDL), the Association of American Universities (AAU), and the Association of Public and Land-grant Universities (APLU): Implementing Effective Data Practices: Stakeholder Recommendations for Collaborative Research Support.

The report brings together information and insights shared during a December 2019 National Science Foundation (NSF)-sponsored invitational conference on implementing effective data practices. In this report, experts from library, research, and scientific communities provide key recommendations for effective data practices to support a more open research ecosystem.

During the December conference, the project team developed a set of recommendations for the broad adoption and implementation of NSF’s recommended data practices as described in the NSF’s May 2019 Dear Colleague Letter. The report focuses on recommendations for research institutions and also provides guidance for publishers, tool builders, and professional associations. The AAU-APLU Institutional Guide to Accelerating Public Access to Research Data, forthcoming in spring 2021, will include the recommendations.

The conference focused on designing guidelines for two data practices recommended by NSF: (1) using persistent identifiers (PIDs) for datasets, and (2) creating machine-readable data management plans (DMPs).

Five key takeaways from the report are:

  • Center the researcher by providing tools, education, and services that are built around data management practices that accommodate the scholarly workflow.
  • Create closer integration of library and scientific communities, including researchers, institutional offices of research, research computing, and disciplinary repositories.
  • Provide sustaining support for the open PID infrastructure, which is a core community asset and an essential piece of scholarly infrastructure. Beyond adoption and use of PIDs, organizations that sustain identifier registries need the support of the research community.
  • Unbundle the DMP, because the DMP as currently understood may be overloaded with too many expectations (for example, simultaneously a tool within the lab, among campus resource units, and with repositories and funding agencies). Unbundling may allow for different parts of a DMP to serve distinct and specific purposes.
  • Unlock discovery by connecting PIDs across repositories to assemble diverse data to answer new questions, advance scholarship, and accelerate adoption by researchers.

The report also identifies five core PIDs that are foundational to an open data ecosystem. Using these PIDs will ensure that basic metadata about research is standardized, networked, and discoverable in scholarly infrastructure:

  • Digital object identifiers (DOIs) from DataCite to identify research data, as well as from Crossref to identify publications
  • Open Researcher and Contributor (ORCID) IDs to identify researchers
  • Research Organization Registry (ROR) IDs to identify research organization affiliations
  • Crossref Funder Registry IDs to identify research funders
  • Crossref Grant IDs to identify grants and other types of research awards
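
To make the idea of standardized, machine-checkable identifiers concrete, here is a minimal Python sketch. It is an illustration only and does not come from the report. It validates an ORCID iD using the ISO 7064 MOD 11-2 check character that ORCID documents for its identifiers, and applies a deliberately loose shape test to a DOI. The function names are hypothetical, the DOI `10.1234/example-dataset` is a made-up placeholder, and neither check confirms that an identifier is actually registered.

```python
import re


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the 0000-0000-0000-000X shape and the final check character."""
    if not re.fullmatch(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]", orcid):
        return False
    compact = orcid.replace("-", "")
    return orcid_check_digit(compact[:15]) == compact[15]


def looks_like_doi(value: str) -> bool:
    """Loose shape check only (an assumption, not a registry lookup):
    '10.', a numeric registrant code, '/', then a non-empty suffix."""
    return re.fullmatch(r"10\.\d{4,9}/\S+", value) is not None


if __name__ == "__main__":
    # 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation.
    print(is_valid_orcid("0000-0002-1825-0097"))
    # Same iD with the last character altered, so the checksum no longer matches.
    print(is_valid_orcid("0000-0002-1825-0098"))
    # Placeholder DOI, not a real dataset.
    print(looks_like_doi("10.1234/example-dataset"))
```

A check like this could run when a researcher enters identifiers into a DMP tool or repository form, catching typing errors before the metadata is shared. Confirming that an identifier exists would still require a lookup against the relevant registry.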

The report is intended to encourage collaboration and conversation among a wide range of stakeholder groups in the research enterprise by showcasing how collaborative processes help with implementing PIDs and machine-actionable DMPs (maDMPs) in ways that can advance public access to research.

The full report is now available online.

This material is based upon work supported by the National Science Foundation under Grant Number 1945938. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Project team:
John Chodacki, California Digital Library
Cynthia Hudson-Vitale, Pennsylvania State University
Natalie Meyers, University of Notre Dame
Jennifer Muilenburg, University of Washington
Maria Praetzellis, California Digital Library
Kacy Redd, Association of Public and Land-grant Universities
Judy Ruttenberg, Association of Research Libraries
Katie Steen, Association of American Universities

Additional report and conference contributors:
Joel Cutcher-Gershenfeld, Brandeis University
Maria Gould, California Digital Library