At the end of last year, we were excited to announce our renewed commitment to community and the launch of three cross-functional programs to guide and accelerate our work. We introduced this new approach to improve cross-team alignment, share responsibility, strengthen communication and learning, and make more progress on the things members need.
This year, metadata development is one of our key priorities and we’re making a start with the release of version 5.4.0 of our input schema with some long-awaited changes. This is the first in what will be a series of metadata schema updates.
What is in this update?
Publication typing for citations
This is fairly simple: we’ve added a ‘type’ attribute to the citations members supply. This means you can identify a journal article citation as a journal article but, more importantly, you can also identify a dataset, software, blog post, or other citation that may not have an identifier assigned to it. This makes it easier for the many thousands of metadata users to connect these citations to identifiers. We know that many publishers, particularly journal publishers, already collect this information, and we hope they will consider depositing citation types with their records.
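As a rough illustration of what a typed citation might look like when generated programmatically, here is a minimal Python sketch. The helper name build_citation is hypothetical, and the element names, the 'dataset' type value, and the sample DOI are assumptions made for illustration; the 5.4.0 schema documentation is the authority on the permitted element order and type values.

```python
import xml.etree.ElementTree as ET


def build_citation(key, citation_type, text, doi=None):
    """Build a <citation> element that carries a 'type' attribute.

    Hypothetical helper: element names and allowed type values are
    assumptions and should be checked against the 5.4.0 schema.
    """
    citation = ET.Element("citation", {"key": key, "type": citation_type})
    if doi:
        # Only add an identifier when the cited item actually has one.
        ET.SubElement(citation, "doi").text = doi
    ET.SubElement(citation, "unstructured_citation").text = text
    return citation


citation_list = ET.Element("citation_list")
# A dataset citation with no identifier: the type still tells
# metadata users what kind of object is being cited.
citation_list.append(
    build_citation(
        "ref1",
        "dataset",  # assumed type value, for illustration only
        "Example Author (2024). Example dataset. Example Repository.",
    )
)
print(ET.tostring(citation_list, encoding="unicode"))
```

The point of the sketch is that the type travels with the citation even when no DOI is present, which is the case the attribute is most useful for.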
Every year we release metadata for the full corpus of records registered with us, which can be downloaded for free in a single compressed file. This is one way in which we fulfil our mission to make metadata freely and widely available. By including the metadata of over 165 million research outputs from over 20,000 members worldwide and making them available in a standard format, we streamline access to metadata about scholarly objects such as journal articles, books, conference papers, preprints, research grants, standards, datasets, reports, blogs, and more.
Today, we’re delighted to let you know that Crossref members can now use ROR IDs to identify funders in any place where you currently use Funder IDs in your metadata. Funder IDs remain available, but this change allows publishers, service providers, and funders to streamline workflows and introduce efficiencies by using a single open identifier for both researcher affiliations and funding organizations.
As you probably know, the Research Organization Registry (ROR) is a global, community-led, carefully curated registry of open persistent identifiers for research organisations, including funding organisations. Launched in 2019, it is a joint initiative led by the California Digital Library, DataCite, and Crossref that fulfils the long-standing need for an open organisation identifier.
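For workflows that hold both kinds of identifier, the choice between them could be sketched roughly as follows in Python. The function names are hypothetical, and the regular expression is an assumption based on ROR's published description of its identifiers (a leading 0, six base32 characters, and two checksum digits). It checks only the shape of the string; it does not verify the checksum or confirm that the record exists in the registry.

```python
import re

# Assumed shape of a ROR ID URL: "https://ror.org/" + "0" + six
# Crockford base32 characters (no i, l, o, u) + two checksum digits.
ROR_PATTERN = re.compile(r"^https://ror\.org/0[0-9a-hjkmnp-tv-z]{6}[0-9]{2}$")


def looks_like_ror_id(value):
    """Return True if the string has the assumed shape of a ROR ID URL."""
    return bool(ROR_PATTERN.match(value.strip().lower()))


def pick_funder_identifier(ror_id=None, funder_id=None):
    """Prefer a well-formed ROR ID; otherwise fall back to the Funder ID.

    Hypothetical helper: returns a (scheme, value) tuple, or None when
    neither identifier is usable.
    """
    if ror_id and looks_like_ror_id(ror_id):
        return ("ror", ror_id.strip().lower())
    if funder_id:
        return ("funder_id", funder_id)
    return None


# Sample strings used only to exercise the shape check.
print(pick_funder_identifier(ror_id="https://ror.org/02twcfp32"))
print(pick_funder_identifier(funder_id="https://doi.org/10.13039/100000001"))
```

Because Funder IDs remain available, the fallback branch keeps existing records usable while new records move to the single identifier.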
STM, DataCite, and Crossref are pleased to announce an updated joint statement on research data.
In 2012, DataCite and STM drafted an initial joint statement on the linkability and citability of research data. There has been significant progress since then: nearly 10 million data citations have been tracked, thousands of repositories have adopted data citation best practices, thousands of journals have adopted data policies and data availability statements and established persistent links between articles and datasets, and an increasing number of funders have introduced data policies. It now seems appropriate to focus on providing updated recommendations for the various stakeholders involved in research data sharing.
The premise of the original joint statement still stands: most stakeholders across the spectrum of researchers, funders, librarians and publishers agree about the benefits of making research data available and findable for reuse by others. Doing so improves the utility and rigor of the scholarly record. Still, research data sharing is not yet a self-evident step in the research lifecycle. We now have sufficient scholarly communication infrastructure in place to bring about widespread change, and we believe momentum is building for collective action.
It is in this context that DataCite, a global membership community working with over 2,800 repositories around the world, and STM, whose membership consists of over 140 scientific, technical, and medical publishing organizations, are issuing this joint statement. Crossref, a nonprofit open infrastructure with over 18,000 institutional members from 150 countries, joins this call, recognising the need for an amplified focus on data citation. The aim of this statement is to accelerate the adoption of best practices and policies, and to encourage further development of critical policies in collaboration with a wide group of stakeholders.
Signatories of this statement recommend the following as best practice in research data sharing:
When publishing their results, researchers deposit related research data and outputs in a trustworthy data repository that assigns persistent identifiers (DOIs where available). Researchers link to research data using persistent identifiers.
When using research data created by others, researchers provide attribution by citing the datasets in the reference section using persistent identifiers.
Data repositories enable sharing of research outputs in a FAIR way, including support for metadata quality and completeness.
Publishers set appropriate journal data policies, describing the way in which data is to be shared alongside the published article.
Publishers set instructions for authors to include Data Citations with persistent identifiers in the references section of articles.
Publishers include Data Citations and links to data in Data Availability Statements with persistent identifiers (DOIs where available) in the article metadata registered with Crossref.
In addition to Data Citations, Data Availability Statements (human- and machine-readable) are included in published articles where appropriate.
Repositories and publishers connect articles and datasets through persistent identifier connections in the metadata and reference lists.
Funders and research organizations provide researchers with guidance on open science practices, track compliance with open science policies where possible, and promote and incentivize researchers to openly share, cite and link research data.
Funders, policymaking institutions, publishers and research organizations collaborate towards aligning FAIR research data policies and guidelines.
All stakeholders collaborate in the development of tools, processes, and incentives throughout the research cycle to enable sharing of high-quality research data, making all steps in the process clear, easy and efficient for researchers by providing support and guidance.
Stakeholders responsible for research assessment take into account data sharing and data citation in their reward and recognition system structures.
We, the following signatories, shall adopt and promote the relevant best practices laid out above. We hope that our action inspires the community, including researchers, research funders, research institutions, data repositories and publishers, to join us in making it easy for researchers to share, link and cite research data.