This year, metadata development is one of our key priorities and we’re making a start with the release of version 5.4.0 of our input schema with some long-awaited changes. This is the first in what will be a series of metadata schema updates.
What is in this update?
Publication typing for citations
This is fairly simple; we’ve added a ‘type’ attribute to the citations members supply. This means you can identify a journal article citation as a journal article, but more importantly, you can identify a dataset, software, blog post, or other citation that may not have an identifier assigned to it. This makes it easier for the many thousands of metadata users to connect these citations to identifiers. We know many publishers, particularly journal publishers, do collect this information already, and we hope they will consider making this change to deposit citation types with their records.
Every year we release metadata for the full corpus of records registered with us, which can be downloaded for free in a single compressed file. This is one way in which we fulfil our mission to make metadata freely and widely available. By including the metadata of over 165 million research outputs from over 20,000 members worldwide and making them available in a standard format, we streamline access to metadata about scholarly objects such as journal articles, books, conference papers, preprints, research grants, standards, datasets, reports, blogs, and more.
Today, we’re delighted to let you know that Crossref members can now use ROR IDs to identify funders in any place where you currently use Funder IDs in your metadata. Funder IDs remain available, but this change allows publishers, service providers, and funders to streamline workflows and introduce efficiencies by using a single open identifier for both researcher affiliations and funding organizations.
As you probably know, the Research Organization Registry (ROR) is a global, community-led, carefully curated registry of open persistent identifiers for research organisations, including funding organisations. It’s a joint initiative led by the California Digital Library, DataCite, and Crossref, launched in 2019, that fulfills the long-standing need for an open organisation identifier.
We began our Global Equitable Membership (GEM) Program to provide greater membership equitability and accessibility to organizations in the world’s least economically advantaged countries. Eligibility for the program is based on a member’s country; our list of countries is predominantly based on the International Development Association (IDA). Eligible members pay no membership or content registration fees. The list is reviewed periodically, and countries may be added or removed over time as economic situations change.
Retractions and corrections from Retraction Watch are now available in Crossref’s REST API. Back in September 2023, we announced the acquisition of the Retraction Watch database with an ongoing shared service. Since then, they have sent us regular updates, which are publicly available as a CSV file. Our aim has always been to better integrate these retractions with our existing metadata, and today we’ve met that goal.
This is the first time we have supplemented our metadata with a third-party data source. Until now, our APIs have included metadata provided by Crossref members along with outputs from our internal enrichment workflows, such as matches for bibliographic references and funders. Third-party metadata has been gathered in Event Data, but this has been stored and delivered separately.
Knowing when work has been retracted is critical for assessing the integrity of research, and this enhancement of the data will be a great benefit to the community.
Where does the data come from?
Retraction Watch carefully curates retractions, pulling them from several non-Crossref sources, including PubMed and publisher websites. Each entry is manually checked and annotated before being added to the database. This high level of curation and broad coverage, together with our shared goal of making changes to metadata more visible, is what made a partnership between Crossref and Retraction Watch attractive.
“Our goal with the Retraction Watch Database has always been for it to be as useful to as many people as possible, and available from as many sources as possible,” says Ivan Oransky, co-founder of Retraction Watch and executive director of The Center For Scientific Integrity, its parent nonprofit organization. “Integration with Crossref’s REST API is a huge step in that direction.”
Where can I see the retractions?
If you use a service that collects Crossref metadata, you will start to see the Retraction Watch retractions as they are picked up. To access the data directly, you can find retractions from both Crossref members and Retraction Watch in our REST API, for example by requesting all retractions.
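A request for all retractions could be built along the following lines. This is a minimal Python sketch, not an official client: the `update-type:retraction` filter is an assumption about how retractions are selected on the works route and should be checked against the REST API documentation, and the function names are purely illustrative.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public works route of the Crossref REST API.
API_ROOT = "https://api.crossref.org/works"


def retraction_query_url(rows=20, mailto=None):
    """Build a URL asking the works route for records flagged as retractions."""
    params = {
        # Assumed filter name; confirm it against the REST API documentation.
        "filter": "update-type:retraction",
        "rows": rows,
    }
    if mailto:
        # Optional contact address, sent as a courtesy to the API operators.
        params["mailto"] = mailto
    # Keep ":" unescaped so the filter stays readable in the URL.
    return f"{API_ROOT}?{urlencode(params, safe=':')}"


def fetch_retractions(rows=20, mailto=None):
    """Fetch one page of results; assumes items sit under message -> items."""
    with urlopen(retraction_query_url(rows, mailto), timeout=30) as resp:
        return json.load(resp)["message"]["items"]
```

Calling `fetch_retractions(rows=5)` would return the first page of matching records. Walking through the full set would need paging, which this sketch leaves out.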
The source field states where the retraction came from. Currently, it can have two values: publisher or retraction-watch. Note that the same retraction may be included multiple times from different sources.
Retraction Watch retractions will remain available on GitLab in CSV format and be updated on working days. The record-id refers to the entry in the CSV file with further details, such as the reason for retraction.
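Because the same retraction can appear once per source, a tool that counts retractions may want to merge entries first. The sketch below shows one way to do that in Python. It assumes each entry is a dictionary carrying the retracted work's `DOI` alongside the `source` and `record-id` fields described above; the exact shape of the API response is not spelled out here, so treat the key names, the helper name `merge_by_doi`, and the sample DOIs as hypothetical.

```python
from collections import defaultdict


def merge_by_doi(updates):
    """Collapse entries that point at the same DOI, keeping every source seen."""
    merged = defaultdict(lambda: {"sources": set(), "record_ids": []})
    for entry in updates:
        doi = entry["DOI"].lower()  # DOIs are case-insensitive
        merged[doi]["sources"].add(entry.get("source", "unknown"))
        # Only some entries carry a record-id; keep it for lookup in the CSV.
        if "record-id" in entry:
            merged[doi]["record_ids"].append(entry["record-id"])
    return dict(merged)


# Made-up sample entries, for illustration only.
sample = [
    {"DOI": "10.5555/example.1", "source": "publisher"},
    {"DOI": "10.5555/EXAMPLE.1", "source": "retraction-watch", "record-id": 101},
    {"DOI": "10.5555/example.2", "source": "retraction-watch", "record-id": 102},
]

for doi, info in merge_by_doi(sample).items():
    print(doi, sorted(info["sources"]), info["record_ids"])
```

Keeping the set of sources, rather than discarding duplicates outright, preserves the information about who reported each retraction.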
Like the rest of our metadata, the retractions are freely available. If you use or operate a tool that ingests retractions, the new entries will start to be picked up immediately. The Retraction Watch database includes a larger number of retractions than the Crossref database, so you should see an increase in the total.
We have heard from organisations that would like to build new research integrity tools based on this data. We look forward to seeing the benefits brought by wider availability of the Retraction Watch retractions, and how they can provide better context to research outputs.
While Crossref metadata is freely available to reuse without a license, if you make use of the Retraction Watch retraction metadata in a published work, we kindly request that you provide a citation to the source.
If you have questions or comments, please head over to the section of our forum dedicated to integrity of the scholarly record.