If you manage a publishing system or workflow, you know how crucial—and how challenging!—it is to have clean, consistent, and comprehensive affiliation metadata. Author affiliations, and the ability to link them to publications and other scholarly outputs, are vital for numerous stakeholders across the research landscape. Institutions need to monitor and measure their research output by the articles their researchers have published. Funders need to be able to discover and track the research and researchers they have supported. Academic librarians need to easily find all of the publications associated with their campus. Journals need to know where authors are affiliated so they can determine eligibility for institutionally sponsored publishing agreements.
Until recently, an open, unambiguous, and persistent identifier for research organization affiliations has been a missing layer of the scholarly ecosystem. DOIs could identify articles and datasets and other research outputs, and ORCID IDs could identify researchers, but no equivalent solution was available to identify institutions. With the launch of the Research Organization Registry (ROR) in 2019 (which Crossref has helped to develop), the landscape is changing. ROR IDs are an opportunity to make affiliation details easier for publishers to use and easier for those who rely on this data.
Affiliations are a key piece of metadata that has been missing from Crossref, but they will soon be supported in the Crossref metadata schema. This means that content registered with Crossref can be associated with ROR IDs to enable better tracking and discovery of research and other publication outputs by institution.
What is ROR?
ROR is the Research Organization Registry: open, noncommercial, community-led infrastructure for research organization identifiers. The registry currently includes globally unique persistent identifiers and associated metadata for more than 98,000 research organizations (as of August 2020).
ROR IDs are specifically designed to be implemented in any system that captures institutional affiliations and to enable connections (via persistent identifiers and networked research infrastructure) between research organizations, research outputs, and researchers.
ROR IDs are interoperable with those in other identifier registries, including GRID (which provided the seed data that ROR launched with), Crossref Funder Registry, ISNI, and Wikidata. ROR data is available under a CC0 waiver and can be accessed via a public API and data dump.
ROR is not the first organization identifier to exist. But ROR is distinct because it is completely open, specifically focused on identifying affiliations, and collaboratively developed by, with, and for key stakeholders in scholarly communications. ROR is operated as a joint initiative by Crossref, DataCite, and California Digital Library, and was launched with seed data from GRID in collaboration with Digital Science. These organizations have invested resources into building an open registry of research organization identifiers that can be embedded in scholarly infrastructure to effectively link research to organizations.
Why care about ROR IDs in Crossref metadata?
Ed Pentz, Crossref’s Executive Director, explains the key role ROR can play in enriching Crossref metadata:
“Over the years Crossref has expanded the metadata it collects (for example, ORCID IDs and license URLs) based on the changing needs of our members and the scholarly research community. A key type of metadata that is missing from Crossref is affiliations. We’ve had a lot of feedback from members that adding affiliations should be a priority. At Crossref LIVE19 in Amsterdam, ROR was ranked joint first place for Crossref by the 100 plus attendees at the meeting. For the last few years we’ve been diligently working on the initiative and are very happy that ROR is now coming to fruition.”
Crossref metadata does include some affiliations already. But this data is not comprehensive or consistent, and appears as free-text strings only (even if originally sourced from a list of institutions). A search for UC Berkeley, for instance, returns multiple variants of the university’s name:
University of California, Berkeley
University of California-Berkeley
University of California Berkeley
UC Berkeley
And likely more…
While it isn’t too difficult for a human to guess that “UC Berkeley,” “University of California, Berkeley,” and “University of California at Berkeley” are all referring to the same university, a machine interpreting this information wouldn’t necessarily make the same connections. If you are trying to easily find all of the publications associated with UC Berkeley, you would need to run and reconcile multiple searches at best, or miss data completely at worst. This is where an affiliation identifier comes in: a single, unambiguous, standardized identifier that will always stay the same (for UC Berkeley, that would be https://ror.org/01an7q238).
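To make the reconciliation problem concrete, here is a minimal Python sketch of what an affiliation identifier buys you. The alias table is hypothetical and hand-built for illustration only; a real system would resolve names against the ROR registry rather than a hard-coded dictionary. The function names (`normalize`, `resolve`, `group_by_ror`) are likewise made up for this example.

```python
import re
from collections import defaultdict

# The ROR ID for UC Berkeley, as given in the text above.
UC_BERKELEY = "https://ror.org/01an7q238"

# Hypothetical alias table, keyed by normalized name (illustration only).
ALIASES = {
    "university of california berkeley": UC_BERKELEY,
    "university of california at berkeley": UC_BERKELEY,
    "uc berkeley": UC_BERKELEY,
}


def normalize(name):
    """Lowercase the name and turn each run of punctuation or spaces into one space."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


def resolve(name):
    """Return the ROR ID for a free-text affiliation, or None if it is unknown."""
    return ALIASES.get(normalize(name))


def group_by_ror(records):
    """Group (title, affiliation string) pairs by the ROR ID they resolve to."""
    grouped = defaultdict(list)
    for title, affiliation in records:
        grouped[resolve(affiliation)].append(title)
    return dict(grouped)
```

Once every variant resolves to the same identifier, finding all outputs for one institution becomes a single lookup on the ROR ID instead of several text searches that have to be merged by hand.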
ROR IDs for affiliations can transform the usability of Crossref metadata. While it’s crucial to have IDs for affiliations, it’s equally important that the affiliation data can be easily used. The ROR dataset is CC0, so ROR IDs and associated affiliation data can be freely and openly used and reused without any restrictions.
What does this mean for publishers?
As the Crossref schema update is being cleared for takeoff, this is a good time for publishers and publishing service providers to be thinking about adopting ROR.
ROR IDs can be useful in publishing workflows in a variety of ways. They can easily be built into manuscript tracking systems to identify the affiliations of submitting authors and co-authors. This can be done via a simple institution lookup that connects to the ROR API. Authors choose their affiliation from a dropdown list populated from ROR; they do not have to provide a ROR ID or even know that a ROR ID is being collected.
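The lookup described above could be sketched roughly as follows in Python. This is an illustration under stated assumptions, not ROR's reference client: the endpoint `https://api.ror.org/organizations`, the `query` parameter, and the `items` / `id` / `name` response fields are assumptions that should be checked against the current ROR API documentation, and the helper names are invented for this example. The sample payload is a hand-written stand-in for an API response.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed search endpoint; verify against the current ROR API documentation.
ROR_API = "https://api.ror.org/organizations"


def build_lookup_url(typed_text):
    """Build the search URL for whatever the author has typed so far."""
    return f"{ROR_API}?{urlencode({'query': typed_text.strip()})}"


def to_dropdown_options(payload, limit=5):
    """Turn an (assumed) response payload into label/value pairs for a dropdown."""
    options = []
    for item in payload.get("items", [])[:limit]:
        # The author sees the label; the system stores the ROR ID as the value.
        options.append({"label": item["name"], "value": item["id"]})
    return options


def fetch_options(typed_text):
    """Query the API over the network and return dropdown options."""
    with urlopen(build_lookup_url(typed_text)) as response:
        return to_dropdown_options(json.load(response))


# Hand-written sample payload so the parsing step can be tried offline.
SAMPLE_PAYLOAD = {
    "items": [
        {"id": "https://ror.org/01an7q238", "name": "University of California, Berkeley"},
    ]
}
```

The design point is that the author only ever sees the institution name; the ROR ID travels with the selection as the stored value and can be passed along when the article is published.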
Upon publication, ROR affiliation data can be included when content is registered with Crossref. ROR IDs are also supported in the JATS XML format that many publishers use. Crossref metadata can be searched and crawled, and the Crossref API will make ROR IDs available so affiliation data can be captured by tools and services and fed into downstream reporting and tracking systems.
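On the consuming side, a downstream tool might pull ROR IDs out of a Crossref work record along these lines. This Python sketch assumes a record shape in which each author carries an `affiliation` list and each affiliation may carry an `id` list with an `id-type` field; those field names are assumptions, not something this post specifies, so check them against the Crossref REST API documentation. The sample record, including its DOI and author names, is made up.

```python
def ror_ids_by_author(work):
    """Map each author's display name to the ROR IDs found on their affiliations."""
    result = {}
    for author in work.get("author", []):
        parts = (author.get("given"), author.get("family"))
        name = " ".join(p for p in parts if p)
        ids = []
        for affiliation in author.get("affiliation", []):
            # Affiliations without identifiers are free-text only and are skipped.
            for identifier in affiliation.get("id", []):
                if identifier.get("id-type", "").upper() == "ROR":
                    ids.append(identifier["id"])
        result[name] = ids
    return result


# Made-up record in the assumed shape, for offline experimentation.
SAMPLE_WORK = {
    "DOI": "10.5555/example",
    "author": [
        {
            "given": "Ada",
            "family": "Example",
            "affiliation": [
                {
                    "name": "UC Berkeley",
                    "id": [{"id": "https://ror.org/01an7q238", "id-type": "ROR"}],
                }
            ],
        },
        {
            "given": "Sam",
            "family": "Sample",
            "affiliation": [{"name": "Unlisted Institute"}],
        },
    ],
}
```

Authors whose affiliations are still free-text strings come back with an empty list, which makes it easy for a reporting system to see where identifiers are missing.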
Get ready to ROR!
ROR is already working with a number of publishers and service providers that are planning to integrate ROR in their systems, map their affiliation data to ROR IDs, and/or include ROR IDs in publication metadata.
For example: Rockefeller University Press has already added the collection of ROR IDs to their publication workflow. Upon submission, the author selects an institutional affiliation from a dropdown list of options that comes from ROR. Rockefeller University Press also relies on this affiliation data for billing and licensing purposes to coordinate Gold Open Access publishing agreements.
In addition to publishers, libraries, repositories, and other stakeholders are building in support for ROR. ROR also maintains a list of active and in-progress integrations.
We know decisions about identifier adoption aren’t easy or immediate, so get in touch with ROR if you have questions or want to be more involved in the project. ROR holds regular community meetings and webinars and supports several community working groups for those interested in implementing ROR IDs and working with ROR data. This is a community-driven effort so we want to hear from you!