At the end of last year, we were excited to announce our renewed commitment to community and the launch of three cross-functional programs to guide and accelerate our work. We introduced this new approach to work towards better cross-team alignment, shared responsibility, and improved communication and learning, and to make more progress on the things members need.
This year, metadata development is one of our key priorities and we’re making a start with the release of version 5.4.0 of our input schema with some long-awaited changes. This is the first in what will be a series of metadata schema updates.
What is in this update?
Publication typing for citations
This is fairly simple; we’ve added a ‘type’ attribute to the citations members supply. This means you can identify a journal article citation as a journal article, but more importantly, you can identify a dataset, software, blog post, or other citation that may not have an identifier assigned to it. This makes it easier for the many thousands of metadata users to connect these citations to identifiers. We know many publishers, particularly journal publishers, already collect this information, and we hope they will consider making this change to deposit citation types with their records.
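To make the idea concrete, here is a minimal Python sketch of how a metadata user might take advantage of typed citations. It is not the schema itself: the `Citation` class, its field names, and the type values (`journal_article`, `dataset`) are assumptions made for illustration, so please check the schema documentation for the values that are actually allowed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Citation:
    """Hypothetical in-memory view of one deposited citation."""
    key: str
    text: str
    type: Optional[str] = None  # assumed values such as "dataset"; not the official list
    doi: Optional[str] = None


def typed_without_identifier(citations: List[Citation]) -> List[Citation]:
    """Return citations that declare a type but carry no DOI yet."""
    return [c for c in citations if c.type is not None and c.doi is None]


refs = [
    Citation("ref1", "An example journal article", type="journal_article",
             doi="10.5555/12345678"),
    Citation("ref2", "An example dataset", type="dataset"),
    Citation("ref3", "An untyped reference"),
]

# Typed citations without an identifier are the ones a matching service
# could now route to the right kind of lookup (dataset, software, and so on).
candidates = typed_without_identifier(refs)
```

In this sketch only `ref2` is picked out: it says what kind of thing it is, but nothing yet connects it to an identifier.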
Every year we release metadata for the full corpus of records registered with us, which can be downloaded for free in a single compressed file. This is one way in which we fulfil our mission to make metadata freely and widely available. By including the metadata of over 165 million research outputs from over 20,000 members worldwide and making them available in a standard format, we streamline access to metadata about scholarly objects such as journal articles, books, conference papers, preprints, research grants, standards, datasets, reports, blogs, and more.
Today, we’re delighted to let you know that Crossref members can now use ROR IDs to identify funders in any place where you currently use Funder IDs in your metadata. Funder IDs remain available, but this change allows publishers, service providers, and funders to streamline workflows and introduce efficiencies by using a single open identifier for both researcher affiliations and funding organizations.
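If your workflow needs to handle both kinds of funder identifier side by side, a small check like the one below may help. This is a rough sketch under stated assumptions: the function name `funder_id_kind` is made up for this example, the ROR pattern follows the ID shape ROR publishes (a leading 0, six base32 characters, two check digits), and the Funder ID pattern assumes a DOI under the 10.13039 prefix with a numeric suffix. Neither pattern verifies that the identifier is actually registered.

```python
import re

# Assumed shape of a ROR ID expressed as a URL.
ROR_PATTERN = re.compile(r"^https://ror\.org/0[a-hj-km-np-tv-z0-9]{6}[0-9]{2}$")

# Assumed shape of a Funder ID: a DOI under 10.13039, bare or as a doi.org URL.
FUNDER_ID_PATTERN = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/)?10\.13039/\d+$")


def funder_id_kind(value: str) -> str:
    """Classify a funder identifier as 'ror', 'funder_id', or 'unknown'."""
    candidate = value.strip().lower()
    if ROR_PATTERN.match(candidate):
        return "ror"
    if FUNDER_ID_PATTERN.match(candidate):
        return "funder_id"
    return "unknown"


# Sample values for illustration only.
kinds = [
    funder_id_kind("https://ror.org/02twcfp32"),
    funder_id_kind("10.13039/100000001"),
    funder_id_kind("Some Funder Name"),
]
```

A shape check like this only catches obvious mix-ups, such as a funder name pasted into an identifier field. Confirming that an ID exists still means looking it up in the relevant registry.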
As you probably know, the Research Organization Registry (ROR) is a global, community-led, carefully curated registry of open persistent identifiers for research organisations, including funding organisations. It’s a joint initiative led by the California Digital Library, DataCite, and Crossref, launched in 2019, that fulfills the long-standing need for an open organisation identifier.
The first version of our metadata input schema (a DTD, to be specific) was created in 1999 to capture basic bibliographic information and facilitate matching DOIs to citations. Over the past 20 years the bibliographic metadata we collect has deepened, and we’ve expanded our schema to include funding information, license, updates, relations, and other metadata. Our schema isn’t as venerable as a MARC record or as comprehensive as JATS, but it’s served us well. It’s not currently positioned to fully support everything we want to do long term - we’d like to support assertions, map cleanly to JATS and schema.org magically at the same time, and maybe even move beyond XML - but for now it’s something we can work with to empower member metadata to help find, cite, and connect scholarly content.
We’ve maintained backwards compatibility for most things since 2007 but this update will require some moderate changes to how contributors are modeled. The balance between supporting established tagging and addressing the evolution of what we collect and how it is expressed can be tricky. We want to collect good metadata without significantly disrupting the workflow of our membership, who are the source of the metadata. Even so, this is a fairly pragmatic update that will position us well for the future. I look forward to supporting new types of content and metadata in the future, but for now take a look at what I’m proposing.
I’m proposing some updates and additions to the metadata we collect, and would like your feedback. To fully and elegantly support affiliation identifiers and multiple author roles, we need to break backwards compatibility. Specifically, we want to:
Add support for CRediT
The CASRAI CRediT taxonomy is increasingly used to represent the roles contributors commonly play in producing research outputs. Our members are applying CRediT roles to contributors, so we want to capture those roles as well. Supporting CRediT allows Crossref and our membership to identify and credit contributors beyond authors and editors.
As most of you know, a contributor often does more than one thing - they write, they edit, they curate. We currently only allow one contributor role as an attribute, but, to realistically support CRediT and accurately capture evidence about the work, we need to allow multiple contributor roles. This will break backwards compatibility. We can potentially support the old way and the new way, but I’m trying to avoid awkward compromises wherever possible.
Supporting CRediT doesn’t mean you need to adopt CRediT. We’ll continue to support existing author roles, but they’ll be marked up differently. Details are in our request for feedback document.
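For anyone trying to picture the change, here is a minimal Python sketch of the difference between one role per contributor and several. It is an illustration only, not the proposed markup: the `Contributor` class and `from_single_role` helper are hypothetical names, and the role strings are just examples of an existing role alongside CRediT roles.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Contributor:
    """Hypothetical contributor record that can carry several roles."""
    name: str
    roles: List[str] = field(default_factory=list)


def from_single_role(name: str, contributor_role: str) -> Contributor:
    """Carry a legacy single-role record over to the multi-role shape."""
    return Contributor(name=name, roles=[contributor_role])


# A record that used to say only "author" can now say more about the work done.
person = from_single_role("Josiah Carberry", "author")
person.roles.extend(["Conceptualization", "Data curation"])
```

The point of the sketch is that a single-valued attribute has nowhere to put the second and third role, which is why the change cannot be made without touching how contributors are modeled.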
Expand support for author and organization identifiers
We collect ORCID iDs in our metadata but do not currently support other types of contributor identifiers. We also don’t support affiliation or organization identifiers beyond those assigned within our funder and clinical trial registries. We’ve had increasing demands from both metadata suppliers and users to expand support for affiliation identifiers because…identifiers are useful. We also want to expand author identifier support as ORCID iDs may only be registered by researchers who are able to curate their own ORCID record. Adding support for ISNI and Wikidata IDs is a common request, but we anticipate there’s a need for other identifiers as well.
Our plan is to accept identifiers registered with identifiers.org as well as other identifiers upon request. We prefer to remain consistent with the identifiers.org registry as much as possible.
We’re particularly keen to support open community-led identifiers like ORCID and ROR and will continue to do so, but also want to support the metadata our members want to distribute. Organization identifiers will be particularly useful as they’ll help us populate records with ROR IDs in the future, leading to better quality affiliation metadata.
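As a rough illustration of what staying consistent with identifiers.org could look like on the consuming side, the Python sketch below parses a compact identifier of the form `prefix:accession` and builds an identifiers.org resolver URL from it. This is an assumption about one possible handling, not a description of how the schema will represent these identifiers. The names `CompactId`, `parse_compact_id`, and `resolver_url` are hypothetical, and the sketch does not check that a prefix is actually registered.

```python
from typing import NamedTuple


class CompactId(NamedTuple):
    """A prefix and accession pair, as in 'orcid:0000-0002-1825-0097'."""
    prefix: str
    accession: str


def parse_compact_id(text: str) -> CompactId:
    """Split 'prefix:accession' at the first colon; reject incomplete input."""
    prefix, sep, accession = text.strip().partition(":")
    if not sep or not prefix or not accession:
        raise ValueError(f"not a compact identifier: {text!r}")
    return CompactId(prefix.lower(), accession)


def resolver_url(cid: CompactId) -> str:
    """Build a resolver URL in the identifiers.org style (assumed pattern)."""
    return f"https://identifiers.org/{cid.prefix}:{cid.accession}"


# Sample value for illustration only.
cid = parse_compact_id("ORCID:0000-0002-1825-0097")
url = resolver_url(cid)
```

Keeping the scheme and the value as separate fields means a new identifier type can be accepted without adding a dedicated field for each one.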
Expand support for a range of contributor names
We currently require a surname for all contributors, and don’t provide comprehensive support for contributors whose names are represented by multiple alphabets, or who have nicknames or aliases, or who don’t have a surname. To begin with, we’ll replace surname with the more widely used ‘family name’ and remove the fixed surname requirement, allowing only a given name to be provided where appropriate. We’ll also allow a variety of names to be provided for each contributor.
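Here is a small Python sketch of the relaxed naming rule described above: either name part may be missing, but not both, and any number of alternate names can ride along. The `PersonName` class and its field names are hypothetical and are not the element names in the proposal.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PersonName:
    """Hypothetical name record: family name is optional, alternates allowed."""
    given_name: Optional[str] = None
    family_name: Optional[str] = None
    alternate_names: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # At least one name part is needed; a given name alone is acceptable.
        if not (self.given_name or self.family_name):
            raise ValueError("a contributor needs a given name or a family name")

    def display(self) -> str:
        """Join whichever name parts are present."""
        return " ".join(p for p in (self.given_name, self.family_name) if p)


mononym = PersonName(given_name="Madhavi")
two_scripts = PersonName(given_name="Taro", family_name="Yamada",
                         alternate_names=["山田 太郎"])
```

Under the current rule the first record would be rejected for lacking a surname; in this sketch it is valid, and the second record keeps both written forms of the same name.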
Expand affiliation support
We currently collect affiliation as a single string - we’re going to break that up to support affiliation names, and add in support for organizational identifiers like ROR.
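To show the shape of this change, the Python sketch below contrasts a legacy affiliation string with a structured record that has a name and an optional ROR ID. The class and function names are hypothetical, the sample values are for illustration only, and the actual element names are in the feedback document rather than here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Affiliation:
    """Hypothetical structured affiliation: a name plus an optional ROR ID."""
    name: str
    ror_id: Optional[str] = None


def from_legacy_string(text: str) -> Affiliation:
    """Wrap an old single-string affiliation, normalizing whitespace only."""
    return Affiliation(name=" ".join(text.split()))


def ror_ids(affiliations: List[Affiliation]) -> List[str]:
    """Collect the ROR IDs that are present, skipping name-only records."""
    return [a.ror_id for a in affiliations if a.ror_id]


affiliations = [
    from_legacy_string("  Department of Examples,   Example University "),
    Affiliation("Crossref", ror_id="https://ror.org/02twcfp32"),  # sample value
]
found = ror_ids(affiliations)
```

Legacy strings still fit in the new shape as name-only records, which leaves room to add an identifier later without re-parsing free text.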
Expand support for data citation
For those of you who send us references, we’re adding a few fields to better support data citation. We’re also going to allow you to (optionally) supply a specific publication type for references.
Other updates
We’re making some other small updates as well. If you have a small request, we may be able to accommodate it in our next update. Larger changes or additions will probably have to wait for future updates, but we’d love to start collecting suggestions now.
We need your feedback!
I’ll be giving a webinar on December 19 at 02:00 and 15:00 UTC to go over these changes in detail - please visit our webinars page to register.
Again, please leave feedback, ask questions, and make suggestions in the feedback document, or if you prefer send feedback via email to feedback@crossref.org. We’ll be taking feedback through January 15, 2020.