This year, metadata development is one of our key priorities and we’re making a start with the release of version 5.4.0 of our input schema with some long-awaited changes. This is the first in what will be a series of metadata schema updates.
What is in this update?
Publication typing for citations
This is fairly simple; we’ve added a ‘type’ attribute to the citations members supply. This means you can identify a journal article citation as a journal article, but more importantly, you can identify a dataset, software, blog post, or other citation that may not have an identifier assigned to it. This makes it easier for the many thousands of metadata users to connect these citations to identifiers. We know many publishers, particularly journal publishers, do collect this information already, and we hope they will consider making this change to deposit citation types with their records.
Every year we release metadata for the full corpus of records registered with us, which can be downloaded for free in a single compressed file. This is one way in which we fulfil our mission to make metadata freely and widely available. By including the metadata of over 165 million research outputs from over 20,000 members worldwide and making them available in a standard format, we streamline access to metadata about scholarly objects such as journal articles, books, conference papers, preprints, research grants, standards, datasets, reports, blogs, and more.
Today, we’re delighted to let you know that Crossref members can now use ROR IDs to identify funders in any place where you currently use Funder IDs in your metadata. Funder IDs remain available, but this change allows publishers, service providers, and funders to streamline workflows and introduce efficiencies by using a single open identifier for both researcher affiliations and funding organizations.
As you probably know, the Research Organization Registry (ROR) is a global, community-led, carefully curated registry of open persistent identifiers for research organisations, including funding organisations. It’s a joint initiative led by the California Digital Library, DataCite, and Crossref, launched in 2019, that fulfills the long-standing need for an open organisation identifier.
We began our Global Equitable Membership (GEM) Program to provide greater membership equitability and accessibility to organizations in the world’s least economically advantaged countries. Eligibility for the program is based on a member’s country; our list of countries is predominantly based on the International Development Association (IDA). Eligible members pay no membership or content registration fees. The list undergoes periodic reviews, as countries may be added or removed over time as economic situations change.
Aliaksandr Birukou is the Executive Editor for Computer Science at Springer Nature and is chair of the Group that has been working to establish a persistent identifier system and registry for scholarly conferences. Here Alex provides some background to the work and asks for input from the community:
Roughly one year ago, Crossref and DataCite started a working group on conference and project identifiers. With this blog post, we would like to share the specification of conference metadata and Crossmark for proceedings and are inviting the broader community to comment.
Why are conferences important?
One common misconception is that most published research appears in journals. However, alongside journals and newer ways of communicating research results (blogs, presentations, …), there are other publication options, such as books, which are very important in the humanities, and conference proceedings, which are very important in computer science and a few related disciplines. Conference proceedings are collections of journal-like papers, often undergoing a more competitive peer review process than that used by journals. For instance, of the original computer science research indexed in Scopus and published in 2012-2016, 63% of articles appeared in proceedings, while only 37% were published in journals. DBLP, one of the most important indexing services in CS, lists more than two million conference papers organized in ~5,400 conference series.
So, while it is true that CS has a significant share of conference proceedings, conferences are also relevant in many other disciplines which do not publish formal proceedings. For instance, inSPIRE contains ~23,000 conferences in high-energy physics, and the American Society of Mechanical Engineers (ASME) publishes roughly 100 proceedings volumes annually.
Why do we need an open persistent ID for a conference or a conference series?
With publishers, learned societies, indexing services, libraries, conference management systems, and research evaluation and funding agencies all using conferences directly or indirectly in their daily work, a common vocabulary would simplify data processing and reporting and minimize errors. Right now, a publisher assigns a unique conference ID to the conference to be published, then an indexing service does the same, and then a library assigns one as well. Wouldn’t it be easier to do this at the very beginning of the process, when the conference planning starts, and keep the same identifier through the whole conference lifecycle?
The joint Crossref and DataCite group on conference and project identifiers has discussed this topic on half a dozen calls and at various PID community meetings (PIDapalooza, FORCE conferences, AAHEP Information Provider Summit). The result of those discussions is a draft of the specification of conference metadata and Crossmark for proceedings.
The document first defines the concepts of a conference, conference series, joint and co-located conferences. It then introduces the information we want to store about those entities, e.g., the ID, name, acronym, other IDs, URL and the maintainer of the conference series, or the ID, conference series ID, number, dates, location, and URL for conferences. Such metadata can be submitted to Crossref and DataCite by conference organizers or publishers on their behalf and linked to the existing proceedings metadata, where appropriate. It can then be used for linking research outputs from a conference (beyond formal proceedings), recognizing reviewers via services such as ORCID and Publons, computing metrics of a conference series, conference disambiguation in indexing services and ratings (CORE, QUALIS, CCF), and so on.
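To make the two entities a little more concrete, here is a minimal sketch of how the fields listed above could be modelled. It is only an illustration, not the format defined in the draft specification: the class names, field names, types, and the helper conferences_in_series are all assumptions made for this example, and the real submission format for Crossref and DataCite may look quite different.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class ConferenceSeries:
    """A recurring conference series. All field names are hypothetical."""
    series_id: str                      # persistent ID of the series
    name: str
    acronym: Optional[str] = None
    other_ids: List[str] = field(default_factory=list)  # IDs used elsewhere
    url: Optional[str] = None
    maintainer: Optional[str] = None    # who maintains the series record


@dataclass
class Conference:
    """A single edition of a conference. All field names are hypothetical."""
    conference_id: str                  # persistent ID of this edition
    series_id: Optional[str]            # link to the series, if there is one
    number: Optional[int]               # e.g. 12 for the 12th edition
    start_date: date
    end_date: date
    location: str
    url: Optional[str] = None

    def __post_init__(self) -> None:
        # Basic sanity check on the dates of the event.
        if self.end_date < self.start_date:
            raise ValueError("end_date must not precede start_date")


def conferences_in_series(
    series: ConferenceSeries, conferences: List[Conference]
) -> List[Conference]:
    """Return the editions that belong to a series, ordered by start date."""
    members = [c for c in conferences if c.series_id == series.series_id]
    return sorted(members, key=lambda c: c.start_date)
```

Keeping the series ID on each conference record, rather than a list of editions on the series, means that a new edition can be registered without touching the series record. This is a design choice of the sketch, not something the draft prescribes.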
The second part of the document introduces Crossmark for conference proceedings. Its goal is to structure and preserve the information about the peer review process of a conference as declared by the general or program chairs. Depending on how much information is available from the conference organizers, one can use the basic or extended versions of Crossmark.
To comment, please open the specification and leave comments using the “comment” feature of Google Docs. The draft remains open for comments until the 31st of May 2018.
Next steps
After hearing from YOU, we will update the document to reflect the community’s comments. In parallel, we are starting a subgroup to discuss governance models, looking into whether we need a new membership category at Crossref, what fees should be covered, etc.