Continuing our blog series highlighting the uses of Crossref metadata, we talked to Martyn Rittman and Bastien Latard who tell us about themselves, MDPI and Scilit, and how they use Crossref metadata.
Can you give us a brief introduction to yourselves, and to MDPI/Scilit?
Martyn is Publishing Services Manager at MDPI. He joined five years ago as an editor and has worked on editorial, production, and software projects. Prior to joining MDPI, he completed a PhD and worked as a postdoc. His research covered physical chemistry, biochemistry and instrument development.
Bastien Latard is the project leader of Scilit. He created Scilit as part of his Master’s degree in 2013. He is now completing a PhD on the subject of semantically linking research articles, using data from Scilit.
Scilit was developed in 2014 by open access (OA) publisher MDPI with the goal of having a backup of metadata for all OA articles. Soon, Scilit became more general and embraced all articles with a digital object identifier (DOI) from Crossref and those with a PubMed ID (PMID). After seeing the potential of the database and how it could be used in a number of different contexts, we decided to make it public. Recently, other article types, including preprints, have been integrated. Our main goal now is to provide useful services to the research and academic publishing communities.
What problem is your service trying to solve?
Other indexing databases offer paid access, are highly selective, or host documents other than research articles. We want to offer a comprehensive database, but also one that clearly identifies open access material. The last part is still a work in progress, but we have moved forward significantly in recent months.
To make the access as direct as possible, we have recently integrated several OA aggregators that pick up or host free versions of full-text articles, including CORE, Unpaywall, and PubMed Central.
Can you tell us how you are using the Crossref Metadata API at MDPI/Scilit?
Scilit queries Crossref’s API in order to index metadata for single articles. DOIs are a key part of the system; because they are standardized identifiers, we can use them to merge new sources into Scilit while avoiding duplicates. We cross-check the data from Crossref against other sources and update it as necessary. Citation data is also really appreciated and opens doors to further developments.
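As an illustration of the DOI-based merging described above, here is a minimal Python sketch. It is not Scilit’s actual code: the function names (`normalize_doi`, `works_url`, `merge_by_doi`), the record fields, and the rule that existing values win over incoming ones are all assumptions made for the example. The only external detail relied on is the public Crossref REST API route `https://api.crossref.org/works/{doi}` for single-record lookups.

```python
from urllib.parse import quote

CROSSREF_WORKS = "https://api.crossref.org/works/"

# Prefixes commonly found in front of a bare DOI.
_DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(doi: str) -> str:
    """Lowercase a DOI and strip resolver prefixes so it can serve as a key."""
    doi = doi.strip().lower()
    for prefix in _DOI_PREFIXES:
        if doi.startswith(prefix):
            return doi[len(prefix):]
    return doi


def works_url(doi: str) -> str:
    """Build the Crossref REST API URL for a single DOI (no request is made)."""
    return CROSSREF_WORKS + quote(normalize_doi(doi), safe="/")


def merge_by_doi(existing: dict, incoming: list) -> dict:
    """Merge incoming records into existing ones, keyed by normalized DOI.

    Assumed policy (hypothetical): fields already present are kept, and
    incoming records only fill in fields that are missing or empty.
    """
    merged = {doi: dict(record) for doi, record in existing.items()}
    for record in incoming:
        key = normalize_doi(record["doi"])
        target = merged.setdefault(key, {})
        target["doi"] = key
        for field, value in record.items():
            if field == "doi":
                continue
            if value not in (None, "") and not target.get(field):
                target[field] = value
    return merged
```

Because every source is reduced to the same normalized DOI key, a record that arrives as `https://doi.org/10.1000/ABC` lands on the same entry as `10.1000/abc` instead of creating a duplicate. A real system that cross-checks sources would likely need a richer conflict rule than “existing wins”.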
As a publisher, MDPI makes daily deposits to Crossref, to register journal articles on mdpi.com, conference papers from sciforum.net, and preprints from Preprints.org. We also use the data collected at Scilit to find suitable reviewers and let authors know when their work has been cited.
What metadata values do you pull from the API?
As much as we can! Scilit crawls the latest indexed articles every few hours to ensure it is as up-to-date as possible. This is the most important function of our system because it provides metadata for the very latest published articles, including a link to the publisher version. Scilit parses the Crossref metadata and saves it. The records are then indexed into our Solr search engine for fast, real-time usage.
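A periodic crawl of recently indexed records could be expressed as a query against the Crossref REST API along the following lines. This is a sketch under stated assumptions, not a description of how Scilit does it: the function name `recent_works_url`, the six-hour look-back, and the page size are hypothetical. The `from-index-date` filter, the `rows` and `cursor` parameters, and the optional `mailto` parameter are features of the public API; the example only builds the URL and does not send a request.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

CROSSREF_WORKS = "https://api.crossref.org/works"


def recent_works_url(
    now: datetime,
    hours: int = 6,      # hypothetical look-back window
    rows: int = 100,     # records per page
    cursor: str = "*",   # "*" starts a new cursor-based deep-paging session
    mailto: str = "",    # optional contact address for the request
) -> str:
    """Build a query URL for works indexed since `now - hours`.

    The filter is given as a calendar date, so the window is rounded down
    to the start of that day; repeated crawls will see some records again
    and should de-duplicate them, for example by DOI.
    """
    since = (now - timedelta(hours=hours)).date().isoformat()
    params = {
        "filter": f"from-index-date:{since}",
        "rows": rows,
        "cursor": cursor,
    }
    if mailto:
        params["mailto"] = mailto
    return f"{CROSSREF_WORKS}?{urlencode(params)}"


if __name__ == "__main__":
    print(recent_works_url(datetime.now(timezone.utc)))
```

Each response to a cursor query includes a `next-cursor` value, which would be passed back as `cursor` to fetch the following page until no more items are returned.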
Have you built your own interface to extract this data?
We wrote our own code to get the data, but the API made this very straightforward. Scilit has been developed completely in-house by MDPI, and the lead developer, Bastien Latard, is currently completing a PhD looking at how to make the most of the data using semantic data extraction.
What are the future plans for MDPI/Scilit?
Scilit is, and will continue to be, heavily used in MDPI’s current and future projects. We have a few ideas about how to improve Scilit. We are, for example, implementing a scientific profile networking service, which will allow scholars to build their own (scientific) network with lots of functionality. We think that it will be a really good place to search for, comment on, and exchange ideas about articles… maybe even more!
What else would you like to see the REST API offer?
Crossref is already doing a great job, especially with its integrated citation data. Maybe further analysis and mapping of data about organizations and institutions would be an improvement.
Thank you, Martyn and Bastien. If you’d like to share how you use the Crossref Metadata APIs, please contact the Community team.