This year, metadata development is one of our key priorities and we’re making a start with the release of version 5.4.0 of our input schema with some long-awaited changes. This is the first in what will be a series of metadata schema updates.
What is in this update?
Publication typing for citations
This is fairly simple; we’ve added a ‘type’ attribute to the citations members supply. This means you can identify a journal article citation as a journal article, but more importantly, you can identify a dataset, software, blog post, or other citation that may not have an identifier assigned to it. This makes it easier for the many thousands of metadata users to connect these citations to identifiers. We know many publishers, particularly journal publishers, already collect this information, and we hope they will consider depositing citation types with their records.
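As a rough illustration of what a typed citation might look like, the sketch below builds a single citation element with Python's standard library. The element names (`citation`, `unstructured_citation`), the `key` value, the reference text and the type value `dataset` are assumptions made for illustration; please check the 5.4.0 schema documentation for the exact structure and the permitted type values.

```python
import xml.etree.ElementTree as ET


def build_citation(key: str, text: str, citation_type: str) -> ET.Element:
    """Build one citation element carrying a 'type' attribute.

    Element and attribute names are assumptions for this sketch,
    not a copy of the schema.
    """
    citation = ET.Element("citation", {"key": key, "type": citation_type})
    unstructured = ET.SubElement(citation, "unstructured_citation")
    unstructured.text = text
    return citation


# Hypothetical reference text and type value, for illustration only.
citation = build_citation(
    "ref1",
    "Example Author (2024). Example dataset title.",
    "dataset",
)
xml_text = ET.tostring(citation, encoding="unicode")
print(xml_text)
```

The point of the sketch is that the type travels with the citation itself, so a metadata user can tell a dataset or software reference apart even when no identifier was supplied.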
Every year we release metadata for the full corpus of records registered with us, which can be downloaded for free in a single compressed file. This is one way in which we fulfil our mission to make metadata freely and widely available. By including the metadata of over 165 million research outputs from over 20,000 members worldwide and making them available in a standard format, we streamline access to metadata about scholarly objects such as journal articles, books, conference papers, preprints, research grants, standards, datasets, reports, blogs, and more.
Today, we’re delighted to let you know that Crossref members can now use ROR IDs to identify funders in any place where you currently use Funder IDs in your metadata. Funder IDs remain available, but this change allows publishers, service providers, and funders to streamline workflows and introduce efficiencies by using a single open identifier for both researcher affiliations and funding organizations.
As you probably know, the Research Organization Registry (ROR) is a global, community-led, carefully curated registry of open persistent identifiers for research organisations, including funding organisations. It’s a joint initiative led by the California Digital Library, DataCite and Crossref, launched in 2019, that fulfills the long-standing need for an open organisation identifier.
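If you are preparing to supply ROR IDs for funders, a light format check before deposit can catch copy-and-paste mistakes. The sketch below is one possible approach, not part of any Crossref or ROR tooling: the function name `normalize_ror` is hypothetical, and the pattern is an assumption based on ROR's published description of its identifiers (a leading 0, six characters from a restricted alphanumeric set, then two digits). It checks shape only; it does not verify the checksum or confirm that the ID exists in the registry.

```python
import re
from typing import Optional

# Assumed shape of the identifier part of a ROR ID: a leading 0, six
# characters that exclude the letters i, l, o and u, then two digits.
ROR_PATTERN = re.compile(r"^0[a-hj-km-np-tv-z0-9]{6}[0-9]{2}$")

ROR_PREFIXES = ("https://ror.org/", "http://ror.org/", "ror.org/")


def normalize_ror(value: str) -> Optional[str]:
    """Return the https://ror.org/ URL form, or None if the shape is wrong."""
    candidate = value.strip().lower()
    for prefix in ROR_PREFIXES:
        if candidate.startswith(prefix):
            candidate = candidate[len(prefix):]
            break
    if ROR_PATTERN.match(candidate):
        return f"https://ror.org/{candidate}"
    return None


# The identifier below is used only as an example of the format.
print(normalize_ror("02twcfp32"))
print(normalize_ror("not-a-ror-id"))
```

Normalising to the full URL form means the same value can be compared across affiliation and funding metadata, whichever form it arrived in.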
We began our Global Equitable Membership (GEM) Program to provide greater membership equitability and accessibility to organizations in the world’s least economically advantaged countries. Eligibility for the program is based on a member’s country; our list of countries is predominantly based on the International Development Association (IDA). Eligible members pay no membership or content registration fees. The list undergoes periodic reviews, as countries may be added or removed over time as economic situations change.
Publisher metadata is one side of the story surrounding research outputs, but the conversations, connections and activities that build up around scholarly research take place all over the web. We built Event Data to capture, record and make available these ‘Events’, providing open, transparent, and traceable information about the provenance and context of every Event. Events are comments, links, shares, bookmarks, references, etc.
In September 2018 we said Event Data was ‘production ready.’ What we meant was development of the service had reached a point where we expected no further major changes to the code, and we encouraged you to use it. What normally would have followed was a detailed handover to our operations team, for monitoring and performance management, and for Product Management to expand Event Data by adding new Crossref member domains and evaluating additional event sources.
Why so quiet?
But many things changed on the staff front, meaning 2019 was a year of reinvention for the Technical and Product teams and of critical knowledge sharing and learning. Event Data had to take a back seat as we focused resources on other key projects (more on that later). From a technical perspective, we’ve found that the Elasticsearch index is not performing well and that the approach taken to specifically support data citations through Scholix has not really scaled.
When things go wrong, whether in ways you can or can’t anticipate, the most important thing is communication –– in dealing with the challenges we forgot to do that. We understand how frustrating that can be and we’re extremely sorry to have gone so quiet.
So, where are we today?
Event Data is important to us and clearly important to you too, as you’ve contacted us about your use-cases and the reliability of the service. Event Data remains available and you’re welcome to use it, but you should expect instability to continue and be aware that it does not find events for the DOIs/domains of our newer members (those who have joined Crossref since 2019). We’re conscious that this may make it hard to say whether it’s a good fit for your project at this point.
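If you do want to try the service, a query is just an HTTP GET with a few parameters. The sketch below only builds the request URL and does not send it. The endpoint and the parameter names (`obj-id`, `mailto`, `rows`) are assumptions based on the Event Data user guide, and the DOI and email address are placeholders; please check the current documentation before relying on them.

```python
from urllib.parse import urlencode

# Assumed query endpoint for the Event Data API.
EVENTS_ENDPOINT = "https://api.eventdata.crossref.org/v1/events"


def build_events_url(doi: str, mailto: str, rows: int = 100) -> str:
    """Build a URL asking for Events whose object is the given DOI."""
    params = {
        "obj-id": doi,     # the DOI that the Events point to
        "mailto": mailto,  # contact address sent along with the query
        "rows": rows,      # maximum number of Events to request
    }
    return f"{EVENTS_ENDPOINT}?{urlencode(params)}"


# Placeholder DOI and email address, for illustration only.
url = build_events_url("10.5555/12345678", "you@example.org", rows=10)
print(url)
```

Given the instability described above, any client built on a URL like this should be prepared for timeouts and incomplete results.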
What are we doing?
We have brought in additional expert Elasticsearch resources to assist with a separate project to migrate our REST API from Solr to Elasticsearch. We’re making fantastic progress on this. As soon as we’re confident we can make this switch, we will move those same Elasticsearch resources to shoring up Event Data. The REST API takes priority over Event Data because we need to add support for important new record types (like research grants) that aren’t yet available via the API.
We’re also concluding the process of hiring two new Product Managers, which means we’ll be in a position to assign someone to head up the product management of Event Data. When we do return to Event Data in the coming months, our initial priority will be increased support for data citation and Scholix. If that means radical changes to the rest of the service, we’ll let you know.
Opening up the discussion
We will have more news on Event Data in mid-2020. We’d love you to join the Crossref Community Forum; we’ve created a new Category for Event Data where you can post details of how you are using, or plan to use, Event Data; put questions to the group; suggest future developments; and provide general feedback on the Event Data service.