This year, metadata development is one of our key priorities and we’re making a start with the release of version 5.4.0 of our input schema with some long-awaited changes. This is the first in what will be a series of metadata schema updates.
What is in this update?
Publication typing for citations
This is fairly simple; we’ve added a ‘type’ attribute to the citations members supply. This means you can identify a journal article citation as a journal article, but more importantly, you can identify a dataset, software, blog post, or other citation that may not have an identifier assigned to it. This makes it easier for the many thousands of metadata users to connect these citations to identifiers. We know many publishers, particularly journal publishers, already collect this information, and we hope they will consider making this change to deposit citation types with their records.
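As a rough illustration of what a typed citation could look like when a deposit is assembled, here is a minimal Python sketch. It assumes the `citation` element with its `key` attribute and an `unstructured_citation` child, and it leaves out namespaces and the rest of the deposit file. The helper name `build_citation`, the sample citation text, and the exact value string `dataset` are assumptions made for illustration; the schema documentation is the authority on the allowed values.

```python
import xml.etree.ElementTree as ET


def build_citation(key: str, text: str, citation_type: str) -> ET.Element:
    """Build a single citation element carrying a 'type' attribute.

    The attribute value passed in is not validated here; the set of
    allowed values is defined by the schema, not by this sketch.
    """
    citation = ET.Element("citation", {"key": key, "type": citation_type})
    unstructured = ET.SubElement(citation, "unstructured_citation")
    unstructured.text = text
    return citation


# Hypothetical citation to a dataset that has no identifier of its own.
example = build_citation(
    "ref1",
    "Example Author (2024). Example dataset title. Example Repository.",
    "dataset",  # assumed value string, for illustration only
)
xml_text = ET.tostring(example, encoding="unicode")
print(xml_text)
```

The point of the sketch is only that the type travels with the citation itself, so a metadata user can tell a dataset citation from a journal article citation even when no identifier is present.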
Every year we release metadata for the full corpus of records registered with us, which can be downloaded for free in a single compressed file. This is one way in which we fulfil our mission to make metadata freely and widely available. By including the metadata of over 165 million research outputs from over 20,000 members worldwide and making them available in a standard format, we streamline access to metadata about scholarly objects such as journal articles, books, conference papers, preprints, research grants, standards, datasets, reports, blogs, and more.
Today, we’re delighted to let you know that Crossref members can now use ROR IDs to identify funders in any place where you currently use Funder IDs in your metadata. Funder IDs remain available, but this change allows publishers, service providers, and funders to streamline workflows and introduce efficiencies by using a single open identifier for both researcher affiliations and funding organizations.
As you probably know, the Research Organization Registry (ROR) is a global, community-led, carefully curated registry of open persistent identifiers for research organisations, including funding organisations. Launched in 2019, it is a joint initiative led by the California Digital Library, DataCite, and Crossref that fulfills the long-standing need for an open organisation identifier.
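For workflows that now need to accept either kind of identifier for a funder, a small shape check can help route each value. The sketch below is an assumption-laden illustration rather than an official validator: the function name `classify_funder_identifier` is hypothetical, the ROR pattern follows ROR's published description of its identifiers (a leading 0, six base32 characters, two checksum digits), and the Funder ID pattern assumes the familiar `10.13039/` DOI form with a numeric suffix. It checks shape only; it does not verify the checksum or confirm that the identifier exists in either registry.

```python
import re

# Shape of a ROR ID: leading 0, six Crockford base32 characters
# (no i, l, o, u), then a two-digit checksum. Shape only, no checksum test.
ROR_PATTERN = re.compile(
    r"^(?:https?://)?ror\.org/0[0-9a-hjkmnp-tv-z]{6}[0-9]{2}$"
)

# Assumed shape of a Funder ID: a DOI under the 10.13039 prefix with a
# numeric suffix, with or without a doi.org resolver prefix.
FUNDER_ID_PATTERN = re.compile(
    r"^(?:https?://(?:dx\.)?doi\.org/)?10\.13039/\d+$"
)


def classify_funder_identifier(value: str) -> str:
    """Return 'ror', 'funder_id', or 'unknown' based on shape alone."""
    candidate = value.strip().lower()
    if ROR_PATTERN.match(candidate):
        return "ror"
    if FUNDER_ID_PATTERN.match(candidate):
        return "funder_id"
    return "unknown"
```

Called with a value such as `https://ror.org/02twcfp32` the function reports `ror`, with `https://doi.org/10.13039/100000001` it reports `funder_id`, and anything else falls through to `unknown`.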
We began our Global Equitable Membership (GEM) Program to provide greater membership equitability and accessibility to organizations in the world’s least economically advantaged countries. Eligibility for the program is based on a member’s country; our list of countries is predominantly based on the International Development Association (IDA). Eligible members pay no membership or content registration fees. The list undergoes periodic reviews, as countries may be added or removed over time as economic situations change.
The integrity of the scholarly record is an essential aspect of research integrity. Every initiative and service that we have launched since our founding has been focused on documenting and clarifying the scholarly record in an open, machine-actionable and scalable form. All of this has been done to make it easier for the community to assess the trustworthiness of scholarly outputs. Now that the scholarly record itself has evolved beyond the published outputs at the end of the research process – to include both the elements of that process and its aftermath – preserving its integrity poses new challenges that we strive to meet… we are reaching out to the community to help inform these efforts.
Scholarly research, and therefore scholarly communications, are rapidly changing with the development of new approaches, technologies, and models. We need open scholarly infrastructure that can adapt to these changes and provide trust signals that enable assessment of the integrity of the research and reflect the ways that research is changing. Crossref has been changing and adapting by building on the concept of the scholarly record with our vision of the Research Nexus:
“a rich and reusable open network of relationships connecting research organizations, people, things, and actions; a scholarly record that the global community can build on forever, for the benefit of society”.
The foundation of the scholarly record and Research Nexus is metadata and relationships - the richer and more comprehensive the metadata and relationships in Crossref records, the more context there is for our members and for the whole scholarly research ecosystem. This will lead to a range of benefits from better discovery and saving researchers time to the assessment of research impact and research integrity. This is why Crossref is focused on enriching metadata to provide more and better trust signals while keeping barriers to membership and participation as low as possible to enable an inclusive scholarly record.
We want to engage with the community to emphasise this role, share our plans for the future, and get feedback to establish if we are heading in the right direction.
This blog explains our current position and will be followed by subsequent posts exploring all our services and plans in this area, as well as more details on our membership operations and policies.
What is “Integrity of the Scholarly Record” (ISR), and how does it feed into Research Integrity?
The US National Institutes of Health (NIH) defines research integrity as a set of values in scientific research: honesty; accuracy; efficiency; and objectivity. It’s concerned with the soundness of the process of science. As a subset of that, the outputs of the scholarly publishing process create a “scholarly record” which allows those in the community to find evidence and context to help confirm whether these values have been adhered to. The scholarly record is Crossref’s focus. This means that Crossref itself doesn’t assess the quality of content or the integrity of the research process but rather enables those who produce scholarly outputs to provide metadata (effectively evidence) about how they ensure the quality of content and how the outputs fit into the scholarly record (through reference links, ORCID iDs for authors, ROR IDs for affiliations, funding and licensing information, etc.).
Crossref members include any organisation that produces research objects and materials (publishers, societies, universities, funders, research institutions, scholars) so they can establish a persistent record—tied to a persistent and unique identifier—for these outputs and supply metadata about this content in an open, machine-readable way. Maintaining this record for the long term, and adding in an important layer of context, establishes the integrity of the scholarly record as well as ensuring it is something that can be used by the whole community to improve scholarly research for generations to come.
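To make the idea of metadata as evidence a little more concrete, the following Python sketch reports which of the contextual elements mentioned above are present in a single record. It assumes the record is a dictionary shaped like the JSON returned for a work by the Crossref REST API, with `reference`, `funder`, and `license` lists and an `author` list whose entries may carry `ORCID` and `affiliation` keys. The function name `context_signals` and the sample record are hypothetical, and presence of a field says nothing about its quality.

```python
from typing import Any


def context_signals(record: dict[str, Any]) -> dict[str, bool]:
    """Report which contextual metadata elements a record contains.

    This only tests for presence; it makes no judgement about the
    content or about the research the record describes.
    """
    authors = record.get("author", [])
    return {
        "references": bool(record.get("reference")),
        "orcid_ids": any("ORCID" in author for author in authors),
        "affiliations": any(author.get("affiliation") for author in authors),
        "funding": bool(record.get("funder")),
        "license": bool(record.get("license")),
    }


# Hypothetical record with placeholder values, for illustration only.
sample_record = {
    "author": [
        {
            "family": "Example",
            "ORCID": "https://orcid.org/0000-0000-0000-0000",  # placeholder
            "affiliation": [],
        }
    ],
    "reference": [{"key": "ref1", "unstructured": "Example citation text."}],
    "funder": [],
}

signals = context_signals(sample_record)
print(signals)
```

A report like this mirrors the distinction drawn above: it shows what context a member has supplied, without assessing the quality of the content itself.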
The scholarly record is about more than just published outputs - it’s also a network of inputs, relationships, and contexts
In the past, the Scholarly Record was seen as just the published outputs at the end of the research process - for example, journal articles or book chapters. But as the OCLC Research Group notes in their 2014 report on The Evolving Scholarly Record:
“The boundaries of the scholarly record are in flux, as they stretch to extend over an ever-expanding range of materials.”
OCLC describes how outputs at the “process” and “aftermath” stages of the research process are becoming increasingly important alongside the outputs at the traditional “outcomes” stage.
We like to take this even further. We think the evolving scholarly record is about more than just recording different types of works. As the above report notes, “The scholarly record is evolving to have greater emphasis on collecting and curating context of scholarly inquiry […] One can imagine an article in quantitative biology published in a Wiley journal, the data for which resides in Dryad; the e-print in arXiv; and the conference poster in F1000. All of these materials may be considered part of the scholarly record, but no single institution will collect them all. Instead, access is achieved through a coordination of stewardship roles in which the scholarly record is decomposed into discrete, interrelated units that organisations specialize in collecting, preserving, and making available.”
It’s this interrelatedness that we think is important, and Crossref plays an important role in collecting, matching, and sharing those relationships. We now focus on this ‘nexus’ - so no longer primarily the different types of objects, but increasingly the interplay and relationships between them. The context, rather than the individual metadata elements, is what’s key.
Martin Eve explores this idea further in his blog What is the Scholarly Record, suggesting “the scholarly record is a decentralized network of evolving truth assertions” and “Whether a truth assertion is part of the scholarly record is determined by another set of distributed assertions and their power configurations (say, through institutional affiliation) of the individuals who make such assertions.”
Fister highlights the “SIFT” approach from A Curriculum for Civic Online Reasoning, created by a group of educators at Stanford University for students to evaluate online content. She argues that this approach is also useful for assessing scholarly materials, noting:
“The networked, social nature of scholarship is worth making explicit”.
Where does Crossref fit in? Where do we have the most impact and opportunity?
To address the question of our role in the integrity of the scholarly record, we need to understand several aspects that Crossref has to balance in this capacity, such as:
We don’t have the means or desire to be the arbiter of research quality. However, we operate neutrally, at the centre of scholarly communications, and we can help develop a shared consensus or framework. Our metadata elements and tools can be positioned to signal or detect trustworthiness. An important distinction is that we can play a role in assessing legitimacy but not in assessing quality.
We must be cautious that our best practices for demonstrating legitimacy and handling less-than-legitimate behaviour do not raise already-high barriers for emerging publications or organisations that present in ways that some may not recognise as professional standards. Disruption is different from deception. In discussions with our board this point has come out strongly: that Crossref has an opportunity to think about how to help the community identify deceptive actions and pair that with our efforts to bring more people on board.
Addressing this issue may involve changes to our membership eligibility and processes, bylaws, policies, staff resources, and technical and metadata solutions; in practice, it will probably take a combination of all of these. Many of these projects are already planned, and we have ideas for extending them.
We regularly review the process we use for evaluating when and why to revoke membership for reasons other than non-payment. The volume of cases that we believe justify membership revocation—while a tiny fraction of members—is growing and does take staff and legal resources to address.
Crossref and our members already help preserve the integrity of the scholarly record in significant ways
Almost all of our services in some way touch on enabling people to express and evaluate trustworthiness; our mission statement commits us to “making research objects easy to find, cite, link, assess, and reuse […] all to help put research in context.”
We have, of course, specific tools and services that augment this activity too. Many members are active in:
Reporting corrections and retractions through Crossmark metadata.
As recently concluded in this Nature editorial calling for us to think beyond open references,
“Depositing all relevant metadata in Crossref should become the norm in scholarly publishing.”
For those members just starting out on their journey, there are some immediate specific things that all members are able to do. Check your participation report and start registering more metadata to add that contextual layer:
Cite data (preferably using DataCite DOIs in reference lists)
Register all related objects such as versions and translations via relationships
Register grants with Crossref (funder members).
By enabling our members to register their research objects and create metadata records about them that are freely and openly shared with the scholarly community, we help them communicate the context and trustworthiness of those objects.
And within that metadata, they can create relationships not just between research objects but also between research stakeholders - the individuals, affiliations, funders, and other players involved. That’s why we work so closely with other parts of foundational scholarly infrastructure (ORCID, DataCite, ROR) and why we now have more than 30 funders registering grants with us. We want to help to capture, identify, and link together all these important elements and more to deliver context for the scholarly record.
We started this blog by talking about the changes that are taking place in the world of research and how the infrastructure needs to adapt and change. Although we have extensive plans in place to improve our contribution to ISR, we need your help to establish whether our role is still the right one, whether we are missing anything and what else we might be able to do.
Join the discussion about the integrity of the scholarly record and the Research Nexus on our Community Forum.
Keep an eye out for future blog posts and meetings. We are having a small, in-person discussion prior to the Frankfurt Book Fair and will report on this in a future blog post.
Sign up to attend Crossref LIVE22 for updates on these topics and all things Crossref.
Join and support initiatives and organisations that we partner with or who use our metadata to look at ethical practices in publishing, for example, COPE, DOAJ, and OASPA, and review the Principles of Transparency in Scholarly Publishing, which these organisations worked on with WAME.
In the coming weeks, we will post more about our product and metadata plans and also about the specifics of membership operations and cases we see and how we’re currently addressing them.