

Members will soon be able to assign Crossref DOIs to preprints

Geoffrey Bilder

Geoffrey Bilder – 2016 May 05

In Preprints

TL;DR

By August 2016, Crossref will enable its members to assign Crossref DOIs to preprints. Preprint DOIs will be assigned by the Crossref member responsible for the preprint and that DOI will be different from the DOI assigned by the publisher to the accepted manuscript and version of record. Crossref’s display guidelines, tools and APIs will be modified in order to enable researchers to easily identify and link to the best available version of a document (BAV). We are doing this in order to support the changing publishing models of our members and in order to clarify the scholarly citation record.

Background

Why is this news? Well, to understand that you need to know a little Crossref history.

(cue music and fade to sepia) 


When Crossref was founded, one of its major goals was to clarify the scholarly record by uniquely identifying formally published scholarly content on the web so that it could be cited precisely. At the time, our members had two primary concerns:

  • That a Crossref DOI should point to one intellectually discrete scholarly document. That is, they did not want one Crossref DOI to be assigned to two documents that appeared largely similar, but which might vary in intellectually significant ways.

  • That two DOIs should not point to the same intellectually discrete document. They wanted it to be easy for all to tell when the same discrete intellectual content was cited.

As such, when Crossref was founded, we developed a complex set of rules that were colloquially known by our members as Crossref’s rules “prohibiting the assignment of DOIs to duplicative content.”

(cue music, show wavy lines, return to color)

Well… as we gained experience in assigning DOIs, many of these rules were amended or discarded once it became apparent that they didn’t actually support common scholarly citation practice and/or otherwise muddied the scholarly citation record.

For example, sometimes a document will be republished in a special issue or an anthology. Before the advent of the DOI, it was common citation practice to always cite a document in the context in which it was read. The context of the document could, after all, affect the interpretation or crediting of the work. But it would be impossible to support this common citation practice if we were to assign the same Crossref DOI to the article in both its original context and its republished form. Our current recommendation in these situations is to assign separate DOIs to content that is republished in another context.

Another example occurs when a particular copy of two identical documents has been annotated. For example, though the Handbook to The birds of Australia by John Gould has its own Crossref DOI (http://doi.org/10.5962/bhl.title.8367), another copy of the same book has been hand-annotated by Charles Darwin and also has its own, different Crossref DOI (http://doi.org/10.5962/bhl.title.50403). Historians of science quite reasonably may want to refer to and cite the particular annotated copy of this historic document.

[So much for not assigning two separate Crossref DOIs to identical documents.]

Finally, we should note a far more common practice in our industry. Our members often make content available online with a Crossref DOI before they consider it to be formally published. This practice goes by a number of names including “publish ahead of print,” “article in progress,” “article in press,” “online ahead of print,” “online first,” etc.

But in each case, the process is the same: the publisher assigns a Crossref DOI to the document soon after it has been accepted for publication, and this same Crossref DOI is carried over to the finally published article. Again, this practice just reflects that the “intellectual” content of the accepted manuscript should not change between the point of acceptance and the point of publication, so for the purposes of “citation” they are largely interchangeable.

[So much for not assigning one Crossref DOI to two versions of the same document.]

Now, in the above cases it helps to clarify the scholarly record if members also specify that the respective Crossref DOIs of the original and the “duplicative” work are related, and we encourage our members to make these connections explicit when they can. Nonetheless, it is paramount in both cases to allow the “duplicative works” to be cited precisely and independently.

Which brings us back to preprints.

The case for preprints

First we should define what is meant by preprints, because even this commonly used term sometimes means different things to different communities. We have historically considered preprints to be any version of a manuscript that is intended for publication but that has not yet been submitted to a publisher for formal review. Note that this definition does not include “accepted manuscripts” which, as we noted above, often already have Crossref DOIs assigned to them soon after acceptance.

Crossref members originally worried that, by assigning DOIs to preprints, we would end up muddying the scholarly record. They worried that the very presence of a Crossref DOI would be interpreted to mean that the content to which it had been applied had gone through a formal publishing process. And unlike the case with “accepted manuscripts”, the difference between intellectual content of a preprint and the final published version can sometimes be substantial. At the time, it seemed that the scholarly record would be clarified by prohibiting the assignment of DOIs to preprints.

But again, changes in the scholarly communication landscape have led us to, as the youngsters say, pivot.

A Koan

When is a preprint a preprint?

Crossref has always been catholic in its definition of “publisher.” Many of our members do not consider “publishing” to be their primary mission. The OECD and World Bank are two obvious cases here. But our membership also includes government departments, universities and archives. In these latter cases they have traditionally assigned Crossref DOIs to things like internal reports, grey literature, working papers, etc. This activity was clearly within the original rules set out by Crossref. And this is where our koan comes into play: “when is a preprint a preprint?”

It is often difficult to predict when something might eventually be formally published. How do you know a priori that a working paper will never be submitted for publication? After all, everything could potentially be submitted for publication. (Sometimes it seems everything is.)

This is the dilemma that was faced by a few of our members. For example, Cold Spring Harbor Laboratory, which runs bioRxiv, has been a Crossref member since 2000 and has assigned over 35,000 Crossref DOIs. They have been assiduous in trying to stick to Crossref’s rules about preprints. Furthermore, they have taken equal care to ensure that preprints in bioRxiv are labeled as such and linked to the final publication (via a Crossref journal DOI) when it is available. This takes a lot of work.

But often bioRxiv simply has no way of telling when the authors of a working paper or report might suddenly decide to submit their work for publication. So they have found themselves occasionally and inadvertently violating Crossref’s rules on preprints because they had no way of predicting when something would magically transform from being an innocuous working paper into a fraught preprint.

It is a testament to bioRxiv that they have persevered. We have other members who face the same problem. They have not given up. They have not gone elsewhere for their DOIs.

Which brings us to our next point.

Not All DOIs

Have you noticed how often we use the phrase “Crossref DOIs?” Were you wondering if this was an annoying affectation or an example of a marketing department gone mad? It’s neither. It is an essential distinction that we make because Crossref is just one of several DOI registration agencies. Although all DOIs are “compatible” in the minimal sense that you can “resolve” them to a location on the web, that does not mean that all DOIs work identically. Different DOI registration agencies have different constituencies, different services, different governance models and different rules covering what their members can assign their respective DOIs to.

This was not the case when Crossref was founded and our rules were first drafted. At the time, Crossref was the only registration agency and, as such, the rule which prohibited the assignment of Crossref DOIs to preprints kinda worked. But it was unworkable in the longer term.

Quite naturally, new DOI registration agencies have been established for different communities with different primary use-cases. While Crossref could have a rule prohibiting the assignment of Crossref DOIs to preprints, there was nothing stopping another registration agency from allowing (indeed, encouraging) its members to assign DOIs to preprints.

So the simple fact is that DOIs could be assigned to preprints regardless of Crossref’s old rules. By continuing to prohibit the practice at Crossref we were just making life more difficult for some of our existing members.

And it has become clear that the situation would only get worse as more of our members started to roll out new publishing and business models.
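To make the “minimal” compatibility described above a little more concrete, here is a small Python sketch. It assumes only the general shape of a DOI (a prefix starting with `10.`, a slash, then a suffix) and the doi.org resolver form already used for the links in this post. The names `Doi`, `parse_doi` and `resolver_url` are ours, invented for illustration; they are not part of any Crossref tool, and the pattern is a simplification rather than a full validator.

```python
import re
from typing import NamedTuple
from urllib.parse import quote


class Doi(NamedTuple):
    prefix: str  # e.g. "10.5962"
    suffix: str  # e.g. "bhl.title.8367"


# Simplified pattern (an assumption for this sketch): "10." plus 4-9 digits,
# a slash, then any run of non-whitespace characters.
_DOI_RE = re.compile(r"^(10\.\d{4,9})/(\S+)$")

# Leading forms we strip before matching (illustrative, not exhaustive).
_LEADS = ("https://doi.org/", "http://doi.org/", "doi:")


def parse_doi(text: str) -> Doi:
    """Split a DOI, or a doi.org link to one, into prefix and suffix."""
    candidate = text.strip()
    for lead in _LEADS:
        if candidate.lower().startswith(lead):
            candidate = candidate[len(lead):]
            break
    match = _DOI_RE.match(candidate)
    if match is None:
        raise ValueError(f"not a recognisable DOI: {text!r}")
    return Doi(match.group(1), match.group(2))


def resolver_url(doi: Doi) -> str:
    """Build the resolver link; this step looks the same for any DOI."""
    return "https://doi.org/" + quote(f"{doi.prefix}/{doi.suffix}", safe="/")


# The two Gould DOIs mentioned earlier in this post.
gould = parse_doi("http://doi.org/10.5962/bhl.title.8367")
darwin = parse_doi("http://doi.org/10.5962/bhl.title.50403")
print(resolver_url(gould))
print(gould.prefix == darwin.prefix, gould == darwin)
```

Notice what the sketch cannot tell you: nothing in the syntax of a DOI names the registration agency behind it, or the rules and services that come with it. That is exactly why we keep saying “Crossref DOIs.”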

Business model neutral 

Crossref has always been business model neutral. We need to adapt and change to support our members’ business models, not the other way around.

A number of our members are starting to adopt publishing workflows that are more fluid and public than established publishing models. These new workflows make much of the submission and review process open, which, in turn, often blurs the historically hard distinctions between a draft manuscript, a preprint, a revised proof, an accepted manuscript, the “final” published version, and subsequent corrections and updates. Whereas in classic publishing models a document went through a series of discrete state-changes (some in public, many in private), new publishing workflows treat document versions as a continuum, most of which are made available publicly and which consequently may be used and cited at almost any point in the publishing process.

In short, Crossref’s members increasingly need the flexibility to assign DOIs at different points in the publishing lifecycle. Rather than enforce rules that enshrined an existing publishing or business model, we need to work with our members to establish and adopt new DOI assignment practices that support evolving publishing models whilst maintaining a clear citation record and letting researchers easily identify the best available version (BAV) of a document or research object.

So you see, not all of our motivations for this change in policy are opportunistic or prosaic. Underneath our gruff and flinty exterior is a soft, idealistic center. There are principles at work here as well.

What next

So this isn’t just a matter of changing our rules and display guidelines. We also have to make some schema changes, and adjust our services and APIs to clearly distinguish between preprints and accepted manuscripts/versions of record. Additionally, we will be building tools to make it much easier for our members to link preprints to the final published article (and vice versa). Finally, we need to update our documentation to help our members take advantage of the new functionality. We expect that everything will be in place by the end of August, 2016, at which point you will see another announcement from us.
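None of the schema or API changes exist yet, so the following is only a thought experiment in Python about what “linking preprints to the final published article (and vice versa)” and finding the best available version might look like. Everything in it is hypothetical: the `Stage` ordering, the `Record` fields, the `best_available_version` function and the placeholder DOIs are our assumptions for illustration, not the schema or API that Crossref will ship.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Dict, List, Optional


class Stage(IntEnum):
    # Hypothetical ordering: a higher value means closer to the formal record.
    PREPRINT = 1
    ACCEPTED_MANUSCRIPT = 2
    VERSION_OF_RECORD = 3


@dataclass
class Record:
    doi: str
    stage: Stage
    # DOIs of other versions of the same work (links may go both ways).
    related: List[str] = field(default_factory=list)


def best_available_version(start_doi: str, records: Dict[str, Record]) -> Record:
    """Follow the version links from start_doi and return the record
    with the highest stage among everything reachable."""
    seen = set()
    todo = [start_doi]
    best: Optional[Record] = None
    while todo:
        doi = todo.pop()
        if doi in seen or doi not in records:
            continue  # skip links we already visited or cannot look up
        seen.add(doi)
        record = records[doi]
        if best is None or record.stage > best.stage:
            best = record
        todo.extend(record.related)
    if best is None:
        raise KeyError(start_doi)
    return best


# Placeholder DOIs, invented for this example.
records = {
    "10.5555/preprint.1": Record(
        "10.5555/preprint.1", Stage.PREPRINT, ["10.5555/journal.1"]
    ),
    "10.5555/journal.1": Record(
        "10.5555/journal.1", Stage.VERSION_OF_RECORD, ["10.5555/preprint.1"]
    ),
    "10.5555/preprint.2": Record("10.5555/preprint.2", Stage.PREPRINT),
}

print(best_available_version("10.5555/preprint.1", records).doi)
print(best_available_version("10.5555/preprint.2", records).doi)
```

In this toy model a preprint that has been linked to its published article leads the reader to the article, while a preprint with no published version is itself the best available version. Each version keeps its own DOI, so either one can still be cited precisely.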


Page owner: Geoffrey Bilder   |   Last updated 2016-May-05