Blog

Joe Wass

Joe Wass is Head of the Software Development team, making sure we build and run the right software for our community. He spent his first five years in Crossref Labs getting to know our broad community, with a special focus on finding citations in new places on the web, keeping tabs on the evolving activities of scholars round the world.

Mending Chesterton’s Fence: Open Source Decision-making

Joe Wass – 2024 March 18

In Engineering, POSI

When each line of code is written it is surrounded by a sea of context: who in the community this is for, what problem we’re trying to solve, what technical assumptions we’re making, what we already tried but didn’t work, how much coffee we’ve had today. All of these have an effect on the software we write. By the time the next person looks at that code, some of that context will have evaporated.

Renewed Persistence

Joe Wass – 2023 April 01

In Engineering

We believe in Persistent Identifiers. We believe in defence in depth. Today we’re excited to announce an upgrade to our data resilience strategy. Defence in depth means layers of security and resilience, and that means layers of backups. For some years now, our last line of defence has been a reliable, tried-and-tested technology. One that’s been around for a while. Yes, I’m talking about the humble 5¼ inch floppy disk.

What’s that DOI?

Joe Wass – 2019 January 21

In Event Data, Pidapalooza

This is a long-overdue follow-up to 2016’s “URLs and DOIs: a complicated relationship”. Like that post, this accompanies my talk at PIDapalooza, the festival of open persistent identifiers. I don’t think I need to give a spoiler warning when I tell you that it’s still complicated. But this post presents some vocabulary to describe exactly how complicated it is. Event Data has been up and running and collecting data for a couple of years now, but this post describes changes we made toward the end of 2018.

Hear this, real insight into the inner workings of Crossref

You want to hear more from us. We hear you. We’ve spent the past year building Crossref Event Data, and hope to launch very soon. Building a new piece of infrastructure from scratch has been an exciting project, and we’ve taken the opportunity to incorporate as much feedback from the community as possible. We’d like to take a moment to share some of the suggestions we had, and how we’ve acted on them.

Bridging Identifiers at PIDapalooza

Hello from sunny Girona! I’m heading to PIDapalooza, the Persistent Identifier festival, as it returns for its second year. It’s all about to kick off. One of the themes this year is “bridging worlds”: how to bring together different communities and the identifiers they use. Something I really enjoyed about PIDapalooza last year was the variety of people who came. We heard about some “traditional” identifier systems (at least, it seems that way to us): DOIs for publications, DOIs for datasets, ORCIDs for researchers.

Event Data as Underlying Altmetrics Infrastructure at the 4:AM Altmetrics Conference

I’m here in Toronto and looking forward to a busy week. Maddy Watson and I are in town for the 4:AM Altmetrics Conference, as well as the altmetrics17 workshop and Hack-day. I’ll be speaking at each, and for those of you who aren’t able to make it, I’ve combined both presentations into a handy blog post, which follows on from my last one. But first, nothing beats a good demo. Take a look at our live stream.

You do want to see how it’s made — seeing what goes into altmetrics

There’s a saying about oil, something along the lines of “you really don’t want to see how it’s made”. And whilst I’m reluctant to draw too many parallels between the petrochemical industry and scholarly publishing, there are some interesting comparisons to be drawn. Oil starts its life deep underground as an amorphous sticky substance. Prospectors must identify oil fields, drill, extract the oil and refine it. It finds its way into things as diverse as aspirin, paint and hammocks.

URLs and DOIs: a complicated relationship

As the linking hub for scholarly content, it’s our job to tame URLs and put in their place something better. Why? Most URLs suffer from link rot and can be created, deleted or changed at any time. And that’s a problem if you’re trying to cite them.

Using AWS S3 as a large key-value store for Chronograph

One of the cool things about working in Crossref Labs is that interesting experiments come up from time to time. One experiment, entitled “what happens if you plot DOI referral domains on a chart?” turned into the Chronograph project. In case you missed it, Chronograph analyses our DOI resolution logs and shows how many times each DOI link was resolved per month, and also how many times a given domain referred traffic to DOI links per day.
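
To make the two counts concrete, here is a minimal sketch of that kind of aggregation. It is not Chronograph's actual code: the log format (date, DOI, referrer URL), the sample entries and the `aggregate` function are all assumptions made up for illustration.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical log entries: (ISO date, DOI, referrer URL).
# The real resolution log format is not described in this post.
SAMPLE_LOG = [
    ("2016-01-03", "10.5555/12345678", "https://en.wikipedia.org/wiki/Example"),
    ("2016-01-03", "10.5555/12345678", "https://example.org/page"),
    ("2016-01-17", "10.5555/12345678", "https://en.wikipedia.org/wiki/Other"),
    ("2016-02-01", "10.5555/87654321", ""),  # no referrer recorded
]


def aggregate(entries):
    """Count resolutions per (DOI, month) and referrals per (domain, day)."""
    per_doi_month = Counter()
    per_domain_day = Counter()
    for date, doi, referrer in entries:
        # The first seven characters of an ISO date are the year and month.
        per_doi_month[(doi, date[:7])] += 1
        # hostname is None when the referrer is empty or has no host part.
        domain = urlparse(referrer).hostname
        if domain:
            per_domain_day[(domain, date)] += 1
    return per_doi_month, per_domain_day


doi_counts, domain_counts = aggregate(SAMPLE_LOG)
```

Each counter is keyed by a pair, so a lookup such as `doi_counts[("10.5555/12345678", "2016-01")]` gives the monthly figure for one DOI. Entries without a referrer still count as resolutions but contribute nothing to the per-domain totals.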

HTTPS and Wikipedia

This is a joint blog post with Dario Taraborelli, coming from WikiCite 2016.

In 2014 we were taking our first steps along the path that would lead us to Crossref Event Data. At that time I started looking into the DOI resolution logs to see if we could get any interesting information out of them. This project, which became Chronograph, showed which domains were driving traffic to Crossref DOIs.

You can read about the latest results from this analysis in the “Where do DOI Clicks Come From” blog post.

Having this data tells us, amongst other things:

  • where people are using DOIs in unexpected places
  • where people are using DOIs in unexpected ways
  • where we knew people were using DOIs but the links are more popular than we realised