Blog

The Thing About DOI

Tony Hammond

Tony Hammond – 2008 June 30

In Discussion

With the Library of Congress having announced LCCN Permalinks some time back (Feb. ’08), and NLM having introduced simplified web links for its PubMed identifier (Mar. ’08), one might be forgiven for wondering what the essential difference is between a DOI name and these (and other) seemingly like-minded identifiers from a purely web point of view. Both of these identifiers can be accessed through very simple URL structures:

Handle System Workshop

Tony Hammond

Tony Hammond – 2008 June 20

In Meetings

I was invited to speak at the Handle System Workshop which was run back to back with an IDF Open Meeting earlier this week in Brussels and hosted at the Office for Official Publications of the European Union. (Location was in the Charlemagne Building, at left in image, within the rather impressive meeting room Jean Durieux, at right.) My talk (‘A Distributed Metadata Architecture’) was focussed on how OpenHandle and XMP could be leveraged to manage dispersed media assets.

PubMed Central Links to Publisher Full Text

Ed Pentz

Ed Pentz – 2008 June 12

In Member Briefing

A Crossref Member Briefing is available that explains how PubMed Central (PMC) links to publisher full text, how PMC uses DOIs and how PMC should be using DOIs. The briefing is entitled “Linking to Publisher Full Text from PubMed Central” (PDF 85k). Crossref considers it very important that PMC uses DOIs as the main means to link to the publisher version of record for an article, and we are recommending that publishers try to convince PMC to use DOIs in an automated way.

Robots: One Standard Fits All

Tony Hammond

Tony Hammond – 2008 June 04

In Search

Interesting post from Yahoo! Search’s Director of Product Management, Priyank Garg, “One Standard Fits All: Robots Exclusion Protocol for Yahoo!, Google and Microsoft”. Interesting also for what it doesn’t talk about. No mention here of ACAP.

Exposing Public Data

Tony Hammond

Tony Hammond – 2008 May 31

In Discussion

As the range of public services (e.g. RSS) offered by publishers has matured, the question arises: how can they expose their public data so that a user may discover it? In particular, with DOI there is now a persistent link infrastructure in place for accessing primary content. How can publishers leverage that infrastructure to advantage? Anyway, I offer this figure to show how I see the current lie of the land as regards DOI services and data.

Dark Side of the DOI

Tony Hammond

Tony Hammond – 2008 May 29

In Handle

For infotainment only (and because it’s a pretty printing). Glimpse into the dark world of DOI. Here, the handle contents for doi:10.1038/nature06930 exposed as a standard OpenHandle ‘Hello World’ document. Browser image courtesy of Processing.js and Firefox 3 RC1.

Referencing OpenURL

Tony Hammond

Tony Hammond – 2008 May 29

In Discussion

So, why is it just so difficult to reference OpenURL? Apart from the standard itself (hardly intended for human consumption - see abstract page here and PDF and don’t even think to look at those links - they weren’t meant to be cited!), seems that the best reference is to the Wikipedia page. There is the OpenURL Registry page at http://alcme.oclc.org/openurl/servlet/OAIHandler?verb=ListSets but this is just a workshop. Not much there beyond the OpenURL registered items.

Tombstone

Tony Hammond

Tony Hammond – 2008 May 23

In Identifiers

So, the big guns have decided that XRI is out. In a message from the TAG yesterday, variously noted as being “categorical” (Andy Powell, eFoundations) and a “proclamation” (Edd Dumbill, XML.com), the co-chairs (Tim Berners-Lee and Stuart Williams) had this to say: “We are not satisfied that XRIs provide functionality not readily available from http: URIs. Accordingly the TAG recommends against taking the XRI specifications forward, or supporting the use of XRIs as identifiers in other specifications.

Metadata Reuse Policies

Tony Hammond

Tony Hammond – 2008 May 20

In Metadata

Following on from yesterday’s post about making metadata available on our Web pages, I wanted to ask here about “metadata reuse policies”. Does anybody have a clue as to what might constitute a best practice in this area? I’m specifically interested in license terms, rather than how those terms would be encoded or carried. Increasingly we are finding more channels to distribute metadata (RSS, HTML, OAI-PMH, etc.) but don’t yet have any clear statement for our customers as to how they might reuse that data.

Nature’s Metadata for Web Pages

Tony Hammond

Tony Hammond – 2008 May 19

In Metadata

Well, we may not be the first but wanted anyway to report that Nature has now embedded metadata (HTML meta tags) into all its newly published pages including full text, abstracts and landing pages (all bar four titles which are currently being worked on). Metadata coverage extends back through the Nature archives (and depth of coverage varies depending on title). This conforms to the W3C’s Guideline 13.2 in the Web Content Accessibility Guidelines 1.0 which exhorts content publishers to “provide metadata to add semantic information to pages and sites”.

Metadata is provided in both DC and PRISM formats as well as in Google’s own bespoke metadata format. This generally follows the DCMI recommendation “Expressing Dublin Core metadata using HTML/XHTML meta and link elements”, and the earlier RFC 2731 “Encoding Dublin Core Metadata in HTML”. (Note that schema name is normalized to lowercase.) Some notes:

  • The DOI is included in the “dc.identifier” term in URI form which is the Crossref recommendation for citing DOI.
  • We could consider adding also “prism.doi” for disclosing the native DOI form. This requires the PRISM namespace declaration to be bumped to v2.0. We might consider synchronizing this change with our RSS feeds which are currently pegged at v1.2, although note that the RSS module mod_prism currently applies only to PRISM v1.2.
  • We could then also add in a “prism.url” term to link back (through the DOI proxy server) to the content site. The namespace issue listed above still holds.
  • The “citation_” terms are not anchored in any published namespace which does make this term set problematic in application reuse. It would be useful to be able to reference a namespace (e.g. “rel="schema.gs" href="..."”) for these terms and to cite them as e.g. “gs.citation_title”.
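To make the notes above concrete, here is a minimal Python sketch of how such a tag set might be generated. It is an illustration only, not Nature's actual code. The function name `meta_tags`, the "doi:" prefix used for the URI form, the PRISM v2.0 namespace URI, the proxy base `https://doi.org/` and the example title are all assumptions introduced for the sketch; "prism.doi" and "prism.url" are the proposed additions discussed above, not terms that are currently emitted.

```python
from html import escape

# Namespace URIs declared via <link rel="schema.*"> elements (lowercase schema names).
DC_NS = "http://purl.org/dc/elements/1.1/"
PRISM_NS = "http://prismstandard.org/namespaces/basic/2.0/"  # assumed PRISM v2.0 URI
DOI_PROXY = "https://doi.org/"  # assumed DOI proxy base


def meta_tags(doi, title):
    """Return HTML head lines carrying DC, PRISM and citation_ metadata (hypothetical helper)."""
    terms = [
        ("dc.identifier", "doi:" + doi),   # DOI in URI form (assumed "doi:" prefix)
        ("dc.title", title),
        ("prism.doi", doi),                # native DOI form (proposed term)
        ("prism.url", DOI_PROXY + doi),    # link back through the DOI proxy (proposed term)
        ("citation_title", title),         # no published namespace to declare for this set
    ]
    lines = [
        f'<link rel="schema.dc" href="{DC_NS}" />',
        f'<link rel="schema.prism" href="{PRISM_NS}" />',
    ]
    for name, content in terms:
        # Escape attribute values so titles containing & or quotes stay well-formed.
        lines.append(f'<meta name="{escape(name)}" content="{escape(content)}" />')
    return lines


if __name__ == "__main__":
    for line in meta_tags("10.1038/nature06930", "Example title"):
        print(line)
```

Note that the “citation_” term is emitted without any schema declaration, which is exactly the reuse problem raised in the last note.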
The HTML metadata sets from an example landing page are presented below.