Blog

PDF-Extract

Crossref Labs is happy to announce the first public release of “pdf-extract”, an open source set of tools and libraries for extracting citation references (and, eventually, other semantic metadata) from PDFs. We first demonstrated this tool to Crossref members at our annual meeting last year. See the pdf-extract labs page for a detailed introduction to this new set of tools.

If you are unable to download and install the tool, you can play with an experimental web interface called “Extracto.” Be warned, Extracto is running on a very feeble server using an erratic and slow internet connection. The only guarantee that we can make about using it is that it will repeatedly fall over and annoy you. The weasel has spoken.

DOIs for PHD Comics’ Valentine’s Day Reading List

Geoffrey Bilder – 2012 February 14

In Comics, DOIs

PHD Comics has posted its Valentine’s Day Reading list. Without DOIs! So in order to preserve the scholarly citation record, we’ve resolved those that have DOIs….

Title: The St. Valentine’s Day Frontal Passage
Citation: Sassen, K, 1980, ‘The St. Valentine’s Day Frontal Passage’, Bulletin of the American Meteorological Society, vol. 61, no. 2, p. 122.
Crossref DOI: http://dx.doi.org/10.1175/1520-0477(1980)061<0122:TSVDFP>2.0.CO;2

Title: SUICIDE AND HOMICIDE ON ST. VALENTINE’S DAY
Citation: LESTER, D, 1990, ‘SUICIDE AND HOMICIDE ON ST.

Turning DOIs into formatted citations

Today two new record types were added to dx.doi.org resolution for Crossref DOIs. These allow anyone to retrieve DOI bibliographic metadata as formatted bibliographic entries. To perform the formatting we’re using the Citation Style Language processor citeproc-js, which supports a shed load of citation styles and locales. In fact, all the styles and locales found in the CSL repositories are supported, including many common styles such as bibtex, apa, ieee, harvard, vancouver and chicago.
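As a rough illustration, here is how a request for a formatted citation might be built in Python. The excerpt above does not name the record types, so the `text/x-bibliography` media type and its `style` and `locale` parameters are assumptions taken from the DOI content negotiation documentation, and `citation_request` and `fetch_citation` are hypothetical helper names.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen


def citation_request(doi, style="apa", locale="en-US"):
    """Build (but do not send) a request for a formatted citation.

    Assumption: the formatted-citation record type is selected with the
    text/x-bibliography media type plus style and locale parameters.
    """
    accept = f"text/x-bibliography; style={style}; locale={locale}"
    # Percent-encode characters such as < and > that some DOIs contain.
    url = "http://dx.doi.org/" + quote(doi, safe="/")
    return Request(url, headers={"Accept": accept})


def fetch_citation(doi, style="apa", locale="en-US", timeout=10):
    """Send the request and return the formatted entry as text."""
    request = citation_request(doi, style, locale)
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


# Usage (needs network access):
# print(fetch_citation("10.1126/science.1157784", style="ieee"))
```

Keeping request construction separate from the network call makes it easy to inspect the `Accept` header before anything is sent.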

Determining the Crossref membership status of a domain

We’ve been asked a few times if it is possible to determine whether or not a particular domain name belongs to a Crossref member. To address this we’re launching another small service that performs something like a “reverse look-up” of URLs and domain names to DOIs and Crossref member status. The service provides an API that will attempt to reverse look-up a URL to a DOI and return the membership status (member or non-member) of the root domain of the URL.

DataCite supporting content negotiation

In April, Crossref started supporting content negotiation for its DOIs. At the time I cheekily called out DataCite to start supporting content negotiation as well. Edward Zukowski (DataCite’s resident propeller-head) took up the challenge with gusto and, as of September 22nd, DataCite has also been supporting content negotiation for its DOIs. This means that one million more DOIs are now linked-data friendly. Congratulations to Ed and the rest of the team at DataCite. We hope this is a trend.

Family Names Service

Karl Ward – 2011 October 06

In APIs, Family Names

Today I’m announcing a small web API that wraps a family name database here at Crossref R&D. The database, built from Crossref’s metadata, lists all unique family names that appear as contributors to articles, books, datasets and so on that are known to Crossref. As such the database likely accounts for the majority of family names represented in the scholarly record. The web API comes with two services: a family name detector that will pick out potential family names from chunks of text and a family name autocompletion system.

Content Negotiation for Crossref DOIs

So does anybody remember the posting DOIs and Linked Data: Some Concrete Proposals? Well, we went with option “D.” From now on, DOIs, expressed as HTTP URIs, can be used with content-negotiation. Let’s get straight to the point. If you have curl installed, you can start playing with content-negotiation and Crossref DOIs right away:

    curl -D - -L -H "Accept: application/rdf+xml" "http://dx.doi.org/10.1126/science.1157784"
    curl -D - -L -H "Accept: text/turtle" "http://dx.doi.org/10.1126/science.1157784"
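For readers without curl, the same two requests can be sketched with Python's standard library. The media types and the DOI come from the curl commands in the post; `negotiated_request` and `fetch` are hypothetical helper names, and the shape of the response is not something this sketch guarantees.

```python
from urllib.request import Request, urlopen

# The two RDF serializations requested by the curl commands in the post.
RDF_MEDIA_TYPES = ("application/rdf+xml", "text/turtle")


def negotiated_request(doi, media_type):
    """Build (but do not send) a content-negotiated request for a DOI."""
    return Request(f"http://dx.doi.org/{doi}", headers={"Accept": media_type})


def fetch(doi, media_type, timeout=10):
    """Send the request and return (Content-Type, body).

    urlopen follows HTTP redirects by default, much like curl's -L flag.
    """
    request = negotiated_request(doi, media_type)
    with urlopen(request, timeout=timeout) as response:
        return response.headers.get("Content-Type"), response.read().decode("utf-8")


# Usage (needs network access):
# for media_type in RDF_MEDIA_TYPES:
#     content_type, body = fetch("10.1126/science.1157784", media_type)
#     print(content_type, len(body))
```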

Monitoring Crossref Technical Developments

Anna Tolwinska – 2011 March 29

In Support

Announcements regarding Crossref system status or changes are posted in an Announcements forum on our support portal (http://support.crossref.org). We recommend that someone from your organization monitor this forum to stay informed about Crossref system status, schema changes, or other issues affecting deposits and queries. Subscribe to this forum via RSS feed (https://support.crossref.org/hc/en-us) or select the ‘Subscribe’ option in the forum to subscribe by email. The TWG Discussion forum replaces the TWG mailing list and can be accessed by members of the Crossref community who log in to our support portal.

Add linked images to PDFs

Geoffrey Bilder – 2010 August 16

In Crossref Labs, PDF

While working on an internal project, we developed “pdfstamp”, a command-line tool that allows one to easily apply linked images to PDFs. We thought some in our community might find it useful and have released it on GitHub. Some more PDF-related tools will follow soon.

XMP in RSC PDFs

Crossref

admin – 2010 August 03

In Identifiers, PDF, XMP, InChI

Just a quick heads-up to say that we’ve had a go at incorporating InChIs and ontology terms into our PDFs with XMP. There isn’t a lot of room in an XMP packet so we’ve had to be a bit particular about what we include. InChIs: the bigger the molecule the longer the InChI, so we’ve standardized on the fixed-length InChIKey. This doesn’t mean anything on its own, so we’ve gone the Semantic Web route of including an InChI resolver HTTP URI.
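To make the fixed-length point concrete, here is a small Python sketch that checks the shape of a standard InChIKey (27 characters: 14 letters, a hyphen, 10 letters, a hyphen, 1 letter) and turns it into a resolver-style HTTP URI. The post does not name the resolver, so the `http://example.org/inchikey/` base and the `resolver_uri` helper are purely hypothetical placeholders.

```python
import re

# Shape of a standard InChIKey: 14 + 10 + 1 uppercase letters, hyphen-separated.
INCHIKEY_RE = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def is_inchikey(value):
    """Return True if value has the fixed 27-character InChIKey shape."""
    return len(value) == 27 and INCHIKEY_RE.fullmatch(value) is not None


def resolver_uri(inchikey, base="http://example.org/inchikey/"):
    """Build an HTTP URI for an InChIKey.

    Assumption: `base` is a placeholder, not the resolver used in the PDFs.
    """
    if not is_inchikey(inchikey):
        raise ValueError(f"not an InChIKey: {inchikey!r}")
    return base + inchikey


# The InChIKey for water; the full InChI string fails the shape check.
print(resolver_uri("XLYOFNOQVPJJNP-UHFFFAOYSA-N"))
print(is_inchikey("InChI=1S/H2O/h1H2"))
```

The shape check only validates the format; it cannot tell whether a key corresponds to a real molecule.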