The research nexus - better research through better metadata
Researchers are adopting new tools that create consistency and shareability in their experimental methods. Increasingly, these are viewed as key components in driving reproducibility and replicability. They provide transparency in reporting key methodological and analytical information. They are also used for sharing the artifacts which make up a processing trail for the results: data, material, analytical code, and related software on which the conclusions of the paper rely. Where expert feedback was also shared, such reviews further enrich this record. We capture these ideas and build on the notion of the “article nexus” blogpost with a new variation: “the research nexus.”
Some members of Crossref’s publishing community are encouraging the scholarly communication practices surrounding these tools in a variety of ways: incorporating them into the publishing workflow, integrating the tools with publishing systems, and linking and exposing the artifacts in publications for readers to access. A special set of publishers have gone all the way and included these links in their Crossref metadata records. They insert them directly into the metadata deposit when they register the content (technical documentation). As a result, these connections reach further than the publisher platform and propagate to systems across the research ecosystem, including indexers, research information management systems, and sharing platforms (oh, the list goes on!). We highlight a small set of examples to illustrate how these outstanding publishing practices are supporting good research.
1. Linking to an entire collection of methods
Crossref member Protocols.io is supporting transparency and methods reproducibility with its open access repository of science methods. Leitão-Goncalves R, Carvalho-Santos Z, Francisco AP, et al. investigated the concerted action of the commensal bacteria Acetobacter pomorum and Lactobacilli in Drosophila melanogaster, demonstrating how the interaction of specific nutrients within the microbiome can shape behavioral decisions and life history traits. Findings were published in PLOS Biology earlier this year: https://doi.org/10.1371/journal.pbio.2000862. The authors deposited the detailed methods and protocols used in the project (Drosophila rearing, media preparations, and microbial manipulations) as a collection in Protocols.io: https://doi.org/10.17504/protocols.io.hdtb26n. Protocols.io registered their content with us, linking the protocol to the paper. This creates the crosswalk between the two so that users can get from one to the other through the metadata. The full metadata record can be found here.
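To make the idea of a link in the deposit more concrete, here is a minimal sketch of a relationship fragment built in Python. It assumes the element and attribute names of Crossref's relations schema (`program`, `related_item`, `inter_work_relation`) and its namespace; the helper `build_relation` is hypothetical, and the relationship type `isSupplementTo` is chosen for illustration only and is not necessarily what Protocols.io deposited. Check the technical documentation before using it in a real deposit.

```python
import xml.etree.ElementTree as ET

# Namespace of Crossref's relations schema (an assumption in this sketch;
# confirm it against the deposit documentation).
REL_NS = "http://www.crossref.org/relations.xsd"


def build_relation(target_doi, relationship_type, description=""):
    """Return a <rel:program> element linking the deposited item to target_doi."""
    ET.register_namespace("rel", REL_NS)
    program = ET.Element(f"{{{REL_NS}}}program")
    item = ET.SubElement(program, f"{{{REL_NS}}}related_item")
    if description:
        desc = ET.SubElement(item, f"{{{REL_NS}}}description")
        desc.text = description
    link = ET.SubElement(
        item,
        f"{{{REL_NS}}}inter_work_relation",
        {"relationship-type": relationship_type, "identifier-type": "doi"},
    )
    link.text = target_doi
    return program


# Illustrative values: the paper DOI comes from the example above, but the
# relationship type and description are assumptions.
fragment = build_relation(
    "10.1371/journal.pbio.2000862",
    "isSupplementTo",
    "Paper that uses this protocol collection",
)
xml_text = ET.tostring(fragment, encoding="unicode")
print(xml_text)
```

The fragment would sit inside the metadata deposit for the protocol collection, so that the protocol's own record points at the paper.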
2. Linking to video protocol
If a picture is worth a thousand words, the truism might apply to moving pictures many times over. Fasel B, Spörri J, Schütz P, et al. proposed a set of calibration movements optimized for alpine skiing and validated the 3D joint angles of the knee, hip, and trunk during alpine skiing in a PLOS ONE paper: https://doi.org/10.1371/journal.pone.0181446. These movements consisted of squats, trunk rotations, hip ad/abductions, and upright standing. The specific team responsible for designing them (Fasel B, Spörri J, Kröll J, and Aminian K) described the set of calibration movements performed but found videos to be a far more effective way to communicate the technical movements used in their study. They made the visuals available too: https://doi.org/10.17504/protocols.io.itrcem6. Protocols.io deposited the link between the video protocol and the paper in the Crossref metadata record (full metadata record).
3. Linking to software and peer reviews
The Journal of Open Source Software (JOSS) is an academic journal about high-quality research software across broadly diverse disciplines. Sara Mahar works on the effectiveness of organizations funded by the US Department of Housing and Urban Development to combat homelessness. She collaborated with computational physicist Matthew Bellis to create a Python tool for researchers to visualize and analyze data from the Homeless Management Information System: https://doi.org/10.21105/joss.00384. The software was archived in Zenodo: https://doi.org/10.5281/zenodo.13750 and the peer review artifacts were also published. JOSS deposited all these links in the metadata record (found here).
4. Linking to preprint, data, code, source code, peer reviews
GigaScience, published by Oxford University Press, is experimenting with a number of new tools in its mission to promote reproducibility of analyses and data dissemination, organization, understanding, and use. In a recent paper, Luo R, Schatz M, and Salzberg S shared the results of the first publicly available implementation of variant calling using a 16-genotype probabilistic model for germline variant detection: https://doi.org/10.1093/gigascience/gix045. Prior to formal peer review, the group posted the preprint in bioRxiv: https://doi.org/10.1101/111393. When the paper was published, the authors made the supporting data available, including snapshots of the test and result data, in a public repository: http://dx.doi.org/10.5524/100316. OUP included this data citation in their Crossref metadata record via the routes recommended in our previous blog post about depositing data citations. The researchers made the code available in GitHub, and the algorithm is ready for researchers to run on Code Ocean, a cloud-based computational reproducibility platform that allows researchers to wrap and encapsulate the data, code, and computation environment linked to an article: https://doi.org/10.24433/CO.0a812d9b-0ff3-4eb7-825f-76d3cd049a43. For further transparency, expert reviews of the manuscript from the peer review history were published in Publons: http://dx.doi.org/10.5524/review.100737 and http://dx.doi.org/10.5524/review.100738. (As of last month, publishers can register peer reviews at Crossref.) The full metadata record contains links to the entire set of materials listed above.
5. Linking to preprint, code, Docker Hub, video, reviews
Narechania A, Baker R, DeSalle R, et al. used bird flocking behavior to design an algorithm, Clusterflock, for optimizing distance-based clusters in orthologous gene families that share an evolutionary history. Their paper was published in GigaScience last year: https://doi.org/10.1186/s13742-016-0152-3. Supporting data, code snapshots, and video were published in GigaDB: http://dx.doi.org/10.5524/100247. Code was maintained in GitHub. The authors also created a Docker application for Clusterflock, a lightweight, stand-alone, executable package of the software which includes everything needed to run it: code, runtime, system tools, system libraries, and settings (Docker Hub link here). They created a video demo of the algorithm. Publons reviews were published: http://dx.doi.org/10.5524/review.100507 and http://dx.doi.org/10.5524/review.100508.
GigaScience shared all these assets in their publication, including the link to the original bioRxiv preprint: https://www.biorxiv.org/content/early/2016/03/25/045773. The full metadata record containing these links can be found here.
These five are just a few exemplary cases showing how publishers are declaring the relationships between their publications and other associated artifacts to support reproducibility and discoverability of their content. We welcome you to check out our overview of relationships between DOIs and other materials for more information. If you are a member enriching your publishing pipeline in similar ways, please register these links so that your content reaches further. We also welcome everyone to retrieve these relations through our REST API (technical documentation).
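For readers who want to try retrieving relations themselves, the following is a minimal sketch in Python rather than a definitive client. It assumes that a record from `https://api.crossref.org/works/{doi}` carries a `relation` object keyed by relationship type, with each entry holding `id-type` and `id`; the function names `fetch_work` and `list_relations` and the sample record are hypothetical and introduced only for illustration.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.crossref.org/works/"


def fetch_work(doi):
    """Fetch the metadata record for one DOI from the Crossref REST API."""
    url = API_ROOT + urllib.parse.quote(doi)
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)["message"]


def list_relations(work):
    """Flatten a record's `relation` field into (type, id_type, id) tuples."""
    rows = []
    for rel_type, targets in sorted(work.get("relation", {}).items()):
        for target in targets:
            rows.append((rel_type, target.get("id-type"), target.get("id")))
    return rows


# Hypothetical record fragment, shaped the way the `relation` field is
# assumed to look; the identifiers are placeholders, not real deposits.
sample = {
    "DOI": "10.5555/example-paper",
    "relation": {
        "is-supplemented-by": [
            {"id-type": "doi", "id": "10.5555/example-software"}
        ],
        "has-review": [
            {"id-type": "uri", "id": "https://example.org/reviews/1"}
        ],
    },
}

for rel_type, id_type, identifier in list_relations(sample):
    print(f"{rel_type}: {identifier} ({id_type})")
```

With network access, `list_relations(fetch_work("10.21105/joss.00384"))` would apply the same flattening to a live record, such as the JOSS paper described above.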