A comprehensive analysis of the usability and archival stability of omics computational tools and resources

Serghei Mangul; Thiago Mosqueiro; Dat Duong; Keith Mitchell; Varuni Sarwal; Brian Hill; Jaqueline Brito; Russell Jared Littman; Benjamin Statz; Angela Ka-Mei Lam; Gargi Dayama; Laura Grieneisen; Lana S. Martin; Jonathan Flint; Eleazar Eskin; Ran Blekhman

doi:10.1101/452532

Abstract

Developing new software tools for the analysis of large-scale biological data is a key component of advancing computational, data-enabled research. Reproducing published findings requires running computational tools on the data generated by such studies, yet little attention is currently paid to the usability and archival stability of the code packaged as computational software tools. Scientific journals require data and code sharing, but none currently require authors to guarantee the usability and long-term archival stability of newly published tools. We estimated the accessibility of computational biology software tools by performing an empirical analysis of the usability and archival stability of 24,490 omics software resources published from 2000 to 2017. We found that 26% of all omics software resources are currently not accessible through the URLs published in the paper. Among the tools selected for our comprehensive and systematic usability test, 49% were deemed “difficult to install,” and 28% could not be installed because of problems in the implementation. Moreover, papers introducing new software received significantly more citations when the authors provided an easy installation process for the published software. We propose several practical solutions, suitable for incorporation into journal policy, for increasing the widespread usability and archival stability of published bioinformatics software.

The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.