Abstract
Drug discovery, and the subsequent availability of a new breakthrough therapeutic or ‘cure’, is a compelling example of societal benefit from research advances. These advances are invariably collaborative, involving the contributions of many scientists to a discovery network in which theory and experiment build on one another. To understand such scientific advances, data mining of public and commercial data sources, coupled with network analysis, can be used as a digital methodology to assemble and analyze the component events in the history of a therapeutic. This methodology is extensible beyond the history of therapeutics, and its more general use supports (i) efficient exploration of the scientific history of a research advance; (ii) documentation and understanding of collaboration; (iii) portfolio analysis, planning, and optimization; and (iv) communication of the societal value of research. As a proof of principle, we have conducted a case study of five anti-cancer therapeutics. We have linked the work of roughly 237,000 authors in 106,000 scientific publications that capture the research crucial for the development of these five therapeutics. We have enriched the networks of these therapeutics by annotating them with information on research awards and on the peer review that preceded these awards. Applying retrospective citation discovery, we have identified a core set of publications cited in the networks of all five therapeutics, additional intersections among combinations of networks, and the awards from the National Institutes of Health that supported this research. Lastly, we have mapped these awards to their cognate peer review panels, identifying another layer of collaborative scientific activity that influenced the research represented in these networks.
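The identification of a core set of publications and of intersections among combinations of networks can be illustrated with a minimal sketch. It assumes each therapeutic's network has already been reduced to a set of cited publication identifiers. The function names, the dictionary layout, and the toy identifiers below are hypothetical and are not drawn from the study's data or software.

```python
from itertools import combinations


def core_publications(networks):
    """Return the publication IDs that appear in every network."""
    sets = [set(ids) for ids in networks.values()]
    if not sets:
        return set()
    return set.intersection(*sets)


def combination_intersections(networks, min_size=2):
    """Map each combination of network names to the IDs shared by all its members.

    Combinations with no shared publications are omitted.
    """
    names = sorted(networks)
    shared = {}
    for k in range(min_size, len(names) + 1):
        for combo in combinations(names, k):
            common = set.intersection(*(set(networks[n]) for n in combo))
            if common:
                shared[combo] = common
    return shared


# Toy example with made-up network names and publication IDs (assumption).
networks = {
    "drug_a": {"p1", "p2", "p3"},
    "drug_b": {"p1", "p2", "p4"},
    "drug_c": {"p1", "p5"},
}

core = core_publications(networks)            # cited in all networks
shared = combination_intersections(networks)  # cited in each combination
```

Enumerating every combination is practical for five networks (26 combinations of size two or more), although the number of combinations grows exponentially with the number of networks.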