PT - JOURNAL ARTICLE
AU - Burns, Gully A
AU - Dasigi, Pradeep
AU - Hovy, Eduard H.
TI - Extracting Evidence Fragments for Distant Supervision of Molecular Interactions
AID - 10.1101/192856
DP - 2017 Jan 01
TA - bioRxiv
PG - 192856
4099 - http://biorxiv.org/content/early/2017/09/23/192856.short
4100 - http://biorxiv.org/content/early/2017/09/23/192856.full
AB - We describe a methodology for automatically extracting ‘evidence fragments’ from a set of biomedical experimental research articles. These fragments provide the primary description of evidence that is presented in the papers’ figures. They elucidate the goals, methods, results and interpretations of experiments that support the original scientific contributions of the study being reported. Within this paper, we describe our methodology and showcase an example data set based on the European Bioinformatics Institute’s INTACT database (http://www.ebi.ac.uk/intact/). Using figure codes as anchors, we linked evidence fragments to INTACT data records as an example of distant supervision, so that we could use INTACT’s preexisting, manually curated structured interaction data as a gold standard for machine reading experiments. We report preliminary baseline event extraction measures from this collection based on a publicly available machine reading system (REACH). We use semantic web standards for our data and provide open access to all source code.