TY  - JOUR
T1  - Reproducible, flexible and high throughput data extraction from primary literature: The metaDigitise R package
JF  - bioRxiv
DO  - 10.1101/247775
SP  - 247775
AU  - Joel L. Pick
AU  - Shinichi Nakagawa
AU  - Daniel W.A. Noble
Y1  - 2018/01/01
UR  - http://biorxiv.org/content/early/2018/01/15/247775.abstract
N2  - Research synthesis, especially in the form of meta-analysis, requires data extraction from primary studies. Meta-analysis synthesizes effect sizes, which are often calculated from the summary statistics of studies. However, the exact values of such statistics are commonly hidden in figures. The R package metaDigitise extracts descriptive statistics such as means, standard deviations and, if applicable, correlations from four types of plots: 1) mean and error plots (e.g. bar graphs with standard errors), 2) box plots, 3) scatter plots and 4) histograms. The package interactively guides the user through the data extraction process. Notably, it enables large-scale extraction from image files, letting the user stop processing, edit and add to the resulting data frame at any point. Further, it facilitates reproducible data extraction from plots with little inter-observer bias, thus allowing a group of people to participate in the extraction of data collaboratively.
ER  - 