PT - JOURNAL ARTICLE
AU - Pick, Joel L.
AU - Nakagawa, Shinichi
AU - Noble, Daniel W.A.
TI - Reproducible, flexible and high throughput data extraction from primary literature: The metaDigitise R package
AID - 10.1101/247775
DP - 2018 Jan 01
TA - bioRxiv
PG - 247775
4099 - http://biorxiv.org/content/early/2018/01/15/247775.short
4100 - http://biorxiv.org/content/early/2018/01/15/247775.full
AB - Research synthesis, especially in the form of meta-analysis, requires data extraction from primary studies. Meta-analysis synthesizes effect sizes, often calculated from summary statistics of studies. However, exact values of such statistics are commonly hidden in figures. The R package metaDigitise extracts descriptive statistics such as means, standard deviations and, if applicable, correlations from four types of plots: 1) mean and error plots (e.g. bar graphs with standard errors), 2) box plots, 3) scatter plots and 4) histograms. The package interactively guides the user through the data extraction process. Notably, it enables large-scale extraction from image files, letting the user stop processing, edit and add to the resulting data frame at any point. Further, it facilitates reproducible data extraction from plots with little inter-observer bias, thus allowing a group of people to participate in the extraction of data collaboratively.