PT - JOURNAL ARTICLE
AU - Jackson Loper
AU - Trygve Bakken
AU - Uygar Sumbul
AU - Gabe Murphy
AU - Hongkui Zeng
AU - David Blei
AU - Liam Paninski
TI - The Markov link method: a nonparametric approach to combine observations from multiple experiments
AID - 10.1101/457283
DP - 2018 Jan 01
TA - bioRxiv
PG - 457283
4099 - http://biorxiv.org/content/early/2018/10/31/457283.short
4100 - http://biorxiv.org/content/early/2018/10/31/457283.full
AB - This paper studies measurement linkage. An example from cell biology helps explain the problem: imagine that for a given cell we can either sequence the cell’s RNA or examine its morphology, but not both. Given a cell’s morphology, what do we expect to see in its RNA? Given a cell’s RNA, what do we expect in its morphology? More broadly, given a measurement of one type, can we predict measurements of the other type? This measurement linkage problem arises in many scientific and technological fields. To solve this problem, we develop a nonparametric approach we dub the “Markov link method” (MLM). The MLM makes a conditional independence assumption that holds in many multi-measurement contexts and provides a way to estimate the link, the conditional probability of one type of measurement given the other. We derive conditions under which the MLM estimator is consistent, and we use simulated data to show that it provides accurate measures of uncertainty. We evaluate the MLM on real data generated by a pair of single-cell RNA sequencing techniques. The MLM characterizes the link between them and helps connect the two notions of cell type derived from each technique. Further, the MLM reveals that some aspects of the link cannot be determined from the available data, and suggests new experiments that would allow for better estimates.