
Leveraging a surrogate outcome to improve inference on a partially missing target outcome

Zachary R. McCaw, Sheila M. Gaynor, Ryan Sun, Xihong Lin
doi: https://doi.org/10.1101/2020.11.29.403063
Zachary R. McCaw
1Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115
  • For correspondence: zrmacc@gmail.com
Sheila M. Gaynor
2Department of Biostatistics, MD Anderson Cancer Center, Houston, TX 77030
Ryan Sun
2Department of Biostatistics, MD Anderson Cancer Center, Houston, TX 77030
Xihong Lin
1Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115
3Department of Statistics, Harvard University, Cambridge, MA 02138

Abstract

Sample sizes vary substantially across tissues in the Genotype-Tissue Expression (GTEx) project, where considerably fewer samples are available from certain inaccessible tissues, such as the substantia nigra (SSN), than from accessible tissues, such as blood. This imbalance severely limits power for identifying tissue-specific expression quantitative trait loci (eQTL) in undersampled tissues. Here we propose Surrogate Phenotype Regression Analysis (Spray) for leveraging information from a correlated surrogate outcome (e.g. expression in blood) to improve inference on a partially missing target outcome (e.g. expression in SSN). Rather than regarding the surrogate outcome as a proxy for the target outcome, Spray jointly models the target and surrogate outcomes within a bivariate regression framework. Unobserved values of either outcome are treated as missing data. We describe and implement an expectation conditional maximization algorithm for performing estimation when either outcome may be missing (bilateral outcome missingness). Spray estimates the same association parameter as standard eQTL mapping and controls the type I error even when the target and surrogate outcomes are truly uncorrelated. We demonstrate analytically and empirically, using simulations and GTEx data, that jointly modeling the target and surrogate outcomes increases estimation precision and improves power compared with marginally modeling the target outcome.
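
To make the idea of joint modeling with missing outcomes concrete, the following is a minimal sketch and not the authors' implementation or the API of the SurrogateRegression package. It assumes a bivariate normal model in which both outcomes share the same covariates; under that simplifying assumption the regression update reduces to least squares on the imputed outcomes, so the procedure is a plain EM algorithm rather than the expectation conditional maximization algorithm described above. The function name `fit_bivariate_em` and all simulation settings (sample size, effect sizes, correlation, missingness rate) are hypothetical choices made for illustration.

```python
import numpy as np

def fit_bivariate_em(X, t, s, n_iter=200, tol=1e-8):
    """EM for (t, s) | X ~ N(X @ B, Sigma); NaN marks a missing outcome.

    Illustrative sketch only: assumes both outcomes share the design X.
    """
    Y = np.column_stack([t, s]).astype(float)
    miss = np.isnan(Y)
    keep = ~miss.all(axis=1)              # rows missing both outcomes carry no information
    X, Y, miss = X[keep], Y[keep], miss[keep]
    n = X.shape[0]

    # Initialize from complete cases (both outcomes observed).
    cc = ~miss.any(axis=1)
    B = np.linalg.lstsq(X[cc], Y[cc], rcond=None)[0]
    R = Y[cc] - X[cc] @ B
    Sigma = R.T @ R / cc.sum()

    for _ in range(n_iter):
        # E-step: replace each missing outcome by its conditional mean given
        # the observed outcome, and accumulate the conditional variances.
        M = X @ B
        Yhat = Y.copy()
        C = np.zeros((2, 2))
        for j in (0, 1):
            k = 1 - j
            rows = miss[:, j]
            slope = Sigma[j, k] / Sigma[k, k]
            Yhat[rows, j] = M[rows, j] + slope * (Y[rows, k] - M[rows, k])
            C[j, j] += rows.sum() * (Sigma[j, j] - Sigma[j, k] ** 2 / Sigma[k, k])

        # M-step: least squares on the imputed outcomes, then the covariance.
        B_new = np.linalg.lstsq(X, Yhat, rcond=None)[0]
        R = Yhat - X @ B_new
        Sigma_new = (R.T @ R + C) / n

        delta = max(np.abs(B_new - B).max(), np.abs(Sigma_new - Sigma).max())
        B, Sigma = B_new, Sigma_new
        if delta < tol:
            break
    return B, Sigma

# Hypothetical simulation: the target outcome is missing for about 75% of subjects.
rng = np.random.default_rng(0)
n, beta_true, rho = 2000, 0.5, 0.8
g = rng.binomial(2, 0.3, size=n).astype(float)       # genotype coded 0/1/2
X = np.column_stack([np.ones(n), g])
noise = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=n)
t = beta_true * g + noise[:, 0]                      # target outcome
s = 0.2 * g + noise[:, 1]                            # surrogate outcome
t[rng.random(n) < 0.75] = np.nan

B_hat, Sigma_hat = fit_bivariate_em(X, t, s)
print("estimated genotype effect on target:", B_hat[1, 0])
```

In this sketch the first column of `B_hat` holds the target-outcome coefficients, so `B_hat[1, 0]` estimates the same genotype effect a marginal regression of the target on genotype would estimate, while borrowing information from subjects with only the surrogate observed.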

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

  • Extensive additional simulations have been added to the supporting information, and the data application to GTEx has been reworked.

  • https://github.com/zrmacc/SurrogateRegression

  • https://cran.r-project.org/web/packages/SurrogateRegression/index.html

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
Posted January 20, 2022.

Subject Area

  • Genetics