bioRxiv
New Results

Sensitive and reproducible cell-free methylome quantification with synthetic spike-in controls

Samantha L. Wilson, Shu Yi Shen, Lauren Harmon, Justin M. Burgener, Tim Triche Jr., Scott V. Bratman, Daniel D. De Carvalho, Michael M. Hoffman
doi: https://doi.org/10.1101/2021.02.12.430289
Samantha L. Wilson
1Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
Shu Yi Shen
1Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
Lauren Harmon
2Van Andel Institute, Grand Rapids, MI, United States of America
Justin M. Burgener
1Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
3Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
Tim Triche Jr.
2Van Andel Institute, Grand Rapids, MI, United States of America
Scott V. Bratman
1Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
3Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
Daniel D. De Carvalho
1Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
3Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
Michael M. Hoffman
1Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
3Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
4Department of Computer Science, University of Toronto, Toronto, ON, Canada
5Vector Institute for Artificial Intelligence, Toronto, ON, Canada
  • For correspondence: michael.hoffman@utoronto.ca

Abstract

Background Cell-free methylated DNA immunoprecipitation-sequencing (cfMeDIP-seq) identifies genomic regions with DNA methylation, using a protocol adapted to work with low-input DNA samples and with cell-free DNA (cfDNA). This method allows for DNA methylation profiling of circulating tumour DNA in cancer patients’ blood samples. Such epigenetic profiling of circulating tumour DNA provides information about the tissue in which the tumour DNA originated, a key requirement of any test for early cancer detection. In addition, DNA methylation signatures provide prognostic information and can detect relapse. For robust quantitative comparisons between samples, immunoprecipitation enrichment methods like cfMeDIP-seq require normalization against common reference controls.

Methods To provide a simple and inexpensive reference for quantitative normalization, we developed a set of synthetic spike-in DNA controls for cfMeDIP-seq. These controls account for technical variation in enrichment efficiency due to biophysical properties of DNA fragments. Specifically, we designed 54 DNA fragments with combinations of methylation status (methylated and unmethylated), fragment length (80 bp, 160 bp, 320 bp), G+C content (35%, 50%, 65%), and fraction of CpG dinucleotides within the fragment (1/80 bp, 1/40 bp, 1/20 bp). We ensured that the spike-in synthetic DNA sequences do not align to the human genome. We integrated unique molecular indices (UMIs) into cfMeDIP-seq to control for differential amplification after enrichment. To assess enrichment bias according to distinct biophysical properties, we conducted cfMeDIP-seq solely on spike-in DNA fragments. To optimize the amount of spike-in DNA required, we added varying quantities of spike-in control DNA to sheared HCT116 colon cancer genomic DNA prior to cfMeDIP-seq. To assess batch effects, three separate labs conducted cfMeDIP-seq on peripheral blood plasma samples from acute myeloid leukemia (AML) patients.
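
The count of 54 fragments follows from the full factorial design described above: 2 methylation states × 3 lengths × 3 G+C levels × 3 CpG fractions. A minimal sketch that enumerates this design is shown here. The field names and the derived `cpg_count` column (fragment length multiplied by CpG fraction) are illustrative assumptions, not part of the published design files.

```python
from fractions import Fraction
from itertools import product

# Factor levels taken from the Methods text.
METHYLATION = ("methylated", "unmethylated")
LENGTHS_BP = (80, 160, 320)
GC_PERCENT = (35, 50, 65)
CPG_FRACTIONS = (Fraction(1, 80), Fraction(1, 40), Fraction(1, 20))


def enumerate_design():
    """Return one record per combination of the four design factors."""
    return [
        {
            "methylation": status,
            "length_bp": length,
            "gc_percent": gc,
            "cpg_fraction": cpg_fraction,
            # Assumption: CpG count is length x fraction (exact for these levels).
            "cpg_count": int(length * cpg_fraction),
        }
        for status, length, gc, cpg_fraction in product(
            METHYLATION, LENGTHS_BP, GC_PERCENT, CPG_FRACTIONS
        )
    ]


design = enumerate_design()
print(len(design))  # 2 * 3 * 3 * 3 = 54
```

Under this assumption, the per-fragment CpG counts range from 1 (80 bp at 1/80 bp) to 16 (320 bp at 1/20 bp).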

Results We show that cfMeDIP-seq enriches for highly methylated regions, capturing ≥ 97% of methylated spike-in control fragments with ≤ 3% non-specific binding and preference for both high G+C content fragments and fragments with more CpGs. The use of 0.01 ng of spike-in control DNA in each sample provided sufficient sequencing reads to adjust for variance due to fragment length, G+C content, and CpG fraction. Using the known amount of each spiked-in fragment, we created a generalized linear model that absolutely quantifies molar amount from read counts across the genome, while adjusting for fragment length, G+C content, and CpG fraction. Employing our spike-in controls greatly mitigates batch effects, reducing batch-associated variance to ≤ 1% of the total variance within the data.
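
To illustrate the idea of calibrating molar amount against read counts while adjusting for fragment properties, here is a minimal sketch. It assumes a Gaussian family with an identity link, which reduces to ordinary least squares; the abstract does not state the family, link, or any covariate transformations, so this is not the published model. All data and the coefficients in `TRUE_COEF` are synthetic and hypothetical, and the covariates are rescaled (reads in thousands, length in hundreds of bp, CpGs per 100 bp) only to keep the arithmetic well conditioned.

```python
from itertools import product


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_gaussian_glm(rows, y):
    """Fit an identity-link Gaussian GLM (ordinary least squares)
    by solving the normal equations X'X beta = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(coef, row):
    """Predicted amount for one design row."""
    return sum(c * v for c, v in zip(coef, row))


# Hypothetical coefficients: intercept, reads (thousands),
# length (hundreds of bp), G+C fraction, CpGs per 100 bp.
TRUE_COEF = [0.002, 0.010, -0.003, 0.004, 0.001]

rows, amounts = [], []
levels = product((80, 160, 320), (0.35, 0.50, 0.65), (1.25, 2.5, 5.0))
for i, (length, gc, cpg_per_100bp) in enumerate(levels):
    reads_k = 0.5 + ((i * 7919) % 1000) / 1000  # synthetic read counts
    row = [1.0, reads_k, length / 100, gc, cpg_per_100bp]
    rows.append(row)
    amounts.append(predict(TRUE_COEF, row))  # noiseless synthetic "known" amounts

coef = fit_gaussian_glm(rows, amounts)
print([round(c, 6) for c in coef])
```

Because the synthetic amounts are noiseless, the fit should recover `TRUE_COEF` up to floating-point error. With real spike-in data, the known input amount of each fragment would take the place of `amounts`, and the fitted model would then be applied to read counts from genomic windows.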

Discussion Incorporation of spike-in controls enables absolute quantification of methylated cfDNA generated from methylated DNA immunoprecipitation-sequencing (MeDIP-seq) experiments. It mitigates batch effects and corrects for biases in enrichment due to known biophysical properties of DNA fragments and other technical biases. We created an R package, spiky, to convert read counts to picomoles of DNA fragments, while adjusting for fragment properties that affect enrichment. The spiky package is available on Bioconductor (https://bioconductor.org/packages/spiky) and GitHub (https://github.com/trichelab/spiky).
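
One simple way to express batch-associated variance as a share of total variance is the ratio of between-batch to total sum of squares from a one-way decomposition. The sketch below shows that calculation; it is an assumption made for illustration and not necessarily the analysis used in the paper. The batch labels and values are hypothetical.

```python
from statistics import fmean


def batch_variance_fraction(values_by_batch):
    """Return between-batch sum of squares divided by total sum of squares."""
    all_values = [v for values in values_by_batch.values() for v in values]
    grand_mean = fmean(all_values)
    total_ss = sum((v - grand_mean) ** 2 for v in all_values)
    between_ss = sum(
        len(values) * (fmean(values) - grand_mean) ** 2
        for values in values_by_batch.values()
    )
    return between_ss / total_ss if total_ss else 0.0


# Hypothetical measurements for one genomic window in two batches.
shifted = {"lab1": [1.0, 2.0, 3.0], "lab2": [4.0, 5.0, 6.0]}
aligned = {"lab1": [1.0, 2.0, 3.0], "lab2": [1.0, 2.0, 3.0]}

print(batch_variance_fraction(shifted))
print(batch_variance_fraction(aligned))
```

A value near 0 means that batch membership explains almost none of the variation, which is the direction of the ≤ 1% figure reported in the Results.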

Contact michael.hoffman@utoronto.ca

Competing Interest Statement

S.L.W., S.Y.S., T.T., D.D.De C., and M.M.H. are inventors on patent application PCT/CA2020/051507 related to the synthetic spike-in controls, licensed to Adela. S.V.B. and D.D.De C. are co-founders of and serve in leadership roles at Adela. S.Y.S., S.V.B., and D.D.De C. are inventors on other patent applications related to cell-free DNA methylation analysis technologies, licensed to Adela, and own equity in Adela. S.V.B. is an inventor on a patent related to cell-free DNA mutation analysis technologies, licensed to Roche Molecular Diagnostics. S.V.B. and D.D.De C. have received research funding from Nektar Therapeutics.

Footnotes

  • In the first submission, we mislabeled a sample in the Lab 1 batch analysis, which resulted in one sample being duplicated and one being left out. We have fixed this error. We also took the opportunity to update our code and now use umi-tools (v.1.0.0) to deduplicate our UMIs. We updated all downstream analyses to reflect the change in UMI deduplication, which slightly changes the model coefficients in Supplementary Table 2. We also made the standard deviation threshold used to remove regions more stringent. We added detail about the kit used to prepare samples for sequencing on the MiSeq sequencer. We corrected the amount of methylated filler added to the samples. Figures have been updated to reflect these changes. No overall results have changed.

  • https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE166259

  • https://github.com/trichelab/spiky

  • https://github.com/hoffmangroup/2020spikein

  • https://doi.org/10.5281/zenodo.4533340

  • https://ega-archive.org/studies/EGAS00001005069/

  • https://doi.org/10.5281/zenodo.4568265

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
Posted November 19, 2021.
Subject Area

  • Genomics