bioRxiv
New Results

Machine Classification of Methylomes in Cancer

Isabelle Newsham, Marcin Sendera, SriGanesh Jammula, Rebecca Fitzgerald, Charles Massie, Shamith A. Samarajiwa
doi: https://doi.org/10.1101/2020.04.04.025155
Isabelle Newsham
1 MRC Cancer Unit, Cambridge Biomedical Campus, Box 197, University of Cambridge, Cambridge, CB2 0XZ, UK
Marcin Sendera
1 MRC Cancer Unit, Cambridge Biomedical Campus, Box 197, University of Cambridge, Cambridge, CB2 0XZ, UK
SriGanesh Jammula
1 MRC Cancer Unit, Cambridge Biomedical Campus, Box 197, University of Cambridge, Cambridge, CB2 0XZ, UK
3 CRUK Cambridge Institute, University of Cambridge, Robinson Way, CB2 0RE, UK
Rebecca Fitzgerald
1 MRC Cancer Unit, Cambridge Biomedical Campus, Box 197, University of Cambridge, Cambridge, CB2 0XZ, UK
Charles Massie
2 Department of Oncology, Cambridge Biomedical Campus, Box 197, University of Cambridge, CB2 0XZ, UK
Shamith A. Samarajiwa
1 MRC Cancer Unit, Cambridge Biomedical Campus, Box 197, University of Cambridge, Cambridge, CB2 0XZ, UK
  • For correspondence: ss861@mrc-cu.cam.ac.uk

Abstract

Cancer remains a leading cause of morbidity and mortality worldwide. Its evolutionary nature and the resulting complex interactions with the tumour microenvironment and the host immune system engender heterogeneity, making the development of interventions difficult. Usually detected at the advanced stages of disease, metastatic cancer accounts for 90% of cancer-associated deaths. Therefore, early detection of cancer, combined with current therapies, would have a significant impact on survival and treatment of this insidious disease. Epigenetic changes such as DNA methylation are among the early events in carcinogenesis. Here, we report on a machine learning model that can classify 13 types of cancer as well as non-cancer tissue samples using only DNA methylome data, with an accuracy of 98.2%. We use the features identified by this model to develop a robust deep neural network that can generalise to independent data sets. We also demonstrate that the methylation-associated genomic loci detected by the classifier are associated with genes involved in cancer, providing insights into the epigenomic regulation of carcinogenesis.
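
As a rough illustration of what multi-class classification from methylation data involves, the sketch below trains a nearest-centroid classifier on synthetic beta values (methylation fractions between 0 and 1) for 14 classes, matching the 13 cancer types plus non-cancer tissue mentioned above. It is not the model reported in this preprint: the data are randomly generated, the nearest-centroid rule is a deliberately simple stand-in, and all names (`make_samples`, `fit_centroids`, `predict`, `accuracy`) and parameters (50 sites, 20 samples per class, noise level) are assumptions made for illustration only.

```python
import random

def make_samples(n_per_class, n_classes, n_sites, rng):
    """Synthetic beta values in [0, 1]; each class gets its own mean profile."""
    profiles = [[rng.random() for _ in range(n_sites)] for _ in range(n_classes)]
    X, y = [], []
    for label, profile in enumerate(profiles):
        for _ in range(n_per_class):
            # Add Gaussian noise and clip to the valid beta-value range.
            X.append([min(1.0, max(0.0, m + rng.gauss(0, 0.05))) for m in profile])
            y.append(label)
    return X, y

def fit_centroids(X, y):
    """Mean beta value per site for each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, v in enumerate(row):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}

def predict(centroids, row):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])),
    )

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

rng = random.Random(0)  # fixed seed so the synthetic data are reproducible
X, y = make_samples(n_per_class=20, n_classes=14, n_sites=50, rng=rng)

# Hold out every fourth sample for evaluation; train on the rest.
train_idx = [i for i in range(len(X)) if i % 4 != 0]
test_idx = [i for i in range(len(X)) if i % 4 == 0]

centroids = fit_centroids([X[i] for i in train_idx], [y[i] for i in train_idx])
y_pred = [predict(centroids, X[i]) for i in test_idx]
acc = accuracy([y[i] for i in test_idx], y_pred)
print(f"held-out accuracy on synthetic data: {acc:.3f}")
```

Because the synthetic classes are well separated by construction, the accuracy printed here says nothing about performance on real methylomes; the point is only the shape of the workflow: a samples-by-sites matrix of beta values, a class label per sample, a held-out split, and an accuracy computed on the held-out samples.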

  • Abbreviations

    TCGA: The Cancer Genome Atlas
    BLCA: Bladder urothelial carcinoma
    BRCA: Breast invasive carcinoma
    COAD: Colon adenocarcinoma
    ESCA: Esophageal carcinoma
    HNSC: Head and neck squamous cell carcinoma
    KIRC: Kidney renal clear cell carcinoma
    KIRP: Kidney renal papillary cell carcinoma
    LIHC: Liver hepatocellular carcinoma
    LUAD: Lung adenocarcinoma
    LUSC: Lung squamous cell carcinoma
    PRAD: Prostate adenocarcinoma
    THCA: Thyroid carcinoma
    UCEC: Uterine corpus endometrial carcinoma
    AUC: Area Under the Curve
    ROC: Receiver Operating Characteristic
    MCC: Matthews Correlation Coefficient
    UMAP: Uniform Manifold Approximation and Projection
  • Copyright 
    The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
    Posted April 05, 2020.

    Subject Area

    • Bioinformatics