bioRxiv
New Results

Relating enhancer genetic variation across mammals to complex phenotypes using machine learning

Irene M. Kaplow, Alyssa J. Lawler, Daniel E. Schäffer, Chaitanya Srinivasan, Morgan E. Wirthlin, BaDoi N. Phan, Xiaomeng Zhang, Kathleen Foley, Kavya Prasad, Ashley R. Brown, Zoonomia Consortium, Wynn K. Meyer, Andreas R. Pfenning
doi: https://doi.org/10.1101/2022.08.26.505436
Irene M. Kaplow
1 Department of Computational Biology, Carnegie Mellon University
3 Neuroscience Institute, Carnegie Mellon University
  • For correspondence: apfenning@cmu.edu ikaplow@cs.cmu.edu
Alyssa J. Lawler
2 Department of Biology, Carnegie Mellon University
3 Neuroscience Institute, Carnegie Mellon University
4 Broad Institute, Massachusetts Institute of Technology
Daniel E. Schäffer
1 Department of Computational Biology, Carnegie Mellon University
Chaitanya Srinivasan
1 Department of Computational Biology, Carnegie Mellon University
Morgan E. Wirthlin
1 Department of Computational Biology, Carnegie Mellon University
3 Neuroscience Institute, Carnegie Mellon University
BaDoi N. Phan
1 Department of Computational Biology, Carnegie Mellon University
3 Neuroscience Institute, Carnegie Mellon University
5 Medical Scientist Training Program, University of Pittsburgh School of Medicine
Xiaomeng Zhang
1 Department of Computational Biology, Carnegie Mellon University
4 Broad Institute, Massachusetts Institute of Technology
Kathleen Foley
6 Department of Biological Sciences, Lehigh University
7 College of Law, University of Iowa
Kavya Prasad
1 Department of Computational Biology, Carnegie Mellon University
Ashley R. Brown
1 Department of Computational Biology, Carnegie Mellon University
1 Department of Computational Biology, Carnegie Mellon University
Wynn K. Meyer
6 Department of Biological Sciences, Lehigh University
Andreas R. Pfenning
1 Department of Computational Biology, Carnegie Mellon University
2 Department of Biology, Carnegie Mellon University
3 Neuroscience Institute, Carnegie Mellon University
  • For correspondence: apfenning@cmu.edu ikaplow@cs.cmu.edu

Abstract

Protein-coding differences between mammals often fail to explain phenotypic diversity, suggesting involvement of enhancers, often rapidly evolving regions that regulate gene expression. Identifying associations between enhancers and phenotypes is challenging because enhancer activity is context-dependent and may be conserved without much sequence conservation. We developed TACIT (Tissue-Aware Conservation Inference Toolkit) to associate open chromatin regions (OCRs) with phenotypes using predictions in hundreds of mammalian genomes from machine learning models trained to learn tissue-specific regulatory codes. Applying TACIT for motor cortex and parvalbumin-positive interneurons to neurological phenotypes revealed dozens of new OCR-phenotype associations. Many associated OCRs were near relevant genes, including brain size-associated OCRs near genes mutated in microcephaly or macrocephaly. Our work creates a forward genomics foundation for identifying candidate enhancers associated with phenotype evolution.

One Sentence Summary: A new machine learning-based approach associates enhancers with the evolution of brain size and behavior across mammals.

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

  • * These authors contributed equally to this work.

  • https://github.com/pfenninglab/TACIT

  • http://daphne.compbio.cs.cmu.edu/files/ikaplow/TACITSupplement/

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license.
Posted August 26, 2022.
Subject Area

  • Genomics