bioRxiv
New Results

Phenotype to genotype mapping using supervised and unsupervised learning

Vito Paolo Pastore, Ashwini Oke, Sara Capponi, Daniel Elnatan, Jennifer Fung, Simone Bianco
doi: https://doi.org/10.1101/2022.03.17.484826
Vito Paolo Pastore
1MaLGa-DIBRIS, Università degli Studi di Genova, Genoa, Italy
4NSF Center for Cellular Construction, San Francisco, CA, USA
  • For correspondence: Vito.Paolo.Pastore@unige.it
Ashwini Oke
2Department of Obstetrics, Gynecology and Reproductive Sciences, University of California, San Francisco, CA, USA
Sara Capponi
3Almaden Research Center, IBM, San Jose, CA, USA
4NSF Center for Cellular Construction, San Francisco, CA, USA
Daniel Elnatan
2Department of Obstetrics, Gynecology and Reproductive Sciences, University of California, San Francisco, CA, USA
Jennifer Fung
2Department of Obstetrics, Gynecology and Reproductive Sciences, University of California, San Francisco, CA, USA
Simone Bianco
3Almaden Research Center, IBM, San Jose, CA, USA
4NSF Center for Cellular Construction, San Francisco, CA, USA
5Altos Labs, Redwood City, CA, USA

Abstract

The relationship between genotype, the genetic instructions encoded in a genome, and phenotype, the macroscopic realization of those instructions, remains mostly uncharted. In addition, tools able to connect a phenotype to the specific set of genes responsible for it are still being defined. In this work, we focus on yeast organelles called vacuoles, membrane-bound cellular compartments that vary in size and shape in response to various stimuli, and we develop a framework relating changes in cellular morphology to genetic modification. The method combines a convolutional neural network (CNN) with an unsupervised learning pipeline and employs deep-learning-based segmentation, classification, and anomaly detection. From live 3D fluorescence images of vacuoles, we observe that different genetic mutations generate distinct vacuole phenotypes and that the same mutation may correspond to more than one vacuole morphology. We trained a U-Net architecture to segment our cellular images and obtain precise, quantitative information from 2D depth-encoded images. We then used an unsupervised learning approach to cluster the vacuole types and to establish a correlation between genotype and vacuole morphology. Using this procedure, we obtained 4 phenotypic groups. We extracted a set of 131 morphological features from the segmented vacuole images and reduced it to 50 features with a tree-based feature selection. Applying a Fuzzy K-Means-based algorithm to a random subset of 880 images containing all the detected phenotypic groups, we obtained a cluster purity of 85%. Finally, we trained a CNN on the labels assigned during clustering. The CNN was then used to classify a larger dataset (6942 images), reaching an accuracy of 80%.
Our approach can be applied broadly to live fluorescence image analysis and, most importantly, can unveil the basic principles relating genotype to vacuole phenotype in yeast cells. This can be thought of as a first step toward inferring cell design principles for generating organelles with a specific, desired morphology.
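
The abstract reports clustering quality as purity. As a minimal sketch of how such a figure is commonly computed (this is the standard definition of cluster purity, not necessarily the authors' exact procedure), each cluster is credited with the count of its most frequent reference label, and the credited counts are summed and divided by the number of samples. The function name `cluster_purity`, the toy cluster ids, and the toy labels are illustrative assumptions. For a fuzzy algorithm, the sketch assumes each image has already been given a hard assignment, for example the cluster with the highest membership degree.

```python
from collections import Counter, defaultdict


def cluster_purity(cluster_ids, reference_labels):
    """Return the fraction of samples that carry their cluster's majority label.

    Both arguments are equal-length sequences: one hard cluster id and one
    reference (e.g. manually annotated) label per sample.
    """
    if len(cluster_ids) != len(reference_labels):
        raise ValueError("cluster_ids and reference_labels must have equal length")
    if len(cluster_ids) == 0:
        raise ValueError("at least one sample is required")

    # Count how often each reference label occurs inside each cluster.
    label_counts = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, reference_labels):
        label_counts[cluster][label] += 1

    # Credit each cluster with the size of its majority label.
    majority_total = sum(max(counts.values()) for counts in label_counts.values())
    return majority_total / len(cluster_ids)


# Toy data (hypothetical, not taken from the paper): 8 samples in 3 clusters.
toy_clusters = [0, 0, 0, 1, 1, 1, 2, 2]
toy_labels = ["a", "a", "b", "b", "b", "b", "c", "a"]
toy_purity = cluster_purity(toy_clusters, toy_labels)
print(f"purity = {toy_purity:.2f}")
```

Purity rises trivially as the number of clusters grows (one cluster per sample gives a purity of 1), so it is most informative when the number of clusters is fixed in advance, as with the 4 phenotypic groups mentioned above.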

Competing Interest Statement

The authors have declared no competing interest.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.
Posted March 19, 2022.
Subject Area

  • Bioinformatics