bioRxiv
New Results

PHATE: A Dimensionality Reduction Method for Visualizing Trajectory Structures in High-Dimensional Biological Data

Kevin R. Moon, David van Dijk, Zheng Wang, William Chen, Matthew J. Hirn, Ronald R. Coifman, Natalia B. Ivanova, Guy Wolf, Smita Krishnaswamy
doi: https://doi.org/10.1101/120378
Kevin R. Moon
1Departments of Genetics;
2Applied Mathematics Program;
David van Dijk
5Computational Biology Program, Memorial Sloan-Kettering Cancer Center, New York, NY, USA
Zheng Wang
4Yale Stem Cell Center, Department of Genetics, Yale University, New Haven, CT, USA
William Chen
1Departments of Genetics;
Matthew J. Hirn
6Department of Computational Mathematics, Science and Engineering;
7Department of Mathematics, Michigan State University, East Lansing, MI, USA
Ronald R. Coifman
2Applied Mathematics Program;
Natalia B. Ivanova
4Yale Stem Cell Center, Department of Genetics, Yale University, New Haven, CT, USA
  • For correspondence: natalia.ivanova@yale.edu
Guy Wolf
2Applied Mathematics Program;
Smita Krishnaswamy
1Departments of Genetics;
3Department of Computer Science;
  • For correspondence: smita.krishnaswamy@yale.edu

Abstract

In recent years, dimensionality reduction methods have become critical for the visualization, exploration, and interpretation of high-throughput, high-dimensional biological data, as they enable the extraction of major trends in the data while discarding noise. However, biological data contain a predominant type of structure that commonly used methods such as PCA and t-SNE do not preserve: branching progression structure. This structure, which is often non-linear, arises from underlying biological processes such as differentiation, graded responses to stimuli, and population drift, which generate cellular (or population) diversity. We propose a novel, affinity-preserving embedding called PHATE (Potential of Heat-diffusion for Affinity-based Trajectory Embedding), designed explicitly to preserve progression structure in data.

PHATE provides a denoised, two- or three-dimensional visualization of the complete branching trajectory structure in high-dimensional data. It uses heat-diffusion processes, which naturally denoise the data, to compute cell-cell affinities. PHATE then constructs a diffusion-potential geometry from the free-energy potentials of these processes. This geometry captures high-dimensional trajectory structures while enabling a natural embedding of the intrinsic data geometry. The embedding accurately visualizes trajectories and data distances without the strict assumptions typically required by path-finding and tree-fitting algorithms, which have recently been used for pseudotime orderings or tree-renderings of cellular data. Furthermore, PHATE supports a wide range of data exploration tasks by providing interpretable overlays on top of the visualization. We show that such overlays can emphasize and reveal trajectory end-points, branch points and associated split-decisions, progression-forming variables (e.g., specific genes), and paths between developmental events in cellular state-space. We demonstrate PHATE on single-cell RNA sequencing and mass cytometry data pertaining to embryoid body differentiation, iPSC reprogramming, and hematopoiesis in the bone marrow. We also demonstrate PHATE on non-single-cell data, including single-nucleotide polymorphism (SNP) measurements of European populations and 16S sequencing of gut microbiota.
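
The pipeline described above (affinities, diffusion, potential, embedding) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, and the abstract does not specify any of the concrete choices made here. The following are all assumptions chosen for illustration: a Gaussian kernel with a fixed bandwidth `sigma`, a fixed number of diffusion steps `t`, a negative-log transform standing in for the "potential", and classical multidimensional scaling for the final low-dimensional embedding. The function name `diffusion_potential_embedding` is hypothetical.

```python
import numpy as np


def pairwise_sq_dists(A):
    """Squared Euclidean distances between all rows of A."""
    sq = np.sum(A ** 2, axis=1)
    return np.maximum(sq[:, None] + sq[None, :] - 2.0 * A @ A.T, 0.0)


def diffusion_potential_embedding(X, sigma=1.0, t=8, n_components=2, eps=1e-7):
    """Illustrative sketch only; every parameter here is an assumption.

    X: (n_samples, n_features) data matrix.
    Returns an (n_samples, n_components) embedding.
    """
    # 1. Affinities: Gaussian kernel on pairwise distances (assumed kernel).
    K = np.exp(-pairwise_sq_dists(X) / (2.0 * sigma ** 2))

    # 2. Diffusion: row-normalize to a Markov matrix and take t steps.
    P = K / K.sum(axis=1, keepdims=True)
    Pt = np.linalg.matrix_power(P, t)

    # 3. Potential: negative log of the diffused probabilities
    #    (assumed transform; eps avoids log(0)).
    U = -np.log(Pt + eps)

    # 4. Distances between the potential representations of each point.
    D2 = pairwise_sq_dists(U)

    # 5. Embedding: classical MDS via double centering and eigendecomposition.
    n = X.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ D2 @ J
    B = (B + B.T) / 2.0  # enforce symmetry against round-off
    w, V = np.linalg.eigh(B)
    top = np.argsort(w)[::-1][:n_components]
    return V[:, top] * np.sqrt(np.maximum(w[top], 0.0))


# Toy data: three noisy branches leaving a common origin in 10 dimensions.
rng = np.random.default_rng(0)
steps = np.linspace(0.0, 1.0, 30)
branches = []
for axis in range(3):
    branch = np.zeros((steps.size, 10))
    branch[:, axis] = steps
    branches.append(branch)
X_toy = np.vstack(branches) + 0.02 * rng.standard_normal((90, 10))

Y_toy = diffusion_potential_embedding(X_toy, sigma=0.3, t=8)
print(Y_toy.shape)
```

Plotting the two columns of `Y_toy` against each other gives a quick visual check of whether the three toy branches stay separated. For real data, the bandwidth and the number of diffusion steps would need to be chosen with more care than the fixed values assumed here.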

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted March 24, 2017.

Subject Area

  • Bioinformatics