Skip to main content
bioRxiv
  • Home
  • About
  • Submit
  • ALERTS / RSS
Advanced Search
New Results

Compositional Data Analysis using Kernels in Mass Cytometry Data

View ORCID ProfilePratyaydipta Rudra, Ryan Baxter, Elena WY Hsieh, Debashis Ghosh
doi: https://doi.org/10.1101/2021.05.08.443265
Pratyaydipta Rudra
1Department of Statistics, Oklahoma State University, Stillwater, OK 74078
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Pratyaydipta Rudra
Ryan Baxter
2Department of Immunology and Microbiology, University of Colorado Anschutz Medical Campus, Aurora, CO 80045
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Elena WY Hsieh
2Department of Immunology and Microbiology, University of Colorado Anschutz Medical Campus, Aurora, CO 80045
3Department of Pediatrics, Section of Allergy and Immunology, University of Colorado Anschutz Medical Campus, Aurora, CO 80045
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Debashis Ghosh
4Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO 80045
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • Abstract
  • Full Text
  • Info/History
  • Metrics
  • Supplementary material
  • Data/Code
  • Preview PDF
Loading

Abstract

Motivation Cell type abundance data arising from mass cytometry experiments are compositional in nature. Classical association tests do not apply to the compositional data due to their non-Euclidean nature. Existing methods for analysis of cell type abundance data suffer from several limitations for high-dimensional mass cytometry data, especially when the sample size is small.

Results We proposed a new multivariate statistical learning methodology, Compositional Data Analysis using Kernels (CODAK), based on the kernel distance covariance (KDC) framework to test the association of the cell type compositions with important predictors (categorical or continuous) such as disease status. CODAK scales well for high-dimensional data and provides satisfactory performance for small sample sizes (n < 25). We conducted simulation studies to compare the performance of the method with existing methods of analyzing cell type abundance data from mass cytometry studies. The method is also applied to a high-dimensional dataset containing different subgroups of populations including Systemic Lupus Erythematosus (SLE) patients and healthy control subjects.

Availability and Implementation CODAK is implemented using R. The codes and the data used in this manuscript are available on the web at http://github.com/GhoshLab/CODAK/.

Supplementary information Supplementary Materials.pdf.

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

  • https://github.com/Ghoshlab/CODAK

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
Back to top
PreviousNext
Posted May 10, 2021.
Download PDF

Supplementary Material

Data/Code
Email

Thank you for your interest in spreading the word about bioRxiv.

NOTE: Your email address is requested solely to identify you as the sender of this article.

Enter multiple addresses on separate lines or separate them with commas.
Compositional Data Analysis using Kernels in Mass Cytometry Data
(Your Name) has forwarded a page to you from bioRxiv
(Your Name) thought you would like to see this page from the bioRxiv website.
CAPTCHA
This question is for testing whether or not you are a human visitor and to prevent automated spam submissions.
Share
Compositional Data Analysis using Kernels in Mass Cytometry Data
Pratyaydipta Rudra, Ryan Baxter, Elena WY Hsieh, Debashis Ghosh
bioRxiv 2021.05.08.443265; doi: https://doi.org/10.1101/2021.05.08.443265
Digg logo Reddit logo Twitter logo Facebook logo Google logo LinkedIn logo Mendeley logo
Citation Tools
Compositional Data Analysis using Kernels in Mass Cytometry Data
Pratyaydipta Rudra, Ryan Baxter, Elena WY Hsieh, Debashis Ghosh
bioRxiv 2021.05.08.443265; doi: https://doi.org/10.1101/2021.05.08.443265

Citation Manager Formats

  • BibTeX
  • Bookends
  • EasyBib
  • EndNote (tagged)
  • EndNote 8 (xml)
  • Medlars
  • Mendeley
  • Papers
  • RefWorks Tagged
  • Ref Manager
  • RIS
  • Zotero
  • Tweet Widget
  • Facebook Like
  • Google Plus One

Subject Area

  • Bioinformatics
Subject Areas
All Articles
  • Animal Behavior and Cognition (3607)
  • Biochemistry (7581)
  • Bioengineering (5529)
  • Bioinformatics (20809)
  • Biophysics (10338)
  • Cancer Biology (7988)
  • Cell Biology (11647)
  • Clinical Trials (138)
  • Developmental Biology (6611)
  • Ecology (10217)
  • Epidemiology (2065)
  • Evolutionary Biology (13630)
  • Genetics (9550)
  • Genomics (12854)
  • Immunology (7925)
  • Microbiology (19555)
  • Molecular Biology (7668)
  • Neuroscience (42147)
  • Paleontology (308)
  • Pathology (1258)
  • Pharmacology and Toxicology (2203)
  • Physiology (3269)
  • Plant Biology (7051)
  • Scientific Communication and Education (1294)
  • Synthetic Biology (1952)
  • Systems Biology (5429)
  • Zoology (1119)