New Results

A methodology for unsupervised clustering using iterative pruning to capture fine-scale structure

Kridsadakorn Chaichoompu, Fentaw Abegaz Yazew, Sissades Tongsima, Philip James Shaw, Anavaj Sakuntabhai, Bruno Cavadas, Luísa Pereira, Kristel Van Steen
doi: https://doi.org/10.1101/234989
Kridsadakorn Chaichoompu
1GIGA-Medical Genomics, University of Liege, Avenue de l’Hôpital 11, 4000 Liege, Belgium
2Statistical Genetics, Max Planck Institute of Psychiatry, Munich, 80804, Germany
Fentaw Abegaz Yazew
1GIGA-Medical Genomics, University of Liege, Avenue de l’Hôpital 11, 4000 Liege, Belgium
Sissades Tongsima
3Genome Technology Research Unit, National Center for Genetic Engineering and Biotechnology, 113 Thailand Science Park, Phahonyothin Road, Khlong Neung, Khlong Luang, Pathum Thani 12120, Thailand
Philip James Shaw
4Medical Molecular Biology Research Unit, National Center for Genetic Engineering and Biotechnology, 113 Thailand Science Park, Phahonyothin Road, Khlong Neung, Khlong Luang, Pathum Thani 12120, Thailand
Anavaj Sakuntabhai
5Functional Genetics of Infectious Diseases Unit, Institut Pasteur, 25-28, rue du Docteur Roux, 75015 Paris, France
6Centre National de la Recherche Scientifique, URA3012, Paris, France
Bruno Cavadas
7Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Rua Alfredo Allen, 208 | 4200-135 Porto, Portugal
8Instituto de Patologia e Imunologia Molecular da Universidade do Porto, Rua Júlio Amaral de Carvalho, 45 | 4200-135 Porto, Portugal
Luísa Pereira
7Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Rua Alfredo Allen, 208 | 4200-135 Porto, Portugal
8Instituto de Patologia e Imunologia Molecular da Universidade do Porto, Rua Júlio Amaral de Carvalho, 45 | 4200-135 Porto, Portugal
Kristel Van Steen
1GIGA-Medical Genomics, University of Liege, Avenue de l’Hôpital 11, 4000 Liege, Belgium
9WELBIO (Walloon Excellence in Lifesciences and Biotechnology)

Abstract

Several existing clustering methods use SNP-based information to detect shared genetic ancestry or to identify population substructure. Here, we present IPCAPS, a methodology for unsupervised clustering that uses iterative pruning to capture fine-scale structure. Our method supports ordinal data and can therefore be applied directly to SNP data to identify fine-scale population structure. We compare our method to existing tools for detecting fine-scale structure via simulations. The simulated data do not take haplotype information into account, so all markers are independent. Although haplotypes may be more informative than SNPs, especially in fine-scale detection analyses, the haplotype inference process often remains too computationally intensive. Our strategy has therefore been to restrict attention to SNPs and to investigate the scale of structure we are able to detect with them. We show that results on simulated data can be highly accurate and improve on existing tools. We are convinced that our method has the potential to detect fine-scale structure.
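
As a rough illustration of the setting the abstract describes, the sketch below simulates independent SNP markers coded as ordinal genotypes (0, 1, 2) for two populations and performs a single principal-component split. It is not the IPCAPS implementation: the function names, the allele-frequency model, the parameter values, and the sign-based split are all assumptions chosen for brevity, and an iterative method would repeat such a split within each resulting group until a stopping rule is met.

```python
import numpy as np


def simulate_genotypes(n_per_pop, n_snps, delta, rng):
    """Simulate independent SNPs (no haplotype structure) for two populations.

    Hypothetical model: population 1 draws allele frequencies uniformly in
    [0.2, 0.8]; population 2 shifts each frequency by +/- delta.
    """
    base = rng.uniform(0.2, 0.8, size=n_snps)
    shift = rng.choice([-delta, delta], size=n_snps)
    freqs = np.stack([base, np.clip(base + shift, 0.01, 0.99)])
    # Each genotype counts copies of one allele: 0, 1 or 2 (ordinal data).
    geno = np.vstack(
        [rng.binomial(2, f, size=(n_per_pop, n_snps)) for f in freqs]
    )
    labels = np.repeat([0, 1], n_per_pop)
    return geno, labels


def split_on_first_pc(geno):
    """Split individuals into two groups by the sign of their first PC score."""
    centered = geno - geno.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    pc1 = u[:, 0] * s[0]
    return (pc1 > 0).astype(int)


rng = np.random.default_rng(0)
geno, labels = simulate_genotypes(n_per_pop=100, n_snps=2000, delta=0.15, rng=rng)
assigned = split_on_first_pc(geno)

# Cluster labels are arbitrary, so score agreement up to label swapping.
agreement = max(np.mean(assigned == labels), np.mean(assigned != labels))
print(f"agreement with simulated populations: {agreement:.2f}")
```

Shrinking `delta` or `n_snps` in this toy setup makes the two populations harder to separate, which is one simple way to probe how fine a structure SNP data alone can reveal.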

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
Posted December 15, 2017.