New Results

STRipy: a graphical application for enhanced genotyping of pathogenic short tandem repeats in sequencing data

Andreas Halman, Egor Dolzhenko, Alicia Oshlack
doi: https://doi.org/10.1101/2021.06.13.448220
Andreas Halman
1Peter MacCallum Cancer Centre, Melbourne, Victoria 3000, Australia
2Sir Peter MacCallum Department of Oncology, The University of Melbourne, Victoria 3010, Australia
3Murdoch Children’s Research Institute, Royal Children’s Hospital, Parkville, Victoria 3052, Australia
4Florey Department of Neuroscience and Mental Health, The University of Melbourne, Parkville, Victoria 3010, Australia
5School of Natural Sciences and Health, Tallinn University, 10120 Tallinn, Estonia
Egor Dolzhenko
6Illumina Inc., 5200 Illumina Way, San Diego, CA, USA
Alicia Oshlack
1Peter MacCallum Cancer Centre, Melbourne, Victoria 3000, Australia
2Sir Peter MacCallum Department of Oncology, The University of Melbourne, Victoria 3010, Australia
7School of BioSciences, University of Melbourne, Parkville, Victoria 3052, Australia
  • For correspondence: alicia.oshlack@petermac.org

Abstract

Short tandem repeats (STRs) are highly polymorphic and have high mutation rates, and STR expansions have been implicated as the causal variant in disease. The application of genome sequencing in patients has recently enabled many new discoveries, with over 50 disease-causing loci known to date. Several tools can genotype STRs from high-throughput sequencing (HTS) data. However, running these tools out of the box allows only around half of the known disease-causing loci to be genotyped, and the allele lengths they report are often limited to either the read or the fragment length, which is less than the pathogenic cut-off for some diseases. While analysis tools can be customised to genotype extra loci, this requires proficiency in bioinformatics to set up, use, and analyse the resulting data, which limits their widespread use by other researchers and clinicians.

To address these issues, we have created new software called STRipy, which has an intuitive graphical interface and requires no specific skills to use, significantly simplifying the detection of STR expansions from human HTS data. STRipy can target all known disease-causing STRs, with genotyping performed by an established tool, ExpansionHunter, that is incorporated into the software. We have also added functionality to STRipy for handling long alleles that exceed the fragment length.

STRipy was validated using over 60 thousand simulated samples and was shown to work on whole-genome sequencing of biological samples with pathogenic variants. Finally, we have used STRipy to genotype pathogenic loci in thousands of samples from various populations; these genotypes are provided to the user, along with data from the literature, to assist with interpreting results. We believe the simplicity and breadth of STRipy will increase testing for STR diseases in current datasets, resulting in further diagnoses of rare diseases caused by STR expansions.
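The read- and fragment-length limitation mentioned in the abstract comes down to simple arithmetic, which the following minimal sketch illustrates. It is not part of STRipy or ExpansionHunter. The function names, the 150 bp read length, the 450 bp fragment length and the 200-repeat cut-off are all hypothetical values chosen for illustration.

```python
def max_repeats_spanned(span_bp: int, motif_bp: int, flank_bp: int = 0) -> int:
    """Largest number of whole repeat units that fit inside a span of
    span_bp bases, reserving flank_bp bases on each side for anchoring
    the read in unique flanking sequence."""
    usable_bp = span_bp - 2 * flank_bp
    return max(usable_bp // motif_bp, 0)


def classify_allele(repeats: int, motif_bp: int, read_bp: int, fragment_bp: int) -> str:
    """Place an allele of `repeats` units relative to the read and
    fragment lengths (all lengths in base pairs)."""
    allele_bp = repeats * motif_bp
    if allele_bp <= read_bp:
        return "within read length"
    if allele_bp <= fragment_bp:
        return "within fragment length"
    return "exceeds fragment length"


# Illustrative (assumed) sequencing parameters, not taken from the preprint.
READ_BP = 150
FRAGMENT_BP = 450
MOTIF_BP = 3          # a trinucleotide repeat
CUTOFF_REPEATS = 200  # hypothetical pathogenic cut-off

if __name__ == "__main__":
    print(max_repeats_spanned(READ_BP, MOTIF_BP))
    print(classify_allele(CUTOFF_REPEATS, MOTIF_BP, READ_BP, FRAGMENT_BP))
```

Under these assumed values, an allele at the cut-off is 600 bp long, which is longer than both the read and the fragment. A method whose estimates are capped at fragment length could therefore not report a length reaching that cut-off.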

Competing Interest Statement

Egor Dolzhenko is an employee of Illumina, Inc., a public company that develops and markets systems for genetic analysis.

Footnotes

  • https://stripy.org/database

  • https://gitlab.com/andreassh/stripy-client

  • https://gitlab.com/andreassh/stripy-server

  • https://doi.org/10.5281/zenodo.4939980

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license.
Posted June 14, 2021.
Subject Area

  • Bioinformatics