New Results

LIQA: Long-read Isoform Quantification and Analysis

Yu Hu (1), Li Fang (1), Xuelian Chen (2), Jiang F. Zhong (2), Mingyao Li (3), Kai Wang (1, 4)
doi: https://doi.org/10.1101/2020.09.09.289793

1 Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
2 Department of Otolaryngology, Keck School of Medicine, University of Southern California, CA 90033, USA
3 Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
4 Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA

For correspondence: wangk@email.chop.edu

Abstract

Long-read RNA sequencing (RNA-seq) technologies can sequence full-length transcripts, which makes isoform-specific gene expression (isoform relative abundance and isoform-level TPM) easier to study than with conventional short-read RNA-seq. However, long-read RNA-seq suffers from a high per-base error rate, chimeric reads, alternative alignments, and other biases, and therefore requires different analysis methods than short-read RNA-seq. Here we present LIQA (Long-read Isoform Quantification and Analysis), an Expectation-Maximization (EM) based statistical method that quantifies isoform expression and detects differential alternative splicing (DAS) events from long-read RNA-seq data. Rather than summarizing isoform-specific read counts directly, as short-read methods do, LIQA uses base-pair quality scores and isoform-specific read length information to assign each read a weight that reflects alignment confidence. LIQA can also detect DAS events between conditions using isoform usage estimates. On simulated data, LIQA outperformed other approaches in characterizing isoforms with low read coverage and in detecting DAS events between two groups. We also generated one direct mRNA sequencing dataset and one cDNA sequencing dataset on the Oxford Nanopore long-read platform, both with paired short-read RNA-seq data and qPCR data on selected genes, and showed that LIQA performs well in isoform discovery and quantification. Finally, we evaluated LIQA on a PacBio dataset of esophageal squamous epithelial cells and showed that it recovered DAS events that were not detected in short-read data. In summary, LIQA leverages the power of long-read RNA-seq and estimates isoform abundance more accurately than existing approaches, especially for isoforms with low coverage and biased read distribution.
LIQA is freely available at https://github.com/WGLab/LIQA.
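
To make the idea of weighting reads inside an EM procedure concrete, the following is a minimal, generic sketch and not LIQA's actual model or code. It assumes a hypothetical 0/1 read-to-isoform compatibility matrix (`compat`) and a hypothetical per-read confidence weight derived only from Phred base qualities (`phred_to_weight`); the isoform-specific read length component described in the abstract is not modeled here. All function and variable names are illustrative.

```python
def phred_to_weight(quals):
    """Hypothetical read weight: mean per-base probability of a correct call.

    Uses the standard Phred definition, error probability = 10 ** (-Q / 10).
    Turning this into a read weight is an assumption made for illustration.
    """
    return sum(1.0 - 10 ** (-q / 10.0) for q in quals) / len(quals)


def weighted_em(compat, weights, n_iter=200, tol=1e-8):
    """Estimate isoform relative abundances with a read-weighted EM.

    compat[r][k] is 1 if read r is compatible with isoform k, else 0.
    weights[r] is the confidence weight of read r.
    Returns a list of relative abundances that sums to 1.
    """
    n_isoforms = len(compat[0])
    theta = [1.0 / n_isoforms] * n_isoforms  # uniform starting point
    for _ in range(n_iter):
        expected = [0.0] * n_isoforms
        # E-step: split each read across compatible isoforms in proportion
        # to the current abundances, scaled by the read's weight.
        for row, w in zip(compat, weights):
            denom = sum(t * c for t, c in zip(theta, row))
            if denom == 0:
                continue  # read is compatible with no isoform; skip it
            for k, c in enumerate(row):
                expected[k] += w * theta[k] * c / denom
        total = sum(expected)
        if total == 0:
            return theta  # no informative reads; keep the current estimate
        # M-step: renormalize the weighted expected counts.
        new_theta = [e / total for e in expected]
        converged = max(abs(a - b) for a, b in zip(theta, new_theta)) < tol
        theta = new_theta
        if converged:
            break
    return theta


# Toy input: two isoforms, one unique read each, second read down-weighted.
theta = weighted_em([[1, 0], [0, 1]], [1.0, 0.5])
```

In this toy input the second read carries half the weight of the first, so the estimate shifts toward the first isoform rather than splitting evenly, which is the effect that confidence weighting is meant to have.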

Competing Interest Statement

The authors have declared no competing interest.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
Posted January 19, 2021.
Subject Area

  • Bioinformatics