New Results

Robust inference of expression state in bulk and single-cell RNA-Seq using curated intergenic regions

Sara S. Fonseca Costa, Marta Rosikiewicz, Julien Roux, Julien Wollbrett, Frederic B. Bastian, Marc Robinson-Rechavi
doi: https://doi.org/10.1101/2022.03.31.486555
Sara S. Fonseca Costa
1 Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
2 SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
Marta Rosikiewicz
3 SOPHiA GENETICS, Switzerland
Julien Roux
2 SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
4 Bioinformatics Core Facility, Department of Biomedicine, University of Basel, Switzerland
Julien Wollbrett
1 Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
2 SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
Frederic B. Bastian
1 Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
2 SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
Marc Robinson-Rechavi
1 Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
2 SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
For correspondence: marc.robinson-rechavi@unil.ch

Abstract

RNA-Seq is a powerful technique for obtaining quantitative information on gene expression. While many applications focus on estimated expression levels, it is also important to determine which genes are actively transcribed and which are not. The problem can be viewed as setting a biologically meaningful threshold for calling a gene expressed. We propose to define this threshold per sample, relative to the background level of non-expressed genomic features, inferred from the number of reads mapped to intergenic regions of the genome. To this end, we first define a stringent set of reference intergenic regions, based on available bulk RNA-Seq libraries for each species. Through the Bgee database, we provide predefined regions selected for different animal species with varying genome annotation quality. We then call genes expressed if their expression level is significantly higher than the background noise. This approach can be applied to bulk as well as single-cell RNA-Seq, to a single library as well as to a combination of libraries from one condition. We show that the estimated proportion of expressed genes is biologically meaningful and stable between libraries originating from the same tissue, in both model and non-model organisms.
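
As an illustration of the idea of calling a gene expressed when its level is significantly above intergenic background, the following is a minimal sketch and not the authors' implementation. It assumes that expression values are already available on a log2 scale for both genes and reference intergenic regions, that the intergenic values of one library can be summarised by a normal distribution, and that a one-sided p-value cutoff of 0.05 is used. The function name `call_expressed`, the cutoff, and the toy values are all hypothetical.

```python
import math
import statistics


def call_expressed(gene_log2, intergenic_log2, p_cutoff=0.05):
    """Call genes expressed relative to intergenic background (illustrative only).

    gene_log2: dict mapping gene id to a log2 expression value.
    intergenic_log2: list of log2 values for reference intergenic regions
        from the same library.
    Returns a dict mapping gene id to (p_value, is_expressed).
    """
    # Summarise the background with a mean and a sample standard deviation.
    # Assumption: a normal approximation of the intergenic distribution.
    mu = statistics.mean(intergenic_log2)
    sd = statistics.stdev(intergenic_log2)

    calls = {}
    for gene, value in gene_log2.items():
        # How many background standard deviations above the mean is this gene?
        z = (value - mu) / sd
        # One-sided upper-tail p-value of the standard normal distribution.
        p = 0.5 * math.erfc(z / math.sqrt(2))
        calls[gene] = (p, p <= p_cutoff)
    return calls


# Toy data, invented for the example.
intergenic = [-1.0, -0.5, 0.0, 0.5, 1.0]
genes = {"geneA": 5.0, "geneB": 0.2}
calls = call_expressed(genes, intergenic)
```

Because the background is estimated from each library's own intergenic reads, the threshold adapts per sample rather than relying on one fixed expression cutoff for all libraries.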

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

  • https://github.com/BgeeDB/Methods_RNASeq_expression_calls

  • https://github.com/BgeeDB/BgeeCall/tree/calls_paper

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted April 01, 2022.
Subject Area

  • Genomics