bioRxiv
New Results

The role and robustness of the Gini coefficient as an unbiased tool for the selection of Gini genes for normalising expression profiling data

Marina Wright Muelas, Farah Mughal, Steve O’Hagan, Philip J. Day, Douglas B. Kell
doi: https://doi.org/10.1101/718007
Marina Wright Muelas
1Department of Biochemistry, Institute of Integrative Biology, Faculty of Health and Life Sciences, University of Liverpool, Crown Street, Liverpool, L69 7ZB, UK
  • For correspondence: m.wright-muelas@liverpool.ac.uk dbk@liv.ac.uk
Farah Mughal
1Department of Biochemistry, Institute of Integrative Biology, Faculty of Health and Life Sciences, University of Liverpool, Crown Street, Liverpool, L69 7ZB, UK
Steve O’Hagan
2School of Chemistry, 131, Princess St, Manchester M1 7DN, UK
3The Manchester Institute of Biotechnology, 131, Princess St, Manchester M1 7DN, UK
Philip J. Day
3The Manchester Institute of Biotechnology, 131, Princess St, Manchester M1 7DN, UK
4Faculty of Biology, Medicine and Health, The University of Manchester M13 9PL, UK
Douglas B. Kell
1Department of Biochemistry, Institute of Integrative Biology, Faculty of Health and Life Sciences, University of Liverpool, Crown Street, Liverpool, L69 7ZB, UK
5Novo Nordisk Foundation Centre for Biosustainability, Technical University of Denmark, 10 Building 220, Kemitorvet, 2800 Kgs. Lyngby, Denmark
  • For correspondence: m.wright-muelas@liverpool.ac.uk dbk@liv.ac.uk

Abstract

We recently introduced the Gini coefficient (GC) for assessing the expression variation of a particular gene in a dataset, as a means of selecting reference genes that improve on the cohort of ‘housekeeping genes’ typically used for normalisation in expression profiling studies. The genes (transcripts) that we determined to be usable as reference genes differed greatly from previous suggestions based on hypothesis-driven approaches. A limitation of that initial study is that a single (albeit large) dataset was employed for both tissues and cell lines.

We here extend this analysis to encompass seven other large datasets. Although their absolute values differ a little, the Gini values and median expression levels of the various genes are well correlated between the various cell line datasets, implying that our original choice of the more ubiquitously expressed low-Gini-coefficient genes was indeed sound. In tissues, the Gini values and median expression levels of genes showed greater variation, with the GC of genes changing with the number and types of tissues in the datasets. In all datasets, regardless of whether they were derived from tissues or cell lines, we also show that the GC is a robust measure of gene expression stability. Using the GC as a measure of expression stability, we illustrate its utility for finding tissue- and cell line-optimised housekeeping genes without any prior bias; these again include only a small number of previously reported housekeeping genes. We also independently confirmed this experimentally using RT-qPCR with 40 candidate GC genes in a panel of 10 cell lines. These were termed the Gini Genes.

In many cases, the variation in the expression levels of classical reference genes is very large (e.g. 44-fold for GAPDH in one dataset), suggesting that the cure (of using them as normalising genes) may in some cases be worse than the disease (of not doing so). We recommend the present data-driven approach for the selection of reference genes by using the easy-to-calculate and robust GC.
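
To illustrate how easily the GC can be calculated, the following is a minimal Python sketch, not the authors' own code. It uses the standard rank-based formula for the Gini coefficient of non-negative values and ranks genes from most to least stable. The function names (`gini`, `rank_genes_by_gini`), the gene labels and the toy expression values are all hypothetical, and the sketch assumes expression is supplied on a linear (not log-transformed) scale.

```python
import numpy as np


def gini(values):
    """Gini coefficient of a 1-D collection of non-negative values.

    Uses the rank-based formula on the sorted values x_1 <= ... <= x_n:
        G = 2 * sum(i * x_i) / (n * sum(x_i)) - (n + 1) / n
    G is 0 when all values are equal and approaches 1 when a single
    sample carries nearly all of the expression.
    """
    x = np.sort(np.asarray(values, dtype=float))
    if x.size == 0:
        raise ValueError("need at least one value")
    if np.any(x < 0):
        raise ValueError("Gini coefficient requires non-negative values")
    total = x.sum()
    if total == 0:
        # Assumption: a gene with zero expression everywhere is treated as G = 0.
        return 0.0
    n = x.size
    ranks = np.arange(1, n + 1)
    return float(2.0 * np.sum(ranks * x) / (n * total) - (n + 1) / n)


def rank_genes_by_gini(expression):
    """Return (gene, GC) pairs sorted from lowest to highest GC.

    `expression` maps a gene name to its expression values across
    samples (tissues or cell lines).
    """
    scores = {gene: gini(vals) for gene, vals in expression.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical toy data: one stable gene and one highly variable gene.
toy_expression = {
    "GENE_STABLE": [10.0, 11.0, 9.0, 10.0],
    "GENE_VARIABLE": [1.0, 2.0, 50.0, 100.0],
}
ranking = rank_genes_by_gini(toy_expression)
```

In this sketch, genes near the top of `ranking` (lowest GC) would be the candidate reference genes. In practice one would presumably also filter on median expression level, since the abstract pairs low GC with ubiquitous expression.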

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted July 31, 2019.
Subject Area

  • Molecular Biology