The Significance of Digital Gene Expression Profiles

Stéphane Audic and Jean-Michel Claverie¹

Laboratory of Structural and Genetic Information, Centre National de la Recherche Scientifique–E.P.91, Marseille 13402, France

Abstract

Genes differentially expressed in different tissues, during development, or during specific pathologies are of foremost interest to both basic and pharmaceutical research. “Transcript profiles” or “digital Northerns” are generated routinely by partially sequencing thousands of randomly selected clones from relevant cDNA libraries. Differentially expressed genes can then be detected from variations in the counts of their cognate sequence tags. Here we present the first systematic study on the influence of random fluctuations and sampling size on the reliability of this kind of data. We establish a rigorous significance test and demonstrate its use on publicly available transcript profiles. The theory links the threshold of selection of putatively regulated genes (e.g., the number of pharmaceutical leads) to the fraction of false positive clones one is willing to risk. Our results delineate more precisely and extend the limits within which digital Northern data can be used.

Footnotes

  • 1 Corresponding author.

  • E-MAIL jmc{at}igs.cnrs-mrs.fr; FAX 334 91 16 45 49.

    • Received March 12, 1997.
    • Accepted August 22, 1997.