Re-Annotator: Annotation Pipeline for Microarray Probe Sequences

PLoS One. 2015 Oct 1;10(10):e0139516. doi: 10.1371/journal.pone.0139516. eCollection 2015.

Abstract

Microarray technologies are established approaches for high-throughput gene expression, methylation and genotyping analysis. An accurate mapping of the array probes is essential to generate reliable biological findings. However, manufacturers of the microarray platforms typically provide incomplete and outdated annotation tables, which often rely on older genome and transcriptome versions that differ substantially from up-to-date sequence databases. Here, we present the Re-Annotator, a re-annotation pipeline for microarray probe sequences. It is primarily designed for gene expression microarrays but can also be adapted to other types of microarrays. The Re-Annotator uses a custom-built mRNA reference database to identify the positions of gene expression array probe sequences. We applied Re-Annotator to the Illumina HumanHT-12 v4 microarray platform and found that the annotation of about one quarter (25%) of the probes differed from the manufacturer's annotation. In further computational experiments on experimental gene expression data, we compared Re-Annotator to another probe re-annotation tool, ReMOAT, and found that Re-Annotator provided an improved re-annotation of microarray probes. A thorough re-annotation of probe information is crucial to any microarray analysis. The Re-Annotator pipeline is freely available at http://sourceforge.net/projects/reannotator along with re-annotated files for Illumina microarrays HumanHT-12 v3/v4 and MouseRef-8 v2.
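As an illustration of the comparison reported above (the share of probes whose annotation differs from the manufacturer's), the following is a minimal sketch, not the Re-Annotator implementation. It assumes both annotations are available as simple probe-to-gene mappings; the function name `fraction_differing`, the probe identifiers and the gene symbols are hypothetical placeholders, and a probe that no longer maps to any gene is counted as differing.

```python
def fraction_differing(manufacturer, reannotated):
    """Share of manufacturer-annotated probes whose gene assignment
    changed or was lost in the re-annotation (hypothetical helper)."""
    if not manufacturer:
        raise ValueError("manufacturer annotation is empty")
    changed = sum(
        1
        for probe_id, gene in manufacturer.items()
        # A probe missing from the re-annotation yields None and counts as differing.
        if reannotated.get(probe_id) != gene
    )
    return changed / len(manufacturer)


# Hypothetical toy tables; these are not real probe IDs or gene symbols.
manufacturer = {
    "probe_1": "GENE_A",
    "probe_2": "GENE_B",
    "probe_3": "GENE_C",
    "probe_4": "GENE_D",
}
reannotated = {
    "probe_1": "GENE_A",
    "probe_2": "GENE_B",
    "probe_3": "GENE_X",  # reassigned to a different gene
    # probe_4 is absent: it no longer maps to any transcript
}

differing = fraction_differing(manufacturer, reannotated)
print(f"{differing:.0%} of probes differ from the manufacturer's annotation")
```

In practice the two mappings would be read from the manufacturer's annotation table and from the re-annotated files; the file formats are not described in this abstract, so loading is left out of the sketch.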

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • Databases, Nucleic Acid
  • Gene Expression Profiling*
  • Genotype
  • Humans
  • Molecular Sequence Annotation*
  • Oligonucleotide Array Sequence Analysis / methods*
  • Sequence Analysis, DNA / methods*
  • Software*

Grants and funding

This study was supported by the European Union under European Research Council GA no. 281338 (JA). DB was supported by a DFG Fellowship through the Graduate School of Quantitative Biosciences Munich (QBM).