Review
TB database 2010: Overview and update
Overview
TBDB (tbdb.org) is an online database that provides integrated access, through a single portal, to sequence data and annotation, expression data, literature curation, and analysis tools for tuberculosis. Data integrated in TBDB include:
- Genome sequences for publicly available strains of Mycobacterium tuberculosis (MTB).
- Genome sequences for over 20 strains related to MTB, including M. africanum, M. bovis, M. avium, M. leprae, and M. smegmatis.
- Global sequence polymorphism data for M. tuberculosis.
Organisms in TBDB
TBDB houses genome sequence data for a range of species relevant to tuberculosis. Primary among these data is the sequence for M. tuberculosis strain H37Rv, the standard lab strain long used for experimental and animal infection studies. Other public M. tuberculosis strain assemblies are also available, including those for strains CDC1551, F11, C, Haarlem, and H37Ra. In addition, as described below, new to TBDB are short-read re-sequencing data for 30 M. tuberculosis strains
TB genetic diversity
Data from the M. tuberculosis Phylogeographic Diversity Sequencing Project are a recent addition to TBDB. Led by Sebastien Gagneux and Peter Small in collaboration with the NIAID-funded Broad Genomic Sequencing Center for Infectious Disease (http://www.broadinstitute.org/science/projects/gscid/genomic-sequencing-center-infectious-diseases), this project builds on existing models of TB global population structure [10] by re-sequencing 31 TB strains that were carefully selected as representatives of
TB metabolic network reconstruction
The metabolic pathways and reactions of each organism in TBDB are now represented as a BioCyc Pathway/Genome database (http://biocyc.org/) within TBDB (and directly at http://tbcyc.tbdb.org) (Figure 7). These databases were originally created as a collaboration between SRI International and Stanford University, and updating of the dataset was taken over in 2006 by BioHealthBase BRC (now FluDB). In 2009, TBDB adopted the TB pathways collection and reconstructed the metabolic network for
Expression data
TBDB provides researchers with a suite of tools to explore, visualize, and analyze publicly available gene expression data generated from both the bacterial pathogen itself and its human and mouse hosts. Most gene expression data in TBDB were generated using microarrays, but TBDB also houses data generated using quantitative RT-PCR (and the sequences of the probes and primers of the validated TaqMan sets used to obtain the RT-PCR data), and is actively working to incorporate tools to explore RNA-seq
Publications
TBDB has associated several thousand publications with M. tuberculosis genes, and includes dozens of publications associated with gene expression data. Most of the gene expression data in TBDB are associated with a publication, and are thus well annotated with the experimental details. Data associated with a gene expression publication may be downloaded and are available for viewing and manipulation using the aforementioned tools. In addition to standard searches such as keyword, author name,
Gene regulation
The integration of genome sequence data and expression data provides the opportunity to view both in the context of the TB gene regulatory network. TBDB provides a growing set of tools for analyzing gene regulation (Figure 11), all of which are accessible through the Tool Bar on each gene details page.
The most fundamental unit of gene regulation in bacteria is the operon. Conservation of gene order and orientation between adjacent genes has proven to be a strong indicator of operon structure [18].
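The idea that conserved order and orientation of adjacent genes hints at operon structure can be illustrated with a minimal sketch. This is not TBDB's actual method: the function names, the data layout (each genome as an ordered list of ortholog identifier and strand), and the simple "fraction of genomes" score are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical representation: a genome is an ordered list of
# (ortholog_id, strand) tuples, where strand is "+" or "-".
Gene = Tuple[str, str]
Pair = Tuple[str, str]


def same_strand_pairs(genome: List[Gene]) -> Set[Pair]:
    """Collect adjacent same-strand gene pairs, written in transcription order."""
    pairs: Set[Pair] = set()
    for (a, strand_a), (b, strand_b) in zip(genome, genome[1:]):
        if strand_a != strand_b:
            continue  # genes on opposite strands cannot share a transcript
        # On the reverse strand the downstream gene appears first in genome
        # order, so flip the pair to keep (upstream, downstream) orientation.
        pairs.add((a, b) if strand_a == "+" else (b, a))
    return pairs


def conservation_scores(reference: List[Gene],
                        others: List[List[Gene]]) -> Dict[Pair, float]:
    """Score each same-strand adjacent pair in the reference genome by the
    fraction of comparison genomes that keep the pair adjacent and co-oriented."""
    other_pairs = [same_strand_pairs(g) for g in others]
    scores: Dict[Pair, float] = {}
    for pair in same_strand_pairs(reference):
        if not other_pairs:
            scores[pair] = 0.0  # no comparison genomes: no evidence either way
            continue
        conserved = sum(pair in pairs for pairs in other_pairs)
        scores[pair] = conserved / len(other_pairs)
    return scores


# Toy example with made-up gene identifiers.
reference = [("g1", "+"), ("g2", "+"), ("g3", "-"), ("g4", "-")]
others = [
    [("g1", "+"), ("g2", "+"), ("g4", "-"), ("g3", "-")],
    [("g2", "-"), ("g1", "-"), ("g3", "-"), ("g4", "-")],
]
scores = conservation_scores(reference, others)
```

Pairs with a high score are candidates for belonging to the same operon. A real predictor would typically also weigh intergenic distance and expression correlation, which this sketch leaves out.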
Future plans
In the next two years, TBDB plans to consolidate and strengthen its current suite of databases and tools and to expand into four additional areas of vital interest to the TB research community:
- Enhanced user interface and training. Data from Google Analytics show that TBDB is accessed by more than 1,400 unique users each week. To further increase the utility of the site for the TB research community, we have solicited and received written critiques of the site from several independent reviewers.
Acknowledgements
Support for TBDB was provided by the Bill & Melinda Gates Foundation. The TB metabolic maps were originally created as a collaboration between SRI International and Stanford University and were funded by DARPA under contract N66001-01-C-8011 and by the NIH NIAID under grant AI44826. Additional enhancements were provided in 2006 by BioHealthBase BRC under contract from the NIH NIAID. We are grateful to the research community for their valuable input and suggestions in building and maintaining this
References (22)
- et al. Basic local alignment search tool. J Mol Biol (1990)
- et al. TB database: an integrated platform for tuberculosis research. Nucleic Acids Res (2009)
- et al. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res (1997)
- et al. Pfam: clans, web tools and services. Nucleic Acids Res (2006)
- et al. Rfam: updates to the RNA families database. Nucleic Acids Res (2009)
- et al. Transposon site hybridization in Mycobacterium tuberculosis. Methods Mol Biol (2008)
- et al. Genes required for mycobacterial growth defined by high density mutagenesis. Mol Microbiol (2003)
- et al. Comprehensive identification of conditionally essential genes in mycobacteria. Proc Natl Acad Sci U S A (2001)
- et al. Jalview Version 2–a multiple sequence alignment editor and analysis workbench. Bioinformatics (2009)
- et al. High functional diversity in Mycobacterium tuberculosis driven by genetic drift and human demography. PLoS Biol (2008)
- a web-based genome-scale network model of Mycobacterium tuberculosis metabolism. Genome Biol