LSTrAP-Crowd: Prediction of novel components of bacterial ribosomes with crowd-sourced analysis of RNA sequencing data
Abstract
Bacterial resistance to antibiotics is a growing problem that is projected to cause more deaths than cancer by 2050. Consequently, novel antibiotics are urgently needed. Since more than half of the available antibiotics target the bacterial ribosome, proteins involved in protein synthesis are prime targets for the development of novel antibiotics. However, experimental identification of these potential target proteins can be labor-intensive and challenging, as they are likely to be poorly characterized and specific to only a few bacteria. To identify these novel proteins, we established a Large-Scale Transcriptomic Analysis Pipeline in Crowd (LSTrAP-Crowd), in which 285 individuals processed 26 terabytes of RNA-sequencing data from the 17 most notorious bacterial pathogens. In total, the crowd processed 26,269 RNA-seq experiments and used the data to construct gene co-expression networks, which were then used to identify more than a hundred uncharacterized genes that were transcriptionally associated with protein synthesis. We provide the identity of these genes together with the processed gene expression data. The data can be used to identify other vulnerabilities of bacteria, while our approach demonstrates how the processing of gene expression data can be easily crowdsourced.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
The first author's position had to be updated.