Building Genomic Analysis Pipelines in a Hackathon Setting with Bioinformatician Teams: DNA-seq, Epigenomics, Metagenomics and RNA-seq

Abstract
We assembled teams of genomics professionals to assess whether we could rapidly develop pipelines that facilitate analysis of high-throughput sequencing data and thereby answer questions commonly asked by biologists and others new to bioinformatics. In January 2015, teams were assembled on the National Institutes of Health (NIH) campus to address questions in the DNA-seq, epigenomics, metagenomics and RNA-seq subfields of genomics. The hackathon had only two rules: the data used either had to be housed at the National Center for Biotechnology Information (NCBI) or had to be submitted there by a participant within the following six months, and all software going into the pipelines had to be open-source or open-use. Questions proposed by the organizers, along with suggested tools and approaches, were distributed to participants a few days before the event and were refined during it. Pipelines were published on GitHub, a web service that provides publicly available, free-usage tiers for collaborative software development (https://github.com/features/). The code was published at https://github.com/DCGenomics/, with a separate repository for each team, starting with hackathon_v001.