RT Journal Article
SR Electronic
T1 Euplotid
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 170159
DO 10.1101/170159
A1 Diego Borges
YR 2017
UL http://biorxiv.org/content/early/2017/08/03/170159.abstract
AB http://dborgesr.github.io/Euplotid/ Euplotid is composed of a set of constantly evolving bioinformatic pipelines, encapsulated and run in Docker containers, that enable a user to build and annotate the local regulatory structure of every gene starting from raw sequencing reads of DNA interactions, chromatin accessibility, and RNA-sequencing. Reads are quantified using the latest computational tools and the results are normalized, quality-checked, and stored. The local regulatory neighborhood of each gene is built using a Louvain-based graph partitioning algorithm parameterized by the chromatin extrusion model and CTCF-CTCF interactions. Cis-Regulatory Elements (CREs) are defined using chromatin accessibility peaks, which are then mapped to Transcription Start Sites based on inclusion within the same neighborhood. Convolutional Neural Networks are combined with Long Short-Term Memory networks to provide a statistical model mimicking transcription factor binding: one neural network for each protein in the genome is trained on all available ChIP-seq and SELEX data, learning which patterns of DNA oligonucleotides the factor binds. The neural networks are then merged and trained on chromatin accessibility data, building a rationally designed neural network architecture capable of predicting chromatin accessibility. Transcription factor binding and identity at each peak are annotated using this trained neural network architecture. By mutating sequences in silico and re-applying the neural network, we are able to gauge the impact of a transition mutation on the binding of any human transcription factor. The annotated output can be visualized in a variety of 1D, 2D and 3D ways overlaid with existing bodies of knowledge, such as GWAS results.
Once a biologist has identified a particular CRE of interest, the difficulty of a Base Editor 2 (BE2)-mediated transition mutation can be quantitatively assessed and the mutation induced in a model organism.
Figure 0.1: Graphical Abstract
Figure 1.1: Detailed Abstract
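
The in-silico mutation step described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the Euplotid code base: the function names (`toy_binding_score`, `transition_effects`) and the small position weight matrix are hypothetical, and the matrix with its made-up values merely stands in for a trained neural network. The sketch enumerates every single-base transition (A<->G, C<->T) in a sequence and reports how much each one changes the model's binding score.

```python
# Transitions swap purine for purine (A<->G) or pyrimidine for pyrimidine (C<->T).
TRANSITIONS = {"A": "G", "G": "A", "C": "T", "T": "C"}

# Hypothetical stand-in for a trained binding model: a toy 4-bp position
# weight matrix that prefers the motif ACGT. The values are made up.
TOY_PWM = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.0},
]


def toy_binding_score(seq):
    """Return the best motif match over all windows of the sequence."""
    width = len(TOY_PWM)
    if len(seq) < width:
        raise ValueError("sequence is shorter than the motif")
    return max(
        sum(TOY_PWM[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    )


def transition_effects(seq, score_fn):
    """Score every single transition mutation against the reference.

    Returns a list of (position, ref_base, alt_base, delta) tuples, where
    delta is the mutant score minus the reference score.
    """
    seq = seq.upper()
    ref_score = score_fn(seq)
    effects = []
    for pos, ref in enumerate(seq):
        alt = TRANSITIONS[ref]  # raises KeyError for non-ACGT characters
        mutant = seq[:pos] + alt + seq[pos + 1:]
        effects.append((pos, ref, alt, score_fn(mutant) - ref_score))
    return effects


if __name__ == "__main__":
    for pos, ref, alt, delta in transition_effects("TTACGTTT", toy_binding_score):
        print(f"{pos}\t{ref}>{alt}\t{delta:+.1f}")
```

Passing the scoring function as an argument keeps the mutation scan independent of the model, so under this assumption any trained predictor that maps a sequence to a score could be substituted for the toy matrix.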