RT Journal Article
SR Electronic
T1 DeepGANnel: Synthesis of fully annotated single molecule patch-clamp data using generative adversarial networks
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2020.06.25.171918
DO 10.1101/2020.06.25.171918
A1 Numan Celik
A1 Sam T. M. Ball
A1 Elaheh Sayari
A1 Lina Abdul Kadir
A1 Fiona O’Brien
A1 Richard Barrett-Jolley
YR 2020
UL http://biorxiv.org/content/early/2020/06/27/2020.06.25.171918.abstract
AB Understanding and accurately quantifying ion channel molecule gating in real time is vital for knowledge of cell membrane behaviour, drug discovery and toxicity screening. Doing this with single-molecule resolution first requires the detection of individual protein pore opening and closing transitions and the construction of a so-called idealised record, which indicates sample-point by sample-point whether a given molecule is open or closed. Creating this record can be difficult, since patch-clamp electrophysiology data can be noisy or contain multiple ion channel molecules. We have recently developed a deep learning model to achieve this, called Deep-Channel, but further development is limited by the massive datasets needed to train and validate models. In the past, this problem has been tackled by simulation of single molecule activity from Markov models with the addition of pseudo-random noise. In the present report we develop a new method to synthesise raw data, based on generative adversarial networks (GANs). The obstacle to applying a GAN directly has been that, whilst there are methods to generate classified output image by image, there has been no method to generate an entire timeseries with parallel idealisation, sample-point by sample-point. In this paper, we overcome this problem with DeepGANnel, a model that splits the training data, raw records and their parallel idealised records, into different rows of image windows and passes these data through a progressive GAN. 
This new methodology allows generation of realistic, idealisation-synchronised single molecule patch-clamp data, without the biases inherent in pseudo-random simulation methods. This method will be useful for the development of single molecule analysis methods and may in the future prove useful for the generation of biological models that include stochastic data at single molecule resolution. The model is easily extendable to other timeseries data requiring parallel labelling, such as labelled ECG. Competing Interest Statement: The authors have declared no competing interest.