RT Journal Article
SR Electronic
T1 Deep learning and computer vision will transform entomology
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2020.07.03.187252
DO 10.1101/2020.07.03.187252
A1 Toke T. Høye
A1 Johanna Ärje
A1 Kim Bjerge
A1 Oskar L. P. Hansen
A1 Alexandros Iosifidis
A1 Florian Leese
A1 Hjalte M. R. Mann
A1 Kristian Meissner
A1 Claus Melvad
A1 Jenni Raitoharju
YR 2020
UL http://biorxiv.org/content/early/2020/07/04/2020.07.03.187252.abstract
AB Most animal species on Earth are insects, and recent reports suggest that their abundance is in drastic decline. Although these reports come from a wide range of insect taxa and regions, the evidence to assess the extent of the phenomenon is still sparse. Insect populations are challenging to study and most monitoring methods are labour-intensive and inefficient. Advances in computer vision and deep learning provide potential new solutions to this global challenge. Cameras and other sensors can effectively, continuously, and non-invasively perform entomological observations throughout diurnal and seasonal cycles. The physical appearance of specimens can also be captured by automated imaging in the lab. When trained on these data, deep learning models can provide estimates of insect abundance, biomass, and diversity. Further, deep learning models can quantify variation in phenotypic traits, behaviour, and interactions. Here, we connect recent developments in deep learning and computer vision to the urgent demand for more cost-efficient monitoring of insects and other invertebrates. We present examples of sensor-based monitoring of insects. We show how deep learning tools can be applied to the big data outputs to derive ecological information and discuss the challenges that lie ahead for the implementation of such solutions in entomology.
We identify four focal areas that will facilitate this transformation: 1) validation of image-based taxonomic identification, 2) generation of sufficient training data, 3) development of public, curated reference databases, and 4) solutions to integrate deep learning and molecular tools.
Significance statement
Insect populations are challenging to study, but computer vision and deep learning provide opportunities for continuous and non-invasive monitoring of biodiversity around the clock and over entire seasons. These tools can also facilitate the processing of samples in a laboratory setting. Automated imaging in particular can provide an effective way of identifying and counting specimens to measure abundance. We present examples of sensors and devices of relevance to entomology and show how deep learning tools can convert the big data streams into ecological information. We discuss the challenges that lie ahead and identify four focal areas to make deep learning and computer vision game changers for entomology.
Competing Interest Statement
The authors have declared no competing interest.
Bin picking: an industrial term for robots that pick up one of many objects randomly placed in a container.
Convolutional Neural Network (CNN): a deep learning algorithm in the family of neural networks with several different layers, commonly applied for image recognition and classification. A CNN can be trained to recognize various objects and patterns in an image. A CNN has four main operations: convolution, activation functions, subsampling, and fully connected layers.
During training, the learnable parameters of each convolutional and fully connected layer are adjusted so that the CNN can recognize the different patterns in the training data; the trained network is then used for final image classification.
Data augmentation: a technique that can be used to artificially expand the size of a training dataset by creating modified images with objects of interest for classification.
Machine learning: a subset of artificial intelligence concerned with creating algorithms that can change themselves without human intervention to produce the desired result, by being fed structured data.
Deep learning: a subset of machine learning in which algorithms are created and function as in machine learning, but are arranged in many levels, each providing a different interpretation of the data it conveys.
DNA barcoding: identification of a species using a short, standardised gene fragment.
Initialization: description of an object to be tracked.
Training data: classified images (e.g. images of known species identified by experts) that are recorded to train a deep learning model.
Precision: the number of true positives divided by the sum of true positives and false positives.
Recall: also called the true positive rate; the number of true positives divided by the sum of true positives and false negatives.
Classification accuracy: the sum of true positives and true negatives divided by the total number of specimens.
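
The definitions of precision, recall, and classification accuracy given above can be illustrated with a minimal sketch. The function name `classification_metrics` and the example counts are hypothetical and are not taken from the article; the sketch assumes a single class of interest for which true and false positives and negatives have already been counted.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute precision, recall, and accuracy from confusion counts.

    tp, fp, tn, fn are the numbers of true positives, false positives,
    true negatives, and false negatives (hypothetical inputs).
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one specimen is required")

    # Precision: true positives / (true positives + false positives).
    precision = tp / (tp + fp) if (tp + fp) > 0 else math.nan
    # Recall (true positive rate): true positives / (true positives + false negatives).
    recall = tp / (tp + fn) if (tp + fn) > 0 else math.nan
    # Classification accuracy: correct decisions / total number of specimens.
    accuracy = (tp + tn) / total

    return {"precision": precision, "recall": recall, "accuracy": accuracy}


# Illustrative counts only: 200 specimens, 90 of which belong to the target species.
example = classification_metrics(tp=80, fp=20, tn=90, fn=10)
print(example)
```

Precision and recall are reported as NaN when their denominator is zero, which is one possible convention; other tools may return zero or raise a warning in that case.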