RT Journal Article SR Electronic T1 An Autoencoder and Artificial Neural Network-based Method to Estimate Parity Status of Wild Mosquitoes from Near-infrared Spectra JF bioRxiv FD Cold Spring Harbor Laboratory SP 2020.01.25.919878 DO 10.1101/2020.01.25.919878 A1 Masabho P. Milali A1 Samson S. Kiware A1 Nicodem J. Govella A1 Fredros Okumu A1 Naveen Bansal A1 Serdar Bozdag A1 Jacques D. Charlwood A1 Marta Maia A1 Sheila B. Ogoma A1 Floyd E. Dowell A1 George F. Corliss A1 Maggy T. Sikulu-Lord A1 Richard J. Povinelli YR 2020 UL http://biorxiv.org/content/early/2020/02/09/2020.01.25.919878.abstract AB Background After mating, female mosquitoes need animal blood to develop their eggs. In the process of acquiring blood, they may acquire pathogens that cause diseases in humans such as malaria, Zika, dengue, and chikungunya. Knowing the parity status of mosquitoes is therefore useful in the control and evaluation of mosquito-borne infectious diseases, where parous mosquitoes are assumed to be potentially infectious. Ovary dissections, which are currently used to determine the parity status of mosquitoes, are tedious and can be performed by only a few experts. An alternative to ovary dissections is near-infrared spectroscopy (NIRS), which can estimate the age in days and the infectious state of laboratory and semi-field reared mosquitoes with accuracies between 80 and 99%. No study has tested the accuracy of NIRS for estimating the parity status of wild mosquitoes. Methods and results In this study, we train artificial neural network (ANN) models on NIR spectra to estimate the parity status of wild mosquitoes. We use four different datasets: An. arabiensis collected from Minepa, Tanzania (Minepa-ARA); An. gambiae collected from Muleba, Tanzania (Muleba-GA); An. gambiae collected from Burkina Faso (Burkina-GA); and An. gambiae from Muleba and Burkina Faso combined (Muleba-Burkina-GA). 
We train ANN models on datasets with spectra preprocessed according to previous protocols. We then use autoencoders to reduce the spectra feature dimensions from 1851 to 10 and re-train ANN models. Before the autoencoder was applied, ANN models estimated the parity status of mosquitoes in Minepa-ARA, Muleba-GA, Burkina-GA, and Muleba-Burkina-GA with out-of-sample accuracies of 81.9 ± 2.8% (N=927), 68.7 ± 4.8% (N=140), 80.3 ± 2.0% (N=158), and 75.7 ± 2.5% (N=298), respectively. With the autoencoder, ANN models tested on out-of-sample data achieved accuracies of 97.1 ± 2.2% (N=927), 89.8 ± 1.7% (N=140), 93.3 ± 1.2% (N=158), and 92.7 ± 1.8% (N=298) for Minepa-ARA, Muleba-GA, Burkina-GA, and Muleba-Burkina-GA, respectively. Conclusion These results show that combining an autoencoder with an ANN trained on NIR spectra yields models that can serve as an alternative tool for estimating the parity status of wild mosquitoes, especially since NIRS is high-throughput, reagent-free, and simple to use compared with ovary dissections.
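
The two-stage idea in the abstract (compress each 1851-feature spectrum to 10 values, then classify the compressed codes) can be illustrated with a minimal sketch. This is not the authors' implementation. As a simplifying assumption, the autoencoder is replaced by PCA computed with an SVD, which is the closed-form optimum of a linear autoencoder, and the ANN is replaced by a single logistic unit trained by gradient descent. The data are synthetic, and all function and variable names are hypothetical.

```python
import numpy as np

N_FEATURES = 1851  # spectral features per mosquito (from the abstract)
N_CODES = 10       # reduced dimension (from the abstract)


def make_synthetic_spectra(n, rng):
    """Hypothetical stand-in data: 3 latent factors drive all features."""
    latent = rng.normal(size=(n, 3))
    mixing = rng.normal(size=(3, N_FEATURES))
    spectra = latent @ mixing + 0.1 * rng.normal(size=(n, N_FEATURES))
    labels = (latent[:, 0] > 0).astype(float)  # 1 = "parous", 0 = "nulliparous"
    return spectra, labels


def fit_linear_encoder(x_train, n_codes):
    """PCA via SVD: stand-in for a trained (linear) autoencoder."""
    mean = x_train.mean(axis=0)
    _, _, vt = np.linalg.svd(x_train - mean, full_matrices=False)
    return mean, vt[:n_codes]


def encode(x, mean, components):
    """Project spectra onto the low-dimensional code space."""
    return (x - mean) @ components.T


def decode(codes, mean, components):
    """Map codes back to an approximate spectrum."""
    return codes @ components + mean


def train_logistic(codes, labels, lr=0.1, steps=300):
    """Single logistic unit trained by full-batch gradient descent."""
    w = np.zeros(codes.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(codes @ w + b)))
        grad = p - labels
        w -= lr * codes.T @ grad / len(labels)
        b -= lr * grad.mean()
    return w, b


def predict(codes, w, b):
    """Threshold the logit at zero to obtain a 0/1 class label."""
    return (codes @ w + b > 0).astype(float)


rng = np.random.default_rng(0)
spectra, labels = make_synthetic_spectra(200, rng)
x_train, x_test = spectra[:150], spectra[150:]
y_train, y_test = labels[:150], labels[150:]

# The encoder is fitted on training spectra only, then reused on held-out data.
mean, components = fit_linear_encoder(x_train, N_CODES)
scale = encode(x_train, mean, components).std(axis=0).max()
codes_train = encode(x_train, mean, components) / scale
codes_test = encode(x_test, mean, components) / scale

w, b = train_logistic(codes_train, y_train)
accuracy = float((predict(codes_test, w, b) == y_test).mean())
reconstruction_mse = float(
    np.mean((decode(codes_train * scale, mean, components) - x_train) ** 2)
)
print(f"code shape: {codes_train.shape}, held-out accuracy: {accuracy:.2f}")
```

Fitting the encoder on the training split alone mirrors the out-of-sample evaluation described in the abstract. A nonlinear autoencoder and a multi-layer ANN would replace `fit_linear_encoder` and `train_logistic` in a fuller version.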