PT - JOURNAL ARTICLE
AU - Mario González-Jiménez
AU - Simon A. Babayan
AU - Pegah Khazaeli
AU - Margaret Doyle
AU - Finlay Walton
AU - Elliott Reedy
AU - Thomas Glew
AU - Mafalda Viana
AU - Lisa Ranford-Cartwright
AU - Abdoulaye Niang
AU - Doreen J. Siria
AU - Fredros O. Okumu
AU - Abdoulaye Diabaté
AU - Heather M. Ferguson
AU - Francesco Baldini
AU - Klaas Wynne
TI - Prediction of malaria mosquito species and population age structure using mid-infrared spectroscopy and supervised machine learning
AID - 10.1101/414342
DP - 2018 Jan 01
TA - bioRxiv
PG - 414342
4099 - http://biorxiv.org/content/early/2018/09/12/414342.short
4100 - http://biorxiv.org/content/early/2018/09/12/414342.full
AB - Despite global efforts in the fight against malaria, the disease is resurging. One of the main causes is the resistance to insecticides that Anopheles mosquitoes, the vectors of the disease, have developed. Anopheles must survive for at least 12 days to be able to transmit malaria. Therefore, to evaluate and improve malaria vector control interventions, it is imperative to monitor and accurately estimate the age distribution of mosquito populations as well as total population sizes. However, estimating mosquito age is currently a slow, imprecise, and labour-intensive process that can only distinguish under- from over-four-day-old female mosquitoes. Here, we demonstrate a machine-learning-based approach that uses mid-infrared spectra of mosquitoes to characterize simultaneously, and with unprecedented accuracy, both the age and the species identity of females of the malaria vectors Anopheles gambiae and An. arabiensis within their respective populations. The predicted age structures were statistically indistinguishable from the true modelled distributions. The method has a negligible cost per mosquito, does not require highly trained personnel, is substantially faster than current techniques, and so can be easily applied in both laboratory and field settings.
Our results show that, with larger mid-infrared spectroscopy data sets, this technique can be further improved and expanded to vectors of other diseases such as Zika and dengue.