New Results

High resolution species assignment of Anopheles mosquitoes using k-mer distances on targeted sequences

Marilou Boddé, Alex Makunin, Diego Ayala, Lemonde Bouafa, Abdoulaye Diabaté, Uwem Friday Ekpo, Mahamadi Kientega, Gilbert Le Goff, Boris K. Makanga, Marc F. Ngangue, Olaitan Olamide Omitola, Nil Rahola, Frederic Tripet, Richard Durbin, Mara K. N. Lawniczak
doi: https://doi.org/10.1101/2022.03.18.484650
Marilou Boddé
1Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK
2Wellcome Sanger Institute, Hinxton, CB10 1SA, UK
Alex Makunin
2Wellcome Sanger Institute, Hinxton, CB10 1SA, UK
Diego Ayala
3Institut de Recherche pour le Développement, MIVEGEC, Univ. Montpellier, CNRS, IRD, Montpellier, 34394, France
Lemonde Bouafa
3Institut de Recherche pour le Développement, MIVEGEC, Univ. Montpellier, CNRS, IRD, Montpellier, 34394, France
Abdoulaye Diabaté
4Institut de Recherche en Sciences de la Santé, Direction Régionale de l’Ouest, 399 Avenue de la Liberté, (+226) 20981880, Bobo-Dioulasso, Burkina Faso
Uwem Friday Ekpo
5Federal University of Agriculture Abeokuta, Alabata 110001, Abeokuta, Ogun State, Nigeria
Mahamadi Kientega
4Institut de Recherche en Sciences de la Santé, Direction Régionale de l’Ouest, 399 Avenue de la Liberté, (+226) 20981880, Bobo-Dioulasso, Burkina Faso
Gilbert Le Goff
3Institut de Recherche pour le Développement, MIVEGEC, Univ. Montpellier, CNRS, IRD, Montpellier, 34394, France
Boris K. Makanga
6Institut de Recherche en Ecologie Tropicale, BP13354, Libreville, Gabon
Marc F. Ngangue
7Centre International de Recherches Medicales de Franceville, Franceville, Gabon
Olaitan Olamide Omitola
5Federal University of Agriculture Abeokuta, Alabata 110001, Abeokuta, Ogun State, Nigeria
Nil Rahola
3Institut de Recherche pour le Développement, MIVEGEC, Univ. Montpellier, CNRS, IRD, Montpellier, 34394, France
Frederic Tripet
8Centre for Applied Entomology and Parasitology, Keele University, Newcastle, ST5 5BG, UK
Richard Durbin
1Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK
2Wellcome Sanger Institute, Hinxton, CB10 1SA, UK
Mara K. N. Lawniczak
2Wellcome Sanger Institute, Hinxton, CB10 1SA, UK
  • For correspondence: mara@sanger.ac.uk

Abstract

The ANOSPP amplicon panel is a genus-wide targeted sequencing panel to facilitate large-scale monitoring of Anopheles species diversity. Combining information from the 62 nuclear amplicons present in the ANOSPP panel allows for a more nuanced species assignment than single gene (e.g. COI) barcoding, which is desirable in the light of permeable species boundaries. Here, we present NNoVAE, a method using Nearest Neighbours (NN) and Variational Autoencoders (VAE), which we apply to k-mers resulting from the ANOSPP amplicon sequences in order to hierarchically assign species identity. The NN step assigns a sample to a species-group by comparing the k-mers arising from each haplotype’s amplicon sequence to a reference database. The VAE step is required to distinguish between closely related species, and also has sufficient resolution to reveal population structure within species. In tests on independent samples with over 80% amplicon coverage, NNoVAE correctly classifies to species level 98% of samples within the An. gambiae complex and 89% of samples outside the complex. We apply NNoVAE to over two thousand new samples from Burkina Faso and Gabon, identifying unexpected species in Gabon. NNoVAE presents an approach that may be of value to other targeted sequencing panels, and is a method that will be used to survey Anopheles species diversity and Plasmodium transmission patterns through space and time on a large scale, with plans to analyse half a million mosquitoes in the next five years.
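
The nearest-neighbour idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the k-mer length, the distance measure, or the reference database layout, so the value of `k`, the normalised count distance, and the function names below (`kmer_counts`, `kmer_distance`, `nearest_group`) are all assumptions chosen for illustration. The VAE step is not shown.

```python
from collections import Counter
from typing import Dict, List, Tuple


def kmer_counts(seq: str, k: int = 8) -> Counter:
    """Count all overlapping k-mers in a sequence (k = 8 is an assumed default)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distance(a: Counter, b: Counter) -> float:
    """Normalised L1 distance between two k-mer count tables.

    0.0 means identical k-mer content, 1.0 means no shared k-mers.
    This metric is an illustrative assumption, not taken from the paper.
    """
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 0.0
    diff = sum(abs(a[key] - b[key]) for key in a.keys() | b.keys())
    return diff / total


def nearest_group(
    query: str, reference: Dict[str, List[str]], k: int = 8
) -> Tuple[str, float]:
    """Return the label of the closest reference sequence and its distance.

    `reference` maps a hypothetical species-group label to example
    sequences for one amplicon target.
    """
    q = kmer_counts(query, k)
    best_label, best_dist = "", float("inf")
    for label, seqs in reference.items():
        for s in seqs:
            d = kmer_distance(q, kmer_counts(s, k))
            if d < best_dist:
                best_label, best_dist = label, d
    return best_label, best_dist


# Toy data only: these sequences are invented, not real ANOSPP amplicons.
toy_reference = {
    "group_A": ["ACGTACGTACGTAAAA"],
    "group_B": ["TTTTGGGGCCCCTTTT"],
}
label, dist = nearest_group("ACGTACGTACGTAAAT", toy_reference, k=4)
print(label, round(dist, 3))
```

In a panel with many targets, a per-sample call would presumably combine such per-amplicon comparisons across all haplotypes. How that combination is done is described in the full text rather than the abstract, so it is left out here.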

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

  • * mmb52@camb.ac.uk, mara@sanger.ac.uk

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted March 20, 2022.

Subject Area

  • Evolutionary Biology