Abstract
Background Haemonchus contortus is a globally distributed and economically important gastrointestinal pathogen of small ruminants, and has become the key nematode model for studying anthelmintic resistance and other parasite-specific traits among a wider group of parasites including major human pathogens. Two draft genome assemblies for H. contortus were reported in 2013; however, both were highly fragmented, incomplete, and differed from one another in important respects. While the introduction of long-read sequencing has significantly increased the rate of production and contiguity of de novo genome assemblies broadly, achieving high-quality genome assemblies for small, genetically diverse, outcrossing eukaryotic organisms such as H. contortus remains a significant challenge.
Results Here, we report using PacBio long-read and OpGen and 10X Genomics long-molecule methods to generate a highly contiguous 283.4 Mbp chromosome-scale genome assembly including a resolved sex chromosome. We show a remarkable pattern of almost complete conservation of chromosome content (synteny) with Caenorhabditis elegans, but almost no conservation of gene order. Long-read transcriptome sequence data have allowed us to define coordinated transcriptional regulation throughout the life cycle of the parasite, and to refine our understanding of cis- and trans-splicing relative to that observed in C. elegans. Finally, we use this assembly to give a comprehensive picture of chromosome-wide genetic diversity both within a single isolate and globally.
Conclusions The H. contortus MHco3(ISE).N1 genome assembly presented here represents the most contiguous and resolved nematode assembly outside of the Caenorhabditis genus to date, together with one of the highest-quality sets of predicted gene features. These data provide a high-quality comparison for understanding the evolution and genomics of Caenorhabditis and other nematodes, and extend the experimental tractability of this model parasitic nematode in understanding pathogen biology, drug discovery and vaccine development, and important adaptive traits such as drug resistance.
Footnotes
Alan Tracey: alt{at}sanger.ac.uk; Nancy Holroyd: neh{at}sanger.ac.uk; Wojtek Bazant: wojtek.bazant{at}sanger.ac.uk; Helen Beasley: hb521{at}cam.ac.uk; Karen Brooks: kd1{at}sanger.ac.uk; Axel Martinelli: axel.martinelli{at}gmail.com; Michael A. Quail: mq1{at}sanger.ac.uk; Faye Rodgers: fr7{at}sanger.ac.uk; Geetha Sankaranarayanan: gs13{at}sanger.ac.uk; Matthew Berriman: mb4{at}sanger.ac.uk; Roz Laing: Rosalind.Laing{at}glasgow.ac.uk; Collette Britton: Collette.Britton{at}glasgow.ac.uk; Kirsty Maitland: Kirsty.Maitland{at}glasgow.ac.uk; Eileen Devaney: Eileen.Devaney{at}glasgow.ac.uk; David Bartley: Dave.Bartley{at}moredun.ac.uk; Robin Beech: robin.beech{at}mcgill.ca; Jennifer D. Noonan: jennifer.noonan{at}mail.mcgill.ca; Neil Sargison: Neil.Sargison{at}ed.ac.uk; Umer Chaudhry: Umer.Chaudhry{at}ed.ac.uk; Michael Paulini: mh6{at}ebi.ac.uk; Kevin Howe: klh{at}ebi.ac.uk; Elizabeth Redman: elmredma{at}ucalgary.ca; Janneke Wit: jwit{at}ucalgary.ca; John Gilleard: jsgillea{at}ucalgary.ca; Guillaume Salle: guillaume.salle{at}inrae.fr; Muhammad Zubair Shabbir: shabbirmz{at}uvas.edu.pk
Abbreviations
- BUSCO: Benchmarking Universal Single-Copy Orthologs
- CCS: circular consensus sequencing
- CEGMA: Core Eukaryotic Genes Mapping Approach
- DCC: dosage compensation complex
- McMaster: draft genome assembly of the H. contortus Australian isolate McMaster, published in Schwarz et al. 2013
- DSN: duplex-specific nuclease
- PCA: principal component analysis
- SL: spliced leader
- V1: version 1 of the H. contortus MHco3(ISE).N1 genome, published in Laing et al. 2013
- V3: version 3 of the H. contortus MHco3(ISE).N1 genome, unpublished
- V4: version 4 of the H. contortus MHco3(ISE).N1 genome, presented here for the first time