Abstract
Background Hermetia illucens L. (Diptera: Stratiomyidae), the Black Soldier Fly (BSF), is an increasingly important mass-reared entomological resource for bioconversion of organic material into animal feed.
Results We generated a high-quality chromosome-scale genome assembly of the BSF using Pacific Biosciences (PacBio), 10X Genomics linked-read and high-throughput chromosome conformation capture (Hi-C) sequencing technologies. Scaffolding the final assembly with Hi-C data produced a highly contiguous 1.01 Gb genome, with 99.75% of scaffolds assembled into pseudo-chromosomes representing seven chromosomes, and contig and scaffold N50 values of 16.01 Mb and 180.46 Mb, respectively. The assembly achieved a BUSCO completeness of 98.6%. We masked 67.32% of the genome as repetitive sequences and annotated a total of 17,664 protein-coding genes using the BRAKER2 pipeline. We analysed an established lab population to investigate the genomic variation and architecture of the BSF, revealing six autosomes and identifying an X chromosome. Additionally, we estimated the inbreeding coefficient (1.9%) of a lab population by assessing runs of homozygosity (ROH). This revealed numerous inbreeding events, including recent long runs of homozygosity on chromosome five.
Conclusions Release of this novel chromosome-scale BSF genome assembly will provide an improved platform for further genomic studies and for functional characterisation of candidate regions of artificial selection. This reference sequence will be an essential tool for future genetic modification and for functional and population genomics.
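Two of the summary statistics quoted in the abstract, N50 and the ROH-based inbreeding coefficient, can be illustrated with a minimal sketch. It assumes the conventional definitions: N50 is the length of the sequence at which the cumulative length, counted from the longest sequence down, first reaches half the assembly size, and the inbreeding coefficient is the fraction of the analysed genome that lies in runs of homozygosity. The abstract does not state the exact formula used, so the second definition is an assumption. The function names `n50` and `f_roh` are hypothetical, and the input values are toy numbers, not data from this study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Sequences are sorted from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    summed length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


def f_roh(roh_lengths, genome_length):
    """Return the fraction of the genome covered by runs of homozygosity.

    Assumed definition: summed ROH length divided by the length of the
    genome region that was scanned for ROH.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return sum(roh_lengths) / genome_length


# Toy inputs (hypothetical, in base pairs).
toy_scaffolds = [50, 40, 30, 20, 10]
toy_rohs = [1_000_000, 900_000]
toy_genome = 100_000_000

toy_n50 = n50(toy_scaffolds)
toy_f = f_roh(toy_rohs, toy_genome)
print(toy_n50, toy_f)
```

With the toy scaffold lengths, the two longest sequences together (90 of 150) are the first to cover half of the total, so the N50 is the length of the second one. The toy ROH segments sum to 1.9 Mb over a 100 Mb genome, giving a coefficient of 0.019, the same scale on which the abstract reports 1.9%.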
Competing Interest Statement
This study was supported in kind by Better Origin. MP is CSO at Better Origin. CDJ is a scientific advisor at Better Origin.
List of abbreviations
- BSF: Black Soldier Fly
- BSFL: Black Soldier Fly larvae
- PacBio: Pacific Biosciences
- BUSCO: Benchmarking Universal Single-Copy Orthologs
- GC: guanine-cytosine
- Gb: gigabase
- Mb: megabase
- kb: kilobase
- bp: base pairs
- RNA-seq: RNA sequencing
- Hi-C: high-throughput chromosome conformation capture
- BWA: Burrows-Wheeler Aligner
- PE: paired-end
- SMRT: single-molecule real-time
- SRA: Sequence Read Archive
- LINE: long interspersed nuclear element
- SINE: short interspersed nuclear element
- LTR: long terminal repeat
- STAR: Spliced Transcripts Alignment to a Reference
- ROH: runs of homozygosity
- SNP: single nucleotide polymorphism