Abstract
Background The genus Aethionema is the sister group to the core group of the Brassicaceae family that includes Arabidopsis thaliana and the Brassica crops. Thus, Aethionema is phylogenetically well placed for investigating genome and trait evolution across the family. We aimed to improve the quality of the draft reference genome of the annual species Aethionema arabicum and to construct the first Ae. arabicum genetic map. The improved reference genome and the genetic map were developed iteratively, each informing the other.
Results We started with the initially published genome (version 2.5). PacBio and MinION sequencing data, together with genetic map v2.5, were incorporated to produce the new reference genome v3.0. The improved genome contains 203 Mb of sequence, approximately 94% of which consists of called bases, assembled into 2,883 scaffolds. The N50 (10.3 Mb) represents an 80-fold improvement over the initial genome release. We generated a Recombinant Inbred Line (RIL) population derived from two ecotypes: Cyprus and Turkey (the reference genotype). Using a Genotyping by Sequencing (GBS) approach, we generated a high-density genetic map with 749 SNPs (v2.5) and then 632 SNPs (v3.0). The genetic map and reference genome were integrated, greatly improving the scaffolding of the reference genome into 11 linkage groups.
Conclusions We show that long-read sequencing data and genetic mapping are complementary, resulting in an improved genome assembly of Ae. arabicum. These resources will facilitate comparative genetic mapping in the Brassicaceae family and are also valuable for investigating a wide range of life history traits in Aethionema.
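For readers unfamiliar with the N50 statistic quoted in the Results, the following is a minimal Python sketch of how it is conventionally computed: scaffolds are sorted from longest to shortest, and the N50 is the length of the scaffold at which the running total first reaches half of the assembly size. The function name `n50` and the toy scaffold lengths are illustrative assumptions and are not taken from the Ae. arabicum assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    N50 is the length L such that scaffolds of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)  # longest scaffold first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (arbitrary units), not real data.
toy_lengths = [10, 8, 5, 4, 2, 1]
toy_n50 = n50(toy_lengths)
print(toy_n50)
```

A larger N50 indicates that more of the assembly sits in long scaffolds, which is why it is commonly used to compare the contiguity of successive genome releases.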
Footnotes
T-P.N.: phuong.nguyen{at}wur.nl, C.M.: cornelia.muehlich{at}biologie.uni-marburg.de, S.M.: setareh.mohammadin{at}wur.nl, E.vd.B.: evdbergh{at}ebi.ac.uk, A.E.P.: aplatts{at}nyu.edu, F.B.H.: fabian.haas{at}biologie.uni-marburg.de, S.A.R.: stefan.rensing{at}biologie.uni-marburg.de, M.E.S.: eric.schranz{at}wur.nl
List of abbreviations
- BAC: Bacterial Artificial Chromosome
- BS-seq: Bisulfite sequencing
- ChIP-seq: Chromatin ImmunoPrecipitation DNA-Sequencing
- CYP: Cyprus ecotype
- FPC: Finger Printed Contigs
- GFF: Generic/General Feature Format
- LG: Linkage group
- MinION: Oxford Nanopore MinION
- PacBio: Pacific Biosciences
- QTL: Quantitative trait loci
- RIL: Recombinant Inbred Line
- SNP: Single Nucleotide Polymorphism
- TUR: Turkey ecotype
- WGP: Whole Genome Profiling