De novo assembly of a young Drosophila Y chromosome using single-molecule sequencing and chromatin conformation capture

PLoS Biol. 2018 Jul 30;16(7):e2006348. doi: 10.1371/journal.pbio.2006348. eCollection 2018 Jul.

Abstract

While short-read sequencing technology has resulted in a sharp increase in the number of species with genome assemblies, these assemblies are typically highly fragmented. Repeats pose the largest challenge for reference genome assembly, and pericentromeric regions and the repeat-rich Y chromosome are typically excluded from sequencing projects. Here, we assemble the genome of Drosophila miranda using long reads for contig formation; chromatin interaction maps for scaffolding; and short reads, optical mapping, and bacterial artificial chromosome (BAC) clone sequencing for consensus validation. Our assembly recovers entire chromosomes and contains large fractions of repetitive DNA, including about 41.5 Mb of pericentromeric and telomeric regions, and >100 Mb of the recently formed, highly repetitive neo-Y chromosome. While Y chromosome evolution is typically characterized by global sequence loss and shrinkage, the neo-Y increased in size almost 3-fold because of the accumulation of repetitive sequences. Our high-quality assembly allows us to reconstruct the chromosomal events that have led to the unusual sex chromosome karyotype in D. miranda, including the independent de novo formation of a pair of sex chromosomes at two distinct time points, or the reversion of a former Y chromosome to an autosome.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Animals
  • Base Sequence
  • Centromere / metabolism
  • Chromatin / chemistry*
  • Drosophila / genetics*
  • Evolution, Molecular
  • Genes, Insect
  • Karyotype
  • Male
  • Nucleic Acid Conformation*
  • Repetitive Sequences, Nucleic Acid / genetics
  • Reproducibility of Results
  • Sequence Analysis, DNA*
  • Y Chromosome / genetics*

Substances

  • Chromatin