RT Journal Article SR Electronic T1 Continuous generation of tandem transposable elements in Drosophila populations provides a substrate for the evolution of satellite DNA JF bioRxiv FD Cold Spring Harbor Laboratory SP 158386 DO 10.1101/158386 A1 Michael P. McGurk A1 Daniel A. Barbash YR 2017 UL http://biorxiv.org/content/early/2017/07/02/158386.abstract AB Eukaryotic genomes are replete with repeated sequences, in the form of transposable elements (TEs) dispersed across the genome or as satellite arrays, large stretches of tandemly repeated sequence. Both types of repeats have attracted much interest due to their considerable variation across species, unusual patterns of evolution, and implications in human disease. The boundary between these categories of repeats is fuzzy, however: many satellite sequences clearly originated as TEs, though it is unclear how mobile genetic parasites occasionally transform into sometimes megabase-sized arrays. Whatever the generative mechanism, in Drosophila melanogaster two TE families have undergone this transition, hinting that its early stages might be observable in a survey of population variation. However, the best available population resources, short-read DNA sequences, are often considered to be of limited utility for the analysis of repetitive DNA due to the challenge of mapping individual repeats to unique genomic locations. Here we develop a new pipeline called ConTExt, which demonstrates that reference-free analysis of paired-end Illumina data can identify a wide range of structures of repetitive DNA and reveal a quantitative understanding of repeat sequence variation. Analyzing 85 genomes from five populations of Drosophila melanogaster, we discover that all three classes of TEs commonly exist as tandem dimers. Our results further suggest that insertion site preference is the major mechanism driving the creation of tandem dimers and that, consequently, dimers form rapidly during periods of active transposition. 
This abundance of TE dimers has the potential to provide the source material for future expansion into satellite arrays, and we discover one such rare copy number expansion of the DNA transposon Hobo to ∼16 tandem copies in a single line. The very process that defines TEs, transposition, thus constantly adds to the library of sequences from which new satellite DNAs arise.