RT Journal Article
SR Electronic
T1 Sequence and annotation of 42 cannabis genomes reveals extensive copy number variation in cannabinoid synthesis and pathogen resistance genes
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2020.01.03.894428
DO 10.1101/2020.01.03.894428
A1 Kevin J. McKernan
A1 Yvonne Helbert
A1 Liam T. Kane
A1 Heather Ebling
A1 Lei Zhang
A1 Biao Liu
A1 Zachary Eaton
A1 Stephen McLaughlin
A1 Sarah Kingan
A1 Primo Baybayan
A1 Gregory Concepcion
A1 Mark Jordan
A1 Alberto Riva
A1 William Barbazuk
A1 Timothy Harkins
YR 2020
UL http://biorxiv.org/content/early/2020/01/05/2020.01.03.894428.abstract
AB Cannabis is a diverse and polymorphic species. To better understand cannabinoid synthesis inheritance and its impact on pathogen resistance, we shotgun sequenced and assembled a Cannabis trio (sibling pair and their offspring) utilizing long read single molecule sequencing. This resulted in the most contiguous Cannabis sativa assemblies to date. These reference assemblies were further annotated with full-length male and female mRNA sequencing (Iso-Seq) to help inform isoform complexity, gene model predictions and identification of the Y chromosome. To further annotate the genetic diversity in the species, 40 male, female, and monoecious cannabis and hemp varietals were evaluated for copy number variation (CNV) and RNA expression. This identified multiple CNVs governing cannabinoid expression and 82 genes associated with resistance to Golovinomyces cichoracearum, the causal agent of powdery mildew in cannabis. Results indicated that breeding for plants with low tetrahydrocannabinolic acid (THCA) concentrations may result in deletion of pathogen resistance genes. Low THCA cultivars also have a polymorphism every 51 bases while dispensary grade high THCA cannabis exhibited a variant every 73 bases. A refined genetic map of the variation in cannabis can guide more stable and directed breeding efforts for desired chemotypes and pathogen-resistant cultivars.