RT Journal Article
SR Electronic
T1 Optical and physical mapping with local finishing enables megabase-scale resolution of agronomically important regions in the wheat genome
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 363465
DO 10.1101/363465
A1 G. Keeble-Gagnère
A1 P. Rigault
A1 J. Tibbits
A1 R. Pasam
A1 M. Hayden
A1 K. Forrest
A1 Z. Frenkel
A1 A. Korol
A1 E. Huang
A1 C. Cavanagh
A1 J. Taylor
A1 M. Abrouk
A1 A. Sharpe
A1 D. Konkin
A1 P. Sourdille
A1 B. Darrier
A1 F. Choulet
A1 A. Bernard
A1 S. Rochfort
A1 AM. Dimech
A1 N. Watson-Haigh
A1 U. Baumann
A1 P. Eckermann
A1 D. Fleury
A1 A. Juhasz
A1 S. Boisvert
A1 M-A. Nolin
A1 J. Doležel
A1 H. Šimková
A1 H. Toegelová
A1 Jan Šafář
A1 M-C. Luo
A1 F. Camara
A1 M. Pfeifer
A1 D. Isdale
A1 J. Nystrom-Persson
A1 IWGSC
A1 D-H Koo
A1 M. Tinning
A1 D. Cui
A1 Z. Ru
A1 R. Appels
YR 2018
UL http://biorxiv.org/content/early/2018/07/09/363465.abstract
AB Background: Numerous scaffold-level sequences for wheat are now being released and, in this context, we report on a strategy for improving the overall assembly to a level comparable to that of the human genome. Results: Using chromosome 7A of wheat as a model, sequence-finished megabase scale sections of this chromosome were established by combining a new independent assembly based on a BAC-based physical map, BAC pool paired end sequencing, chromosome arm specific mate-pair sequencing and Bionano optical mapping with the IWGSC RefSeq v1.0 sequence and its underlying raw data. The combined assembly results in 18 super-scaffolds across the chromosome. The value of finished genome regions is demonstrated for two approximately 2.5 Mb regions associated with yield and the grain quality phenotype of fructan carbohydrate grain levels.
In addition, the 50 Mb centromere region analysis incorporates cytological data, highlighting the importance of non-sequence data in the assembly of this complex genome region. Conclusions: Sufficient genome sequence information is now shown to be available for the wheat community to produce sequence-finished releases of each chromosome of the reference genome. The high-level completion identified that an array of seven fructosyl transferase genes underpins grain quality, and that yield attributes are affected by five F-box-only-protein-ubiquitin ligase domain genes and four root-specific lipid transfer domain genes. The completed sequence also includes the centromere.
Contig: consensus region of DNA sequence represented by overlapping sequence reads. Can have unresolved bases (N), but no gaps.
Scaffold: consensus region of DNA sequence represented by ordered (but not necessarily oriented) contigs, separated by gaps of known (estimated) length.
Island: genomic region represented by overlapping sets of DNA sequences (scaffolds), physical entities (optical map or molecule, physical clone), or both.
Super-scaffold: a portion of the genome sequence where scaffolds have been ordered and oriented relative to each other.
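
The contig, scaffold, and super-scaffold definitions above can be read as a nested data structure. The following Python sketch is only an illustration of that reading, not the authors' pipeline: every class and function name (`Contig`, `Scaffold`, `PlacedScaffold`, `SuperScaffold`, `reverse_complement`) is hypothetical, and writing a gap as a run of `N` of the estimated length is an assumed convention. Because the definitions do not say how scaffolds are joined inside a super-scaffold, the sketch returns the oriented member sequences as a list rather than inventing a joining rule.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Contig:
    """Gap-free consensus sequence; may still contain unresolved bases (N)."""
    name: str
    sequence: str


@dataclass
class Scaffold:
    """Ordered contigs separated by gaps of known (estimated) length."""
    name: str
    contigs: List[Contig]
    gap_lengths: List[int]  # gap_lengths[i] lies between contigs[i] and contigs[i + 1]

    def __post_init__(self) -> None:
        expected = max(len(self.contigs) - 1, 0)
        if len(self.gap_lengths) != expected:
            raise ValueError("need one gap length per pair of adjacent contigs")

    def sequence(self) -> str:
        """Join contigs, writing each gap as N repeated gap-length times (assumed convention)."""
        parts = []
        for i, contig in enumerate(self.contigs):
            parts.append(contig.sequence)
            if i < len(self.gap_lengths):
                parts.append("N" * self.gap_lengths[i])
        return "".join(parts)


_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement; N stays N."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass
class PlacedScaffold:
    """A scaffold together with the orientation assigned to it."""
    scaffold: Scaffold
    reverse: bool = False


@dataclass
class SuperScaffold:
    """Scaffolds that have been ordered and oriented relative to each other."""
    name: str
    members: List[PlacedScaffold]

    def sequences(self) -> List[str]:
        """Member sequences in order, each in its assigned orientation."""
        result = []
        for placed in self.members:
            seq = placed.scaffold.sequence()
            result.append(reverse_complement(seq) if placed.reverse else seq)
        return result


# Toy example with made-up sequences (not data from the paper).
scaffold = Scaffold("s1", [Contig("c1", "ACGT"), Contig("c2", "GGNA")], [3])
print(scaffold.sequence())  # ACGTNNNGGNA
```

In this reading, the key difference between the levels is what is known: a contig has no gaps, a scaffold adds order and gap sizes, and a super-scaffold adds orientation between scaffolds.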