%0 Journal Article
%A Sarah D. Stellwagen
%A Rebecca L. Renberg
%T Towards Spider Glue: Long-read scaffolding for extreme length and repetitious silk family genes AgSp1 and AgSp2 with insights into functional adaptation
%D 2018
%R 10.1101/492025
%J bioRxiv
%P 492025
%X The aggregate gland glycoprotein glue coating the prey-capture threads of orb weaving and cobweb weaving spider webs is comprised of silk protein spidroins (spider fibroins) encoded by two members of the silk gene family. It functions to retain prey that make contact with the web, but differs from solid silk fibers as it is a viscoelastic, amorphic, wet adhesive that is responsive to environmental conditions. Most spidroins are extremely large, highly repetitive genes that are impossible to sequence using only short-read technology. We sequenced for the first time the complete genomic Aggregate Spidroin 1 (AgSp1) and Aggregate Spidroin 2 (AgSp2) glue genes of Argiope trifasciata by using error-prone long reads to scaffold for high accuracy short reads. The massive coding sequences are 42,270 bp (AgSp1) and 20,526 bp (AgSp2) in length, the largest silk genes currently described. The majority of the predicted amino acid sequence of AgSp1 consists of two similar but distinct motifs that are repeated ~40 times each, while AgSp2 contains ~48 repetitions of an AgSp1-similar motif, interspersed by regions high in glutamine. Comparisons of AgSp repetitive motifs from orb web and cobweb spiders show regions of strict conservation followed by striking diversification. Glues from these two spider families have evolved contrasting material properties in adhesion, extensibility, and elasticity, which we link to mechanisms established for related silk genes in the same family.
Full-length aggregate spidroin sequences from diverse species with differing material characteristics will provide insights for designing tunable bio-inspired adhesives for a variety of unique purposes. List of Abbreviations: AgSp1, Aggregate Spidroin 1; AgSp2, Aggregate Spidroin 2; RM, Repeat Motif; NTD, N-terminal domain; NTR, N-terminal repeats; NTT, N-terminal transition; CTT, C-terminal transition; CTD, C-terminal domain; QRR, Glutamine rich region.
%U https://www.biorxiv.org/content/biorxiv/early/2018/12/10/492025.full.pdf