A methyl-seq tool to capture genomic imprinted loci

Genomic imprinting is an original model of epigenetic regulation that results in parent-of-origin expression. Despite the critical role of imprinted genes in mammalian growth, metabolism and neuronal function, no molecular tool specifically targets them for systematic evaluation. Here, we optimized a novel methyl-seq system and compared it with the bisulfite-based standard to capture 165 candidate regions for genomic imprinting and ultimately detect parent-of-origin methylation, the main hallmark of imprinting.


Genomic imprinting (GI) is an original molecular phenomenon mediated by the apposition of epigenetic marks (DNA methylation and/or histone marks), leading to allele-specific expression that depends on parental origin 1. GI studies intersect with a broad range of biological fields, including evolutionary biology, developmental biology, molecular genetics and epigenetics. GI is involved in many phenotypes in humans and also contributes to the variability of major agronomic phenotypes 2,3. Imprinted genes are therefore highly attractive targets and biomarkers 4,5; they are found isolated or in clusters across the genome and represent 1% to 2% of the total gene content in the best-studied mammals. Parent-of-origin (PofO) expression is primarily controlled by differentially methylated regions (DMRs) whose methylation also depends on parental origin 1. Although knowledge about GI has advanced significantly, some technological bottlenecks still hinder progress on challenging scientific questions.

To assess whether and how GI is involved in the variability of complex phenotypes, it is critical to (i) map and characterize imprinted loci across the genome and (ii) simultaneously identify the parental origin of alleles and their methylation status. Rigorously characterizing imprintomes would require combining experimental designs such as reciprocal crosses 6 with genome-scale sequencing technologies 7,8. However, such costly methods cannot be used as routine molecular tools. Here, we optimized and compared capture-based methylation sequencing technologies targeting imprinted loci across the genome.

We performed our study in the pig (Sus scrofa) because porcine GI is largely under-characterized, despite wide-ranging implications not limited to the improvement of major agronomic phenotypes 9,10. The strategy implemented below may be applied to any other species with its own custom capture. We (i) selected 165 regions in the pig genome based on human and mouse orthologies 1,11 Fig. 1a-h). Both target capture efficiency and homogeneity of panels are comparable between AG and TB after optimizing the latter, reaching excellent levels (Fig. 1f and g). Specificity is, however, more favourable in TB, with much less off-target capture than in AG (Fig. 1h and i and Extended Data Fig. 1g-i). Regarding methylation evaluation and conversion, the enzymatic-based TB technology yielded higher numbers of total and methylated CpGs, as well as less non-CpG methylation, than the standard bisulfite-based AG technology (Extended Data Fig. 1j-n). In addition, we demonstrated better capture of GC-rich regions, including CpG islands, with TB technology, independently of region size (Fig. 1j-l). Thus, applying the novel TB approach to GI suggests that it outperforms the current technological standard for methylation quantification 13.
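As an illustration of the specificity comparison above, the following minimal sketch computes the fraction of reads that overlap at least one captured target. It is not the pipeline used in this study: the function names, the half-open coordinate convention and the assumption that targets do not overlap each other are all hypothetical choices made for the example.

```python
from bisect import bisect_left
from collections import defaultdict


def build_index(targets):
    """Index half-open target intervals (chrom, start, end) by chromosome.

    Assumption for this sketch: targets on a chromosome do not overlap,
    so sorting by start also sorts by end.
    """
    grouped = defaultdict(list)
    for chrom, start, end in targets:
        grouped[chrom].append((start, end))
    index = {}
    for chrom, intervals in grouped.items():
        intervals.sort()
        index[chrom] = ([s for s, _ in intervals], [e for _, e in intervals])
    return index


def on_target_fraction(reads, targets):
    """Return the fraction of reads overlapping any target by at least 1 bp."""
    index = build_index(targets)
    on_target = 0
    for chrom, start, end in reads:
        starts, ends = index.get(chrom, ([], []))
        # Last target that starts before the read ends.
        i = bisect_left(starts, end) - 1
        if i >= 0 and ends[i] > start:
            on_target += 1
    return on_target / len(reads) if reads else 0.0
```

A lower off-target rate corresponds to a value closer to 1; in practice the read intervals would come from an alignment file and the targets from the capture design.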

Imprinted genes are regulated by CpG methylation through parental DMRs 1,14. Such hemi-methylated regions, which are expected to be methylated on one allele only and therefore show approximately 50% methylation, are either somatic or germinal 15. Here, we identified approximately 38,000 hemi-methylated CpGs per individual, clustered in at least 600 DMRs fulfilling stringent criteria and distributed across 123 of the 165 candidate regions (Fig. 2a-c). Interestingly, the IGF2-H19/KCNQ1-CDKN1C region, which carries a mutation affecting muscularity in pigs 16 and hosts some of the best-characterized Imprinting Control Regions (ICRs) in humans and mice 17, is the top candidate after scanning for GI methylation patterns. Two clusters with more than 100 hemi-methylated CpGs were detected in the region. The first is located upstream of the 5' UTR of H19 and the second upstream of the 5' UTR of KCNQ1OT1, which is not annotated in the pig reference genome (Fig. 2e-h).
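The scan for hemi-methylated CpGs and their grouping into candidate DMRs can be sketched as follows. This is a minimal illustration only: the methylation window (0.35 to 0.65), the minimum coverage, the maximum gap between neighbouring CpGs and the minimum cluster size are hypothetical placeholders, not the stringent criteria applied in this study, and the function names are invented for the example.

```python
def hemi_methylated(cpgs, low=0.35, high=0.65, min_cov=10):
    """Select CpGs whose methylation ratio is close to 50%.

    cpgs: iterable of (chrom, pos, methylated_reads, total_reads).
    The thresholds are illustrative assumptions, not the study's criteria.
    """
    return [
        (chrom, pos)
        for chrom, pos, meth, total in cpgs
        if total >= min_cov and low <= meth / total <= high
    ]


def cluster_sites(sites, max_gap=200, min_sites=5):
    """Group neighbouring sites into clusters (candidate DMRs).

    Consecutive sites on the same chromosome are merged while they lie
    within max_gap bp of each other; clusters with fewer than min_sites
    sites are discarded. Returns (chrom, first_pos, last_pos, n_sites).
    """
    clusters = []
    current = []

    def flush():
        if len(current) >= min_sites:
            clusters.append(
                (current[0][0], current[0][1], current[-1][1], len(current))
            )

    for chrom, pos in sorted(sites):
        if current and (chrom != current[-1][0] or pos - current[-1][1] > max_gap):
            flush()
            current = []
        current.append((chrom, pos))
    flush()
    return clusters
```

With per-CpG counts from a methylation caller, `cluster_sites(hemi_methylated(cpgs))` would yield candidate regions to intersect with the 165 captured loci; confirming parental origin still requires allele-level information, which this sketch does not model.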

Our strategy relies on next-generation sequencing technology that allows the detection of genotypes and CpG

Sixteen library preparations were carried out using an in-house combination of two protocols: NEB-Next Enzymatic

For c to k, the AG classical protocol is in green and the two TB protocols (TB1 and TB2) are in light and dark purple.