Author Correction: Massively parallel in vivo CRISPR screening identifies RNF20/40 as epigenetic regulators of cardiomyocyte maturation

Between birth and adulthood cardiomyocytes (CMs) undergo dramatic changes in size, ultrastructure, metabolism, and gene expression, in a process collectively referred to as CM maturation. The transcriptional network that coordinates CM maturation is poorly understood, creating a bottleneck for cardiac regenerative medicine. Forward genetic screens are a powerful, unbiased method to gain novel insights into transcriptional networks, yet this approach has rarely been used in vivo in mammals because of high resource demands. Here we utilized somatic mutagenesis to perform the first reported in vivo CRISPR genetic screen within a mammalian heart. We discovered and validated several novel transcriptional regulators of CM maturation. Among them were RNF20 and RNF40, which form a complex that monoubiquitinates H2B on lysine 120. Mechanistic studies indicated that this epigenetic mark controls dynamic changes in gene expression required for CM maturation. These insights into CM maturation will inform efforts in cardiac regenerative medicine. More broadly, our approach will enable unbiased forward genetics across mammalian organ systems.


INTRODUCTION
At birth, mammalian cardiomyocytes (CMs) undergo maturation, a dramatic and coordinated set of structural, metabolic, and gene expression changes that enable them to sustain billions of cycles of forceful contraction during postnatal life 1,2 . Fetal CMs are primarily glycolytic, mitotic, and contract against low resistance, whereas adult CMs rely on oxidative phosphorylation, are post-mitotic, and support heart growth by increasing in size. Sarcomeric and ultrastructural adaptations, such as plasma membrane invaginations known as T-tubules, facilitate synchronized and forceful CM contraction against high resistance. Unfortunately, CMs induced from stem cells or other non-myocyte sources resemble fetal CMs and lack the hallmark features of mature, adult CMs 3,4 . This "maturation bottleneck" remains a major barrier to using stem cell-derived CMs for disease modeling or therapeutic cardiac regeneration.
The regulatory mechanisms that govern the diverse facets of CM maturation are poorly understood, in large part due to the lack of a suitable in vitro model and challenges associated with in vivo approaches. Although mosaic gene manipulation strategies have allowed more precise interpretation of in vivo experiments with respect to the regulation of maturation [5][6][7] , the low throughput of standard in vivo approaches remains a major barrier. To overcome this obstacle, we sought to perform an in vivo forward genetic screen in mice. The resource intensity of traditional forward genetics has precluded their widespread use in mammals, but Cas9 mutagenesis directed by a library of guide RNAs (gRNAs) makes introduction and recovery of gene mutations highly efficient [8][9][10] , and can be expeditiously deployed mammals in vivo 5,11,12 . This capability has been used for forward genetic screens in cultured cells 10,13,14 , but its ability to interrogate endogenous biological processes in mammals in vivo has yet to be fully realized.

RESULTS: CRISPR loss-of-function screen identifies essential regulators of CM maturation
We developed an in vivo forward genetic screen to discover factors that regulate murine CM maturation (Fig. 1a). We employed CRISPR/Cas9 AAV9 (CASAAV) based somatic mutagenesis 5 and a gRNA library targeting murine transcriptional regulators to create thousands of distinct mutations within different CMs of a single mammalian heart. We screened these mutant CMs with a flow cytometry-based single cell assay of CM maturation. Sequencing of gRNAs from immature CMs compared to the input library identified gRNAs enriched or depleted in immature CMs, i.e. gRNAs that target genes that cell autonomously promote or antagonize CM maturation, respectively.
To separate individual CMs with an immature or mature phenotype, we took advantage of developmentally regulated sarcomere gene isoform switching, a hallmark of CM maturation 1,2 . In mouse, a prominent switch is from myosin heavy chain 7 (Myh7) expression in the fetal and neonatal periods to Myh6 in mature CMs. In Myh7 YFP mice 15 , YFP is fused to endogenous MYH7, such that YFP fluorescence is controlled by endogenous Myh7 regulatory elements 15 . We hypothesized that Myh7 YFP could be used as a single cell readout of CM maturation state. Examination of Myh7 YFP myocardium from neonatal and adult mice verified that neonatal CMs exhibited strong YFP fluorescence, whereas YFP was nearly completely silenced in adult CMs (Fig. 1b). Since mutation of both Gata4 and Gata6 impair CM maturation (Suppl. Fig. 1 and ref. 16 ), we next tested the effect of Gata4/Gata6 mutation on Myh7 YFP expression. We found that CASAAV-Gata4-Gata6, which expresses CM specific Cre and gRNAs targeting both Gata4 and Gata6 (Suppl. Fig. 1a), markedly increased the fraction of transduced CMs that expressed YFP from 11% to 86% (Fig. 1c). These data show that Gata4/6 mutation inhibits normal CM maturation, including maturational Myh7 silencing.
These 14671 gRNAs were synthesized as an oligonucleotide pool, cloned into the CM specific CASAAV vector, and packaged into AAV9 (see Methods). To introduce a positive control, a small amount of the CASAAV-Gata4-Gata6 vector was spiked into the AAV pool.
We subcutaneously injected the AAV library into 45 newborn R26 Cas9-GFP/+ ;Myh7 YFP/+ pups at a dose sufficient to transduce approximately 50% of the myocardium. At four weeks of age mice were sacrificed and CMs isolated. 15% of the isolated CMs from each heart were set aside as an unsorted input sample, while YFP + CMs were sorted from the remaining 85% via flow cytometry (Suppl. Fig. 2a).
CMs from three hearts were pooled for sorting, resulting in 15 input and 15 YFP + samples. Following RNA isolation, gRNAs were specifically reverse transcribed, converted to barcoded amplicon libraries, and sequenced to an average depth of 4.8M reads. 11 YFP+ and 14 input samples passed quality control (Suppl. Fig. 2b-f), and these samples separated into distinct clusters (Fig. 1d). 834 gRNAs showed significant enrichment within YFP + samples (adj. P < 0.001), with the Gata4 and Gata6 positive control gRNAs being among the most enriched (Fig. 1e). The seven human-targeting negative control gRNAs did not show enrichment (Fig. 1f). We used MaGeCK 19 to consolidate the six gRNA enrichment scores per gene into a single score. gRNAs targeting 123 genes were significantly enriched within YFP + CMs, while gRNAs targeting 148 genes were depleted (P < 0.05). The top ranked depleted gene was Myh7 (Fig. 1g), which was expected given that YFP fused to Myh7 was the screen readout. Among the enriched genes were thyroid hormone receptor alpha (Thra) and nucleolin (Ncl), which are established regulators of maturation 20,21 , as well as many novel candidates ( Fig. 1h; Suppl. Table 2), including both Rnf20 and Rnf40, which encode components of an epigenetic complex that monoubiquitinates histone 2B 22,23 .
To validate the top 10 most enriched candidates (Fig. 2a), the most highly enriched gRNA for each factor was cloned into a CASAAV vector and used to individually deplete each factor at birth. To focus on cell autonomous effects and avoid secondary effects related to organ dysfunction 5-7 , we used an AAV dose that transduced a small fraction of CMs. Among mosaic depleted cells, we observed robust upregulation of Myh7 YFP for seven of the ten candidates: Rnf40, Rnf20, Taf3, Taf2, Thra, Zbtb7a and Ncl (Fig. 2a,b; Suppl. Fig. 3a). CASAAV vectors targeting the remaining three candidates, Poc1b, Setd6, and Eif3i, caused little YFP activation (Fig. 2a,b, Suppl. Fig. 3a). Consequently, we focused subsequent analyses on the seven target genes that resulted in robust upregulation of Myh7 YFP .
We analyzed the effect of individual CASAAV vectors on CM nucleation, size, and T-tubulation, additional hallmarks of maturation. Immature murine CMs are mononuclear and become predominantly binuclear during maturation 24 , and this polyploidization has been implicated in the reduced regenerative potential of mature CMs [25][26][27] . CMs markedly increase in cell size between birth and adulthood 28 , as Zbtb7a increased mononucleation and impaired T-tubulation, but did not influence maturational hypertrophy. These results validated 7 of the 10 candidates as regulators of multiple facets of CM maturation, and also demonstrate that the overall maturational program can be separated into independently regulated, dissociable sub-programs.

RNF20/40 depleted CMs display morphological and transcriptional characteristics of immaturity
Rnf20 and Rnf40 were two of the most enriched genes in the screen, and their depletion broadly impaired CM maturation. These genes are E3 ubiquitin ligases that together form a complex that monoubiquitinates histone 2B at lysine 120 (H2Bub1) 30,31 . Human genetic studies have implicated de novo Rnf20/Rnf40 mutations in congenital heart disease 32,33 . The postnatal cardiac functions of these genes have not been studied. For these reasons, we investigated the mechanisms by which RNF20/40 regulate CM maturation. We created a single CASAAV vector (CASAAV-RNF20/40) containing gRNAs that target both factors and validated that it depleted RNF20/40 and H2Bub1 ( Fig. 3a,b). When given at a dose that transduced most CMs, CASAAV-RNF20/40 caused cardiac dysfunction and death (    (Fig. 4b), and differential gene expression analysis revealed approximately 1400 upregulated and 1100 downregulated genes (Padj < 0.05, Fig. 4c, and Suppl. Table 3). The ratio of fetal (Tnni1) to mature (Tnni3) troponin I isoforms, a molecular signature of CM maturational state 34 , was increased by 165-fold in RNF20/40 depleted CMs (P < 0.001), with Tnni1 being the most upregulated gene in the dataset. To assess the genome wide impact of RNF20/40 depletion we used RNA-sequencing from neonatal and adult isolated wildtype CMs to construct custom gene sets consisting of the 100 most neonatal specific, and 100 most adult specific genes. We then ranked all genes by their level of differential expression in the RNF20/40 depleted CMs, and used Gene Set Enrichment Analysis (GSEA) 35 to calculate an enrichment score for each custom gene set within this ranked list. The genes downregulated in RNF20/40 depleted CMs were highly enriched for adult-specific genes (NES = -1.82; Fig. 4d), while the genes upregulated in the depleted CMs were enriched for neonatalspecific genes (NES = 1.50; Fig. 4e). These data confirm that RNF20/40 depleted cells fail to activate the transcriptional network of mature CMs, but instead persistently express genes associated with immaturity.
We next analyzed the RNA-seq data to identify biological processes enriched within the RNF20/40 differentially expressed genes. Many metabolism-related gene sets were enriched within the downregulated genes (Fig. 4f, Suppl. Table 5), including Fatty Acid Metabolism and PPAR Signaling, two pathways strongly associated with CM maturation 36,37 . Upregulated genes were enriched for a greater diversity of biological processes, with spliceosome-associated genes being most enriched (Fig.   4f).

H2Bub1 deposition directly regulates transcriptional maturation
The RNF20/40 complex is an E3 ubiquitin ligase that monoubiquitinates H2B on lysine 120 (H2Bub1). To determine how H2Bub1 deposition by RNF20/40 promotes normal maturation of CM gene expression, we used chromatin immunoprecipitation followed by next generation sequencing (ChIP-seq) to determine the genomic distribution of H2Bub1 at neonatal (P1) and mature stages (P28) in mouse heart apex tissue. Consistent with prior reports 38 , H2Bub1 predominantly occupied gene bodies, with greater density towards promoters. Thus, we quantified the amount of H2Bub1 deposition at each gene and observed reproducible signal above input at approximately 11,400 genes in the neonatal stage and 11,800 genes in the adult stage. Of these genes, the vast majority (>88%) were shared between timepoints (Fig. 5a), with a small number of unique regions (Fig. 5b,    consistent with the reported association of H2Bub1 with gene activation (Fig. 5d). Strikingly, genes with greater H2Bub1 were less likely to substantially change expression between stages (Fig. 5e).
This stabilizing effect of H2Bub1 was also observed when comparing RNF20/40 depleted CMs to negative controls (Fig. 5f), suggesting that high H2Bub1 induces additional epigenetic mechanisms that limit changes in gene expression.
Next we used GSEA to identify biological processes containing genes preferentially marked by H2Bub1. Several functional terms were associated with strong H2Bub1 signal at the neonatal stage (Nom. Pval < 0.001, NES > 1.75), with Ribosome and Oxidative Phosphorylation being the top two (Fig   5g,h). While most genes highly marked by H2Bub1 have a stable expression profile during maturation ( Fig. 5e), we observed that genes within several of the identified functional groups are dynamically expressed during maturation, and differentially expressed in RNF20/40 CMs (Fig. 5i). This included Oxidative Phosphorylation and TCA Cycle gene sets, suggesting that H2Bub1 deposition is directly required for activation of the adult metabolic gene profile.

DISCUSSION
In this study we developed a resource efficient in vivo forward genetic screen for transcriptional regulators of CM maturation. Our strategy uses a massively parallel approach in which the screening unit is the individual cell, such that thousands of individual genetic mutations can be screened using a small number of animals. Among the many novel genes that our screen identified as essential for CM maturation were Rnf20 and Rnf40, which interact to form a complex that mono-ubiquitinates histone-2B. Depletion of Rnf20/40 broadly impaired CM maturation by disrupting H2Bub1 deposition, which prevented normal transcriptional changes essential for maturation. While Rnf20/40 regulate diverse gene sets, one common theme in our studies was regulation of the metabolic remodeling that takes place during CM maturation. GSEA identified lysine degradation and fatty acid metabolism as the two genesets most downregulated in RNF20/40 depleted CMs, while an independent GSEA based on H2Bub1 deposition levels identified oxidative phosphorylation and TCA cycle gene sets as preferentially marked.
Mutations in pathways that regulate H2Bub1, including Rnf20 and Rnf40, have been reported in humans with congenital heart defects (CHD) 32,33 and were recently reported to influence cardiac looping via regulation of motile cilia during early embryonic development 32 . Our results extend the function of Rnf20/40 into the postnatal heart, suggesting that mutations in these genes could both cause a structural heart defect and contribute to cardiac dysfunction after surgical repair. CHD survivors have a greatly increased risk of cardiovascular disease later in life 39 , with outcomes being diverse and difficult to predict. As there is substantial diversity in the range of CHD genes, additional research will be necessary to characterize the role of these factors in CM maturation, to discover if functional declines are due to cell autonomous maturational defects, and to inform innovations in intervention.

MATERIALS AND METHODS: Library Design and Construction
A list of 3000+ potential transcription factors and epigenetic regulators was compiled using publicly available data from the Riken Transcription Factor Database 40 Table 6 This library pool was packaged into AAV9 via PEI transfection of ten 15cm plates of HEK293T cells, and titered by qPCR targeting the TnT promoter, as previously described 5,42 .

NGS Library Preparation
Reverse transcription of gRNAs and adapter addition was achieved by using a custom protocol with the Clontech SMART-Seq v4 Ultra Low Input RNA sequencing kit (634894). Approximately 30ng of RNA from YFP + CMs or 500ng of RNA from Input CMs was reverse transcribed as directed in the manufacture protocol except that SMART-Seq CDS Primer IIA was replaced with a gRNA scaffold specific primer (Supplementary Table 7). This reverse transcription step also utilizes a template switching oligo to add an adapter of known sequence to the variable 5' end of the gRNA 43 . cDNA was then amplified in two sequential rounds of PCR to add NGS sequencing adapters. In the first round of amplification the full length forward read adapter was added to the 5' end of the gRNA and a half adapter was added to the 3' end (Supplementary Table 7). NEB Phusion (M0530L) was used for 5 cycles of amplification, according to standard manufacturer protocol. First round PCR product was purified via Zymo DNA Clean and Concentrator column, eluted in 25ul of H2O, and 5ul used as input in a 20 cycle second round of amplification, which completed the reverse adapter and added a sample specific Illumina TruSeq multiplexing index. The resulting 220bp amplicon was purified via Invitrogen SizeSelect 2% gel. The concentration of each sample was assessed using the KAPA Library Quantitation Kit (KR0405), allowing samples to be evenly pooled and submitted for single end 75bp sequencing on a NextSeq500. An average sequencing depth of 4.8 million reads per sample was achieved.

Screen NGS Analysis
gRNA NGS libraries were trimmed to remove adapters and the gRNA scaffold, leaving only the 20bp variable region. Bowtie2 was used to align trimmed sequence files to mm10. Counts for each gRNA were acquired by quantifying sequence coverage of genomic regions corresponding to the gRNA library via Bedtools Coverage. Differential expression of individual gRNAs in YFP + versus Input samples was calculated by using gRNA counts as input for DESeq2 44 . Differential gene representation in YFP + versus Input samples was calculated using median normalized gRNA counts as input for the MAGeCK software package 19 , which consolidates scores for multiple individual gRNAs targeting the same gene into a single gene level enrichment score. Five samples were removed prior to MAGeCK analysis due to insufficient enrichment of control gRNAs, improper clustering, or poor library coverage (Suppl. Fig. 2).

In Situ T-tubule Imaging
CASAAV virus targeting Rnf20 and Rnf40 for double depletion was injected subcutaneously at P1 into R26 Cas9/+ ;Myh7 YFP/+ pups. Animals were sacrificed at P28 and hearts were perfused with the Ttubule binding dye FM 4-64 (Thermo, T3166) at 5uM for 20 minutes, followed by imaging of on an Olympus FV3000R confocal microscope, as previously described 5 . Organization and abundance of transverse and longitudinal T-tubule elements was quantified using AutoTT software 45 . In total 79 YFPand 69 YFP + CMs, originating from three mice, were quantified.

Immunostaining
Immunostaining was conducted as previously described 42 . Briefly, freshly isolated adult CMs were cultured on laminin coated (2ug/cm 2 , Life Technologies, 23017015) 12mm glass coverslips in a 24-well dish with DMEM plus 5% FBS and 10uM Blebbistatin (EMD Millipore, 203390) at 37°C. After allowing 30 minutes for cells to adhere to the laminin, CMs were fixed with 4% PFA for 10 minutes at room temperature, followed by permeabilization with PBST (PBS + 0.1% Triton) for 10 minutes at room temperature. Cells were then blocked with 4% BSA/PBS for 1hr, and incubated with primary antibody (Supplementary Table 8

Echocardiography
Echocardiography was performed on a VisualSonics Vevo 2100 machine with the Vevostrain software. Animals were awake during this procedure and held in a standard handgrip. The echocardiographer was blinded to genotype and treatment.

RNA-sequencing
One day old R26 fsCas9-2A-GFP/+ ;Myh7 YFP/+ pups were subcutaneously injected with 50ul of CASAAV, containing either the most enriched Rnf20 targeting gRNA and the most enriched Rnf40 targeting gRNA in a single vector (CASAAV-Rnf20-Rnf40), or a control vector containing Cre but no gRNAs. Vectors were injected at a concentration of 1x10 11 vp/ml, which was sufficient for ~20%  Table 5). For construction of custom gene sets RNA-sequencing data from P0 and P7 isolated mouse CMs (GEO -GSE124008) was averaged to get a "neonatal" RPKM expression value, and data from 4 week and 6 weeks of age averaged to get an "adult" value. Genes were then ranked based on the adult to neonatal ratio, and the highest 100 genes selected for an "adult specific gene set" and the lowest 100 for a "neonatal specific gene set" (Suppl. Table 4).  vector. AAV genome encoded CM specific Cre and gRNAs that target Gata4 and Gata6. We administered CASAAV-Gata4-Gata6 at a low dose to postnatal day 1 (P1) mouse pups carrying Cre-activatable Cas9-2A-GFP and Myh7 YFP (R26 Cas9-GFP; Myh7 YFP ). b, GATA4 and GATA6 depletion efficiency. We fixed hearts at P10 for GATA4 and GATA6 immunostaining. Representative section stained for GATA4 is shown. GFP marked transduced cells. c, Quantification of GATA4 and GATA6 depletion. Greater than 90% and 60% of the transduced CMs lost GATA4 and GATA6 immunoreactivity, respectfully. d-g, Quantification of size and dimensions in isolated P28 CMs, when maturation is normally largely complete. h, Imaging of the T-tubule network by FM4-64 staining showed dramatic defects. i, Representative image of α-actinin (ACTN2) immunostaining. j, Quantification of sarcomere organization. k, Quantification of sarcomere spacing.

Excluded Sample
Reason for Exclusion