Limits in the detection of m6A changes using MeRIP/m6A-seq

McIntyre, Alexa B. R.; Gokhale, Nandan S.; Cerchietti, Leandro; Jaffrey, Samie R.; Horner, Stacy M.; Mason, Christopher E.

doi:10.1038/s41598-020-63355-3

Published: 20 April 2020

Scientific Reports volume 10, Article number: 6590 (2020)

Abstract

Many cellular mRNAs contain the modified base m⁶A, and recent studies have suggested that various stimuli can lead to changes in m⁶A. The most common method to map m⁶A and to predict changes in m⁶A between conditions is methylated RNA immunoprecipitation sequencing (MeRIP-seq), through which methylated regions are detected as peaks in transcript coverage from immunoprecipitated RNA relative to input RNA. Here, we generated replicate controls and reanalyzed published MeRIP-seq data to estimate reproducibility across experiments. We found that m⁶A peak overlap in mRNAs varies from ~30 to 60% between studies, even in the same cell type. We then assessed statistical methods to detect changes in m⁶A peaks as distinct from changes in gene expression. However, from these published data sets, we detected few changes under most conditions and were unable to detect consistent changes across studies of similar stimuli. Overall, our work identifies limits to MeRIP-seq reproducibility in the detection both of peaks and of peak changes and proposes improved approaches for analysis of peak changes.

Introduction

Methylation at the N6 position in adenosine (m⁶A) is the most common internal modification in eukaryotic mRNA. A methyltransferase complex composed of METTL3, METTL14, WTAP, VIRMA, and other cofactors catalyzes methylation at DRACH/DRAC motifs, primarily in the last exon^1,2. Most m⁶A methylation occurs during transcription³. The modification then affects mRNA metabolism through recognition by RNA-binding proteins that regulate processes including translation and mRNA degradation^4,5,6,7,8,9. However, whether m⁶A is lost and gained in response to various cellular changes remains contentious^{3,10,11,12,13,14,15}. To assess the evidence for proposed dynamic changes in m⁶A, a reliable and reproducible method to detect changes in methylation as distinct from changes in gene expression is necessary.

The first and most widely-used method to enable transcriptome-wide studies of m⁶A, MeRIP-seq or m⁶A-seq, involves the immunoprecipitation of m⁶A-modified RNA fragments followed by peak detection through comparison to background gene coverage^16,17. A second method was developed in 2015, miCLIP or m⁶A-CLIP, which involves crosslinking at the site of antibody binding to induce mutations during reverse transcription for single-nucleotide detection of methylated bases^2,18. MeRIP-seq is still more often used than miCLIP, despite less precise localization of m⁶A to peak regions of approximately 50–200 base pairs that can contain multiple DRAC motifs, since it follows a simpler protocol, requires less starting material, and generally produces higher coverage of more transcripts. Antibodies for m⁶A can also detect a second base modification, N⁶,2′-O-dimethyladenosine (m⁶A_m), found at a lower abundance than m⁶A and located at the 5′ ends of select transcripts^15,18. We thus refer to the base modifications detected through MeRIP-seq collectively as m⁶A_(m), although most are likely m⁶A. As of late 2018, over fifty studies used MeRIP-seq to detect m⁶A_(m) in mammalian mRNA (Supplementary Table 1).

Although MeRIP-seq can reveal approximate sites of m⁶A_(m), it cannot be used to quantitatively measure the fraction of transcript copies that are methylated¹⁹. Studies of m⁶A variation in response to stimuli instead estimate differences at individual loci through changes in peak presence or peak height. Using these approaches, studies have reported changes to m⁶A with heat shock, microRNA expression, transcription factor expression, cancer, oxidative stress, human immunodeficiency virus (HIV) infection, Kaposi’s sarcoma herpesvirus (KSHV) infection, and Zika virus infection, including hundreds to thousands of changes in enrichment at specific sites^{20,21,22,23,24,25,26,27,28,29}. Statistical approaches to analysis have only recently been published and there have been no comprehensive evaluations of methods to detect changes in m⁶A based on MeRIP-seq data^30,31. Thus, while studies have suggested that m⁶A shows widespread changes in response to diverse stimuli, they have applied inconsistent analysis methods to detect changes in m⁶A and often don’t control for differences in RNA expression between conditions or typical variability in peak heights between replicates. In some cases, these studies have reported m⁶A changes based on simple differences in peak count^24,26,27,32. However, others have applied statistical tests or thresholds for differences in immunoprecipitated (IP) over input fraction enrichment and visual analysis of coverage plots, and have reported fewer m⁶A changes or suggested that m⁶A is a relatively stable mark^33,34. As in RNA-seq, there is noise in MeRIP-seq, and multiple replicates are therefore necessary to estimate variance and statistically identify the effects of experimental intervention^35,36,37. To date, only one MeRIP-seq study has used more than three replicates per condition³⁴, while ten have used only one^{17,20,32,33,38,39,40,41,42,43}, suggesting that most studies may not have enough power to detect changes in m⁶A_(m).

To re-evaluate the evidence for m⁶A_(m) changes under various conditions, we first examined the variability in m⁶A_(m) detection across replicates, cell lines, and experiments using our own negative controls (12 replicates) as well as 24 published MeRIP-seq data sets. We then compared statistical methods to detect differences in IP enrichment using biological negative and positive controls for m⁶A changes. We found that these methods are limited by noise, including biological variability from changes in RNA expression and technical variability from immunoprecipitation and sequencing that limits reproducibility across studies. Our results suggest that the scale of statistically detectable m⁶A_(m) changes in response to various stimuli is orders of magnitude lower than the scale of changes reported in many studies. However, we also found that statistical detection could miss the majority of changed sites when using only 2–3 replicates. We use our results to propose approaches to MeRIP-seq experimental design and analysis to improve reproducibility and more accurately measure differential regulation of m⁶A_(m) in response to stimuli. These data and analyses emphasize the need for further research and alternative assays, for example recently developed endoribonuclease-based sequencing methods^44,45 or direct RNA nanopore sequencing⁴⁶, to resolve the extent to which m⁶A changes in response to specific conditions.

Results

Detection of peaks across replicates, experiments, and cell types

The first steps in MeRIP-seq data analysis are to align sequencing reads to the genome or transcriptome of origin and to identify peaks in transcript coverage in the IP fraction relative to the input control. Several methods have been developed for MeRIP-seq peak detection, including exomePeak, MeTPeak, MeTDiff, and bespoke scripts. Another method often used for MeRIP-seq peak detection is MACS2, which was originally designed to detect protein binding sites in DNA from chromatin immunoprecipitation sequencing (ChIP-seq). We compared m⁶A_(m) peak detection by exomePeak, MeTPeak, MeTDiff, and MACS2^31,47,48,49 in seven replicates of MeRIP-seq data obtained from mouse cortices under basal conditions³⁴, and in 12 replicates of MeRIP-seq data we generated from human liver Huh7 cells⁵⁰. The intersect between all tools tested was high, and we saw minimal differences in DRAC motif enrichment, which we use to provide an estimate of tool precision in the absence of true positive m⁶A sites (Supplementary Fig. 1a). In addition, we assessed the METTL3/METTL14-dependence of specific peaks identified by single tools using MeRIP-RT-qPCR. We found that of these peaks, 4/4 from MACS2, 5/5 from MeTPeak, and 4/5 from MeTDiff showed decreased m⁶A_(m) enrichment following METTL3/METTL14 depletion, suggesting that these are true m⁶A sites. By comparison, only 1/5 of the peaks uniquely called by exomePeak showed statistically significant decreases (p < 0.05), although replicate variance was high and 4/5 showed a downward trend (Supplementary Fig. 1b). Since MACS2 was the most commonly used tool for peak calling and was previously found to perform well in comparison with a graphical user interface tool and several other peak callers⁵¹, we used MACS2 for the remainder of our analyses. Repeating the analyses shown in Figs. 2, 3, 4 using the MeTDiff peak caller instead of MACS2 did not affect any of our conclusions (Supplementary Figs. 2–4).

For m⁶A_(m) peak detection, a transcript must be sufficiently expressed for enrichment by the m⁶A_(m) antibody and for adequate sequencing coverage in both the IP and input fractions. Previous reports have suggested that m⁶A_(m) presence does not decrease with lower mRNA expression level, and, if anything, is higher in mRNAs with lower expression as methylated transcripts tend to be less stable^9,38. Peak callers, however, identify fewer peaks in genes at low expression, which we therefore assume reflects inadequate coverage for peak calling. To estimate the level of coverage necessary for peak detection, we analyzed the percent of genes with at least one, two, or three peaks relative to mean input transcript coverage in both the mouse cortex and Huh7 cell data (Fig. 1a). Based on the upper shoulders of the sigmoidal curves as the percent of genes with peaks begins to plateau, we estimate that mean gene coverage of approximately 10–50X is necessary to avoid missing peaks based on insufficient coverage. Including a wider array of samples in this analysis likewise showed an increase in the percent of transcripts with ≥1 peak as coverage rose to 10X (Supplementary Fig. 1c). Our analysis of the input RNA-seq coverage of peak regions alone again supported a similar threshold; few peaks are detected with median input read counts below 10 across replicates (Supplementary Fig. 1d). These thresholds do not mean that peaks in genes with mean coverage <10X or peaks with fewer than 10 input reads are false positives, but that the likelihood of false negatives rises with lower coverage (Supplementary Fig. 1e).
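The coverage analysis described above can be illustrated with a minimal sketch. It bins genes by mean input coverage and reports the percent of genes in each bin with at least a given number of peaks. The function name, the input layout (a list of coverage and peak-count pairs), the bin edges, and the toy values are all assumptions made for illustration and are not taken from the study's code or data.

```python
from bisect import bisect_right

def percent_with_peaks(genes, bin_edges, min_peaks=1):
    """Percent of genes with >= min_peaks peaks in each coverage bin.

    genes: list of (mean_input_coverage, n_peaks) tuples (hypothetical layout).
    bin_edges: sorted coverage cut points; a gene with coverage c falls in
    bin i where i is the number of edges <= c.
    Returns one value per bin, or None for bins that contain no genes.
    """
    totals = [0] * (len(bin_edges) + 1)
    hits = [0] * (len(bin_edges) + 1)
    for coverage, n_peaks in genes:
        b = bisect_right(bin_edges, coverage)
        totals[b] += 1
        if n_peaks >= min_peaks:
            hits[b] += 1
    return [100.0 * h / t if t else None for h, t in zip(hits, totals)]

# Toy values, invented for illustration only.
toy_genes = [(0.5, 0), (2, 0), (5, 1), (8, 0), (12, 1), (20, 0), (30, 2), (60, 3)]
toy_percents = percent_with_peaks(toy_genes, bin_edges=[1, 10, 50])
```

Plotting such percentages against finer coverage bins would give a curve of the kind described for Fig. 1a, from which a plateau could be read off by eye.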

To evaluate the reproducibility of MeRIP-seq data, we next examined the consistency of m⁶A_(m) peak calling between replicates. Previous studies have reported that peak overlap between replicates is approximately 80%^9,16,52,53. Similarly, we found that between two replicates, log₂ fold enrichment of IP over input reads at detected peaks showed a Pearson correlation of approximately 0.81 to 0.86 (Supplementary Fig. 1f). A single sample captured a median of 78% of the peaks found in seven replicates of mouse cortex data and 66% of peaks found in twelve replicates of Huh7 cell data. The number of detected peaks increased log-linearly with the addition of more replicates, such that with three replicates, 84–92% of the peaks found with 7–12 replicates were detected (Fig. 1b). Conversely, the number of peaks in common across replicates decreased as the number of replicates increased, such that while ~80% of peaks were detected in at least two replicates, only ~60% were detected in six replicates for both data sets and ~25% in all twelve replicates of Huh7 cell data (Fig. 1c). Detection of peaks in more replicates did not increase DRAC motif enrichment (Supplementary Fig. 1g). These results suggest that many m⁶A_(m) sites may be missed in studies that use one to three replicates, and that increasing replicates could enable detection of more peaks. However, not all peaks correspond to true m⁶A_(m) sites. A recent comparison to data from an endoribonuclease-based method for m⁶A detection suggested MeRIP-seq has a false positive rate of ~11%, although this would differ by study and detection threshold^3,45.
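One way to make the replicate comparison concrete is sketched below. It merges overlapping peaks from all replicates into regions and then asks what fraction of regions is supported by at least k replicates. Merging by any overlap is an assumption chosen for simplicity; the text does not specify the overlap criterion used, and the function names and toy coordinates are hypothetical.

```python
from collections import defaultdict

def merge_peaks(replicates):
    """Merge overlapping peaks across replicates.

    replicates: list with one entry per replicate, each a list of
    (transcript, start, end) half-open intervals (hypothetical layout).
    Returns a list of (transcript, start, end, replicate_index_set).
    """
    by_transcript = defaultdict(list)
    for rep_idx, peaks in enumerate(replicates):
        for transcript, start, end in peaks:
            by_transcript[transcript].append((start, end, rep_idx))
    merged = []
    for transcript, items in by_transcript.items():
        items.sort()
        cur_start, cur_end, reps = items[0][0], items[0][1], {items[0][2]}
        for start, end, rep_idx in items[1:]:
            if start < cur_end:  # overlaps the region being built
                cur_end = max(cur_end, end)
                reps.add(rep_idx)
            else:
                merged.append((transcript, cur_start, cur_end, reps))
                cur_start, cur_end, reps = start, end, {rep_idx}
        merged.append((transcript, cur_start, cur_end, reps))
    return merged

def fraction_in_at_least(merged, k):
    """Fraction of merged regions supported by at least k replicates."""
    return sum(1 for *_, reps in merged if len(reps) >= k) / len(merged)

# Toy peaks for three replicates, invented for illustration only.
toy_replicates = [
    [("g1", 100, 200), ("g2", 50, 150)],
    [("g1", 150, 250)],
    [("g1", 120, 180), ("g3", 0, 100)],
]
toy_regions = merge_peaks(toy_replicates)
```

Calling `fraction_in_at_least(toy_regions, k)` for increasing k traces out the kind of decline in shared peaks described for Fig. 1c.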

The number of peaks detected across studies varies. Given that coverage affects peak detection, we hypothesized that variation in sequencing depth could contribute to differences in peak count. Zeng et al. (2018) reported that peak count begins to saturate by around 20 million reads by subsampling data within individual studies⁴². However, we found that there is no positive correlation between peak count and input or IP sequencing depth across data sets from different published studies, each of which had 3–81 M reads per replicate (input Pearson’s R = −0.37, p = 0.015; IP Pearson’s R = −0.17, p = 0.28) (Supplementary Table 2, Supplementary Figure 5a,b). This implies that experimental factors beyond sequencing depth contribute to the variability of peak counts across studies.

We next analyzed the overlap of peaks among studies and found inconsistency in peak localization on transcripts as well. Within four commonly used cell types, the percent of peaks detected in one experiment that were also detected in a second varied among pairs of studies from as low as 2% of peaks to as high as 90% (median = 45%), after filtering for transcripts expressed above a mean of 10X input coverage in both to ensure sufficient expression for peak detection (Fig. 2a). In fact, peaks showed higher overlap within different cell types from the same study than within the same cell type from different studies, suggesting that MeRIP-seq data is prone to strong batch effects (Fig. 2b). While this could be due to differences among experimental protocols used (summarized in Supplementary Table 2), we were unable to identify such a link. Overall, most percent overlaps of m⁶A_(m) peaks fell between ~30% (1st quartile) and ~60% (3rd quartile) (Fig. 2b). With rare exceptions (e.g. that described by Ke et al., 2017 in their Supplementary Fig. 8)³, most MeRIP-seq data sets do show enrichment of the m⁶A motif DRAC. These results indicate, however, that multiple labs running MeRIP-seq on the same cell type will detect different subsets of m⁶A_(m) sites. Possible contributing factors in the differences among studies include cell state (e.g. different stages of the cell cycle), experimental conditions, and sequencing depth. Despite predictions that tissue or cell type would be a large factor in differences among samples, though, peaks detected in different tissues analyzed in a single experiment showed high overlap and little clustering by tissue type (Fig. 2c)⁵⁴. This suggests that although there is evidence that m⁶A levels vary by tissue¹⁹, modified sites are consistent.

Detection of changes in peaks between conditions

Following m⁶A_(m) peak detection, many studies compare the expression of peaks between two conditions to predict peak changes. While looking at plots of IP and input gene coverage under different conditions can help evaluate the evidence for these changes³³, statistical or heuristic methods are first necessary to narrow down a list of candidate sites to plot. Several tools used for statistical analysis by the studies in Supplementary Table 1 or for other types of RNA IP sequencing assays model peak counts using either (a) the Poisson distribution, in which the variance of a measure (here, read counts) is assumed to be equal to the mean (MeTDiff), or (b) the negative binomial distribution, in which a second parameter allows for independent adjustment of mean and variance (QNB and two implementations of a generalized linear model approach using DESeq2 or edgeR, Table 1)^{30,31,55,56,57}. In the mouse cortex and Huh7 cell data, we found that, similar to RNA-seq data^24,56,58, the variance in read counts under peaks exceeded their mean, indicative of overdispersion (Supplementary Fig. 6a). The log likelihood (the log of the probability of an observation given a distribution with known parameters) for our sample also fell within the distribution of expected log likelihoods for the negative binomial distribution (bottom) but not the Poisson distribution (top) (Fig. 3a). Thus, the negative binomial distribution captures the mean-variance relationship in MeRIP-seq data, suggesting that tools that account for overdispersion better model the distribution of read counts at m⁶A_(m) peaks than tools that do not.
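The contrast between the two count models can be sketched for a single peak as follows. The example compares the Poisson log likelihood with a negative binomial log likelihood whose dispersion is set by a simple method-of-moments estimate. That estimator is a stand-in chosen for brevity and is not the procedure behind Fig. 3a; the function names and the read counts are invented for illustration.

```python
import math
from statistics import mean, variance

def poisson_loglik(counts, mu):
    """Log likelihood of counts under a Poisson with mean mu."""
    return sum(k * math.log(mu) - mu - math.lgamma(k + 1) for k in counts)

def negbin_loglik(counts, mu, size):
    """Log likelihood under a negative binomial with mean mu and
    dispersion parameter size, where variance = mu + mu**2 / size."""
    total = 0.0
    for k in counts:
        total += (math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
                  + size * math.log(size / (size + mu))
                  + k * math.log(mu / (size + mu)))
    return total

def moment_size(counts):
    """Method-of-moments estimate of size; None if there is no overdispersion."""
    m, v = mean(counts), variance(counts)
    if v <= m:
        return None
    return m * m / (v - m)

# Hypothetical read counts under one peak across seven replicates.
toy_counts = [12, 30, 7, 55, 21, 90, 15]
toy_mu = mean(toy_counts)
toy_size = moment_size(toy_counts)
toy_pois = poisson_loglik(toy_counts, toy_mu)
toy_nb = negbin_loglik(toy_counts, toy_mu, toy_size)
```

For counts whose sample variance greatly exceeds their mean, as in this toy vector, the negative binomial log likelihood is the higher of the two, which is the qualitative pattern the paragraph above describes.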

Table 1 Statistical methods for the detection of peak changes.

We next defined positive and negative controls to evaluate tool performance for detection of changes in m⁶A_(m) peaks. Past publications describing new methods to detect m⁶A_(m) peak changes have used data sets in which methylation machinery genes or the methyl donor were disrupted compared to baseline conditions as positive controls, and have simulated negative controls by randomly swapping labels in the positive controls^30,31. However, swapping labels for conditions that may feature differences in gene expression in addition to differences in m⁶A levels could unrealistically increase variance in read counts within groups. Therefore, we instead used the two data sets from mouse cortex and Huh7 cells, which each comprised many replicates at baseline conditions (n = 7 and n = 12, respectively), as negative controls. We randomly divided the mouse cortex data into two groups of three to four replicates for comparison and divided the Huh7 replicates by lab of incubation, which did not affect sample clustering (Supplementary Fig. 6b). We would expect to see minimal changes in IP enrichment at m⁶A peaks between groups for our negative controls, whereas our positive controls, which featured genetic or chemical interference with the m⁶A machinery, should show discernible differences in peaks when compared to baseline or wildtype conditions in the same cell lines (summarized in Supplementary Table 3). Indeed, the absolute difference in log₂ fold change between peaks and genes was centered around 0 for the negative controls and showed small shifts that varied in magnitude and direction for the positive controls (Supplementary Fig. 6c).

Using statistical methods to detect changes in peak enrichment, we found that the percent of changes called below a p-value threshold of 0.05 were similar in the positive and negative controls (Fig. 3b). With all tools except MeTDiff, a knockout of Mettl3 showed the largest effects on m⁶A⁵⁹, while fewer significant peaks in other positive controls suggested variable effects of the positive control conditions on m⁶A_(m), possibly related to efficiency of the methylation machinery knockdown or overexpression^{7,33,60,61,62,63,64}. In the absence of true differences between groups, p-value distributions should be uniform for well-calibrated statistical tests, meaning that ~5% of peaks should have p-values <0.05 for the negative controls. MeTDiff reported an excess number of sites with p-values below 0.05 (Supplementary Fig. 6d) and identified a higher percentage of sites as differentially methylated in the mouse cortex negative control data set than in all but two positive controls (Fig. 3b). On the other hand, the generalized linear models (GLMs) and QNB showed uniform to conservatively shifted p-value distributions, with differences between the mouse cortex and Huh7 data sets (Supplementary Fig. 6d), suggesting that these tools detect fewer false positives.
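A simple calibration check of the kind implied here can be sketched as follows: for a negative control, compute the fraction of p-values below 0.05 and the largest gap between the empirical distribution of p-values and the uniform distribution. Both helper names and the synthetic p-values are assumptions for illustration; the study's own comparison is the one shown in Supplementary Fig. 6d.

```python
def fraction_below(pvalues, alpha=0.05):
    """Fraction of p-values strictly below alpha."""
    return sum(p < alpha for p in pvalues) / len(pvalues)

def ks_uniform_statistic(pvalues):
    """Kolmogorov-Smirnov distance between the empirical CDF of the
    p-values and the Uniform(0, 1) CDF."""
    xs = sorted(pvalues)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))

# Synthetic examples: an evenly spaced grid stands in for a well-calibrated
# test, and squaring it mimics a test that reports too many small p-values.
calibrated = [(i + 0.5) / 100 for i in range(100)]
inflated = [p ** 2 for p in calibrated]

calibrated_rate = fraction_below(calibrated)
inflated_rate = fraction_below(inflated)
```

A negative-control rate well above 0.05, or a large distance from uniform, would point to excess false positives.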

To ensure significant peak changes detected by each of the tools reflected changes in IP enrichment independent of differential gene expression, we measured the correlation between changes in IP read counts at peak sites and changes in input read counts across their encompassing genes. For significant peaks (FDR-adjusted p-value <0.05) from the positive controls, correlation between log₂ fold change in peak IP and gene input read counts was low for the GLMs and QNB (Pearson’s R = 0.10 to 0.22) but reached 0.55 (p = 5.8E-87) for MeTDiff (Fig. 3c). The higher correlation for MeTDiff was driven by peaks with proportional changes in IP and input levels, which suggests that MeTDiff often detects differential expression of methylated genes rather than differential methylation. Therefore, published studies that have used MeTDiff may actually be detecting differential expression and not differential methylation^22,65. Indeed, plotting coverage for genes reported as differentially methylated in one of these studies, with the y-axis scaled separately per condition, confirmed that changes in m⁶A identified by MeTDiff were proportional to changes in gene expression (Fig. 3d)²². Given these results, QNB or the GLM implementations are better methods than MeTDiff to detect differential methylation. Taking the intersect of significant peaks for the GLMs and QNB may help determine the most probable sites of m⁶A changes, while taking the union of predictions provides a less conservative approach to selecting sites for further validation (Fig. 3e). However, additional filters are needed for robust peak change detection as there were still significant peaks for which the difference between peak log₂ fold change and gene log₂ fold change was close to zero, particularly with QNB (Supplementary Fig. 6e). 
For microarray and RNA-seq data, a filter of absolute log₂ fold change >1 has been recommended to reduce false positive rates⁶⁶; in the remainder of our analyses, we implemented a similar filter for absolute difference in peak and gene log₂ fold change ≥1 to the combined predictions from QNB and the two GLMs, with an additional filter where noted for peak read counts ≥10 across all replicates and conditions to ensure sufficient coverage for consistent peak detection (as discussed in Fig. 1a).
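The combined filter described above can be written as a short sketch. It keeps a peak if any of the three tools calls it significant, if the absolute difference between peak and gene log₂ fold change is at least 1, and if every replicate has at least 10 reads at the peak. The dictionary layout, field names, tool labels, and toy records are hypothetical; only the thresholds come from the text.

```python
def select_changed_peaks(peaks, tools=("QNB", "DESeq2_GLM", "edgeR_GLM"),
                         alpha=0.05, min_lfc_diff=1.0, min_reads=10):
    """Return ids of peaks passing significance, fold change, and coverage filters.

    Each peak is a dict (hypothetical layout) with keys:
      id          - peak identifier
      padj        - dict mapping tool name to adjusted p-value
      peak_lfc    - log2 fold change in IP reads at the peak
      gene_lfc    - log2 fold change in input reads across the gene
      read_counts - peak read counts for every replicate and condition
    """
    selected = []
    for peak in peaks:
        # Union of tools: significant in at least one of them.
        significant = any(peak["padj"].get(t, 1.0) < alpha for t in tools)
        lfc_diff = abs(peak["peak_lfc"] - peak["gene_lfc"])
        covered = min(peak["read_counts"]) >= min_reads
        if significant and lfc_diff >= min_lfc_diff and covered:
            selected.append(peak["id"])
    return selected

# Toy records, invented for illustration only.
toy_peaks = [
    {"id": "p1", "padj": {"QNB": 0.01}, "peak_lfc": 2.0, "gene_lfc": 0.2,
     "read_counts": [25, 40, 31, 18]},   # passes all three filters
    {"id": "p2", "padj": {"QNB": 0.01}, "peak_lfc": 1.5, "gene_lfc": 1.4,
     "read_counts": [50, 60, 45, 70]},   # change tracks gene expression
    {"id": "p3", "padj": {"edgeR_GLM": 0.03}, "peak_lfc": -1.8, "gene_lfc": 0.0,
     "read_counts": [3, 12, 15, 20]},    # one replicate below 10 reads
    {"id": "p4", "padj": {"QNB": 0.4, "DESeq2_GLM": 0.2}, "peak_lfc": 2.5,
     "gene_lfc": 0.0, "read_counts": [30, 30, 30, 30]},  # not significant
]
toy_selected = select_changed_peaks(toy_peaks)
```

Replacing `any` with `all` would give the more conservative intersect of tools mentioned in the text.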

Reanalyzing peak changes between conditions

We next estimated the scale of statistically detectable peak changes under various conditions using our approaches and compared these results to previously reported estimates of these changes (Fig. 4a, Supplementary Table 4). We identified fewer peaks as differentially methylated than originally reported under most conditions, with zero to hundreds of peaks significantly changed (depending on experiment and method), versus hundreds to over ten thousand described in publications^{22,23,24,25,26,34,62,65,67,68,69,70}. Notably, knockdown of Zc3h13 did appear to disrupt m⁶A_(m), suggesting the gene does participate in methylation as recently described⁶⁸. Another study reported that activin treatment of human pluripotent stem cells led to differential methylation of genes that encode pluripotency factors²². However, our reanalysis only found a few peak changes that passed our filters for significance, fold change, and expression (minimum input read count across peaks ≥10), and no enrichment for pluripotency factors among affected genes. Even when we removed the thresholds for fold change and expression, the adjusted p-value for “signaling pathways regulating pluripotency of stem cells” was still 0.15 and driven by only three genes, LEFTY2, FZD28, and FGFR3 (Supplementary Fig. 7a). Interestingly, the minimum read threshold made a particularly dramatic difference in the case of a recent study that looked at the effects of knocking down the histone methyltransferase SETD2 on m⁶A in mRNA. For this data, of the 2065 sites predicted by QNB, 2064 fell below the minimum read threshold due to low input coverage in the first and second replicates (Fig. 4a, Supplementary Fig. 7b–e)⁶⁹. We could not compare our approach to results reported by Su et al. (2018), who found 6,024 peaks changed with R2HG treatment, Zeng et al. (2018), who found 465–599 peaks changed between tumour samples, or Ma et al. (2018), who found 12,452 peaks were gained and 11,192 lost between P7 and P20 mouse cerebella, as each relied on a single sample per condition, with no replicates^40,41,42.

Multiple studies have investigated m⁶A_(m) in the context of heat shock, HIV infection, KSHV infection, and dsDNA treatment or human cytomegalovirus (HCMV) infection (Supplementary Table 5). Since each step in MeRIP-seq analysis risks introducing false negatives, we cannot rule out consistent changes between studies that used similar experimental interventions based on statistical detection alone. Therefore, we started by plotting coverage for specific genes reported as differentially methylated to evaluate reproducibility across these studies. Zhou, et al. (2015) reported 5′ UTR methylation of Hspa1a with heat shock²⁰. Coverage was too low for untreated controls to determine if Hspa1a was simply newly expressed or was actually newly methylated with heat shock based on our alignment of their data using STAR⁷¹. We were also unable to detect a change in methylation of HSPA1A using data from other heat shock studies, including a new data set from a B-cell lymphoma cell line and a published miCLIP data set, although coverage was again low (Fig. 4b)^4,17. Lichinchi, et al. (2016) reported that 56 genes showed increased methylation with HIV infection in MT4 T-cells, with enrichment for genes involved in viral gene expression²⁵. Specific genes, for example PSIP1, in which we also detected a peak using MACS2 and see a change in the peak when plotting coverage using the data from Lichinchi et al. (2016), did not show the same changes in data from two other CD4⁺ cell types, primary CD4⁺ cells and Jurkat cells (Fig. 4c)⁷². Two other studies both used MeRIP-seq to establish the presence of m⁶A in IFNB1 induced through dsDNA treatment or by infection with the dsDNA virus HCMV^73,74. While these studies did not discuss changes in m⁶A, we used these data sets to examine the replicability of m⁶A_(m) changes in response to dsDNA sensing and interferon induction. 
Although different dsDNA stimuli, time points, and use of a fibroblast cell line versus primary foreskin fibroblasts make it difficult to compare between the two experiments, using QNB and the GLM approaches, we found four peaks in three genes (AKAP8, SUN2, and TMEM140) that showed significant changes with higher interferon (Fig. 4d). Overall, we were unable to detect the same changes in m⁶A_(m) across studies of heat shock or HIV, and we detected only a few common changes in the response to dsDNA. However, we do note that cell line-specific differences in m⁶A_(m) regulation and differences in experimental protocols could account for some of the variability among these studies.

We did not have MeRIP-seq data for two studies from exactly the same conditions and cell lines to compare, but two studies both used cell lines derived from iSLK to study the effects of KSHV on host m⁶A^27,28. Both suggested that KSHV infection could decrease the number of m⁶A sites in host transcripts. Hesser et al. (2018) found that lytic KSHV infection decreased the number of peaks on host transcripts by >25%; Tan et al. (2018) suggested a loss of 17–59% of peaks in two different cell types, but that m⁶A_(m) peak fold enrichment showed better clustering by cell type than by infection status. Neither of these studies discussed specific genes that showed differential methylation with lytic infection. For our comparison of m⁶A_(m) peak changes in these data sets, we identified probable changes in peaks based on statistical significance using QNB or the GLMs with log₂ fold change difference between peaks and genes of ≥1. We detected 80 peak changes in the data from Hesser et al. (2018) and 18 in the data from Tan et al. (2018) but found no peaks that changed in both iSLK data sets with lytic KSHV infection. Applying the same statistical approaches, we were likewise unable to detect any shared peak changes between the studies of HIV infection, and there were insufficient replicates to compare heat shock studies^{16,17,20,25,72}. Thus, in our reanalysis of m⁶A changes in response to stimuli, we detected only four statistically reproducible peak changes, all in response to dsDNA.

Disparities between experiments were not simply due to significance thresholding or differences in peak detection. Taking the union of peaks called in two experiments for KSHV, HIV, and dsDNA treatment, we found minimal to negative correlations in changes in m⁶A enrichment induced by treatment at the same sites, further showing that changes with similar treatments are not reproducible (Supplementary Fig. 7e).

MeRIP-RT-qPCR validation

Although statistical approaches revealed fewer changes in m⁶A_(m) with various stimuli than published estimates, and we were unable to confirm changes in m⁶A_(m) methylation of specific genes across studies of similar conditions, many of the studies we looked at do include additional validation of m⁶A_(m) changes from MeRIP-seq using MeRIP-RT-qPCR. Recently it was shown that MeRIP-RT-qPCR can capture differences in m⁶A:A ratios at specific sites³⁴, but it is unknown how MeRIP-RT-qPCR is affected by changes in gene expression. To test this, we ran MeRIP-RT-qPCR on in vitro transcribed RNA oligonucleotides that lacked or contained m⁶A spiked into total RNA extracted from Huh7 cells (Supplementary Table 6). We found that MeRIP-RT-qPCR detected the direction of change in m⁶A levels at different concentrations of spike-in RNAs (Fig. 5a,b). However, technical variation could also lead to spuriously significant differences. For example, a comparison of m⁶A enrichment between two dilutions (0.1 fmol and 10 fmol) of a 30% methylated spike-in mixture returned a p-value of 0.004 (unpaired Student’s t-test).

We next assessed the correlation between m⁶A enrichment observed using MeRIP-seq and MeRIP-RT-qPCR using data from our recent work that identified 58 peak changes in m⁶A in Huh7 cells following infection by four different viruses⁵⁰. For those experiments, we again selected peaks that change based on results from the union of QNB and the GLM approaches. We found that the magnitude of changes in common among viruses correlated between MeRIP-seq and MeRIP-RT-qPCR, both across peaks (Pearson’s R = 0.57, p = 3.7E-6) and within single peaks across viruses (13 out of 19 peaks showed positive correlations, four of which had p-values <0.05 with three data points) (Fig. 5c, Supplementary Fig. 8). Given the correlation we found between MeRIP-seq and MeRIP-RT-qPCR, it is unclear why changes in IP over input sequencing reads were undetectable at the peaks reported by Bertero et al. (2018) and Huang et al. (2019) but differences in peaks were successfully validated using MeRIP-RT-qPCR^22,69. Based on these discrepancies, while MeRIP-RT-qPCR can be used as an initial method of validation for predicted peak changes, additional methods are necessary to confirm quantitative differences in m⁶A levels and to resolve points where the assays do not agree.

We next used our peaks validated with MeRIP-RT-qPCR to estimate the number of replicates necessary for detection of changes with either the GLM or QNB methods. Using a permutation test, we downsampled infected and uninfected replicates and reran statistical detection of changes. We found that approximately 6–9 replicates were necessary for consistent detection (in at least 50% of subsamples) of most peak changes (Fig. 5d). Schurch et al. (2016) and Conesa et al. (2016) produced similar recommendations for basic RNA-seq studies, finding that 6–12 replicates were necessary to detect most changes in gene expression and that fold changes of 1.25 were detectable 25% of the time with five replicates, rising to 44% with ten replicates, respectively^36,75. While our findings broadly agree with these recommendations for RNA-seq, they also suggest that almost all published MeRIP-seq studies to date are underpowered.
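The downsampling idea can be sketched for a single peak as follows: repeatedly draw a fixed number of replicates from each condition, rerun a detection rule on the subsample, and record how often the change is called. The detection rule here is a crude mean-difference threshold used only as a placeholder for the GLM or QNB tests, and the function names, enrichment values, seed, and iteration count are all assumptions for illustration.

```python
import math
import random
from statistics import mean, variance

def toy_is_changed(group_a, group_b, threshold=3.0):
    """Placeholder detection rule (not QNB or a GLM): call a change when the
    difference in means exceeds `threshold` standard errors.
    Requires at least two values per group."""
    n_a, n_b = len(group_a), len(group_b)
    se = math.sqrt(variance(group_a) / n_a + variance(group_b) / n_b)
    return abs(mean(group_a) - mean(group_b)) > threshold * se

def detection_rate(cond_a, cond_b, n_reps, is_changed, n_iter=100, seed=0):
    """Fraction of random subsamples of n_reps replicates per condition
    in which is_changed reports a change."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_iter):
        sub_a = rng.sample(cond_a, n_reps)
        sub_b = rng.sample(cond_b, n_reps)
        if is_changed(sub_a, sub_b):
            hits += 1
    return hits / n_iter

# Hypothetical log2 IP/input enrichment at one peak, six replicates each.
uninfected = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0]
infected = [2.0, 2.1, 1.9, 2.05, 1.95, 2.0]
rate_two_reps = detection_rate(uninfected, infected, 2, toy_is_changed)
```

Repeating the call over a range of `n_reps` values, and over many peaks, would give detection rates of the kind summarized against the 50% criterion in the text.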

Discussion

In the eight years since MeRIP-/m⁶A-seq was first published^16,17, many studies have used these methods to examine the function of m⁶A, its distribution along mRNA transcripts, and how it might be regulated under various conditions. While 35 out of 64 of the MeRIP- and miCLIP-seq papers we surveyed (Supplementary Table 1) refer to m⁶A as “dynamic”, and, by contrast, only two describe the modification as “static”, the literature is unclear on what is meant by the word “dynamic”. There is mixed evidence as to whether m⁶A is reversible through demethylation by the proposed demethylases FTO and ALKBH5^70,76,77,78. Recent research using an endoribonuclease-based method for m⁶A detection suggests that ALKBH5 has only a mild suppressive effect on m⁶A levels and FTO no effect⁴⁵. Although m⁶A does not appear to change over the course of an mRNA’s lifetime at steady-state³, whether it changes in response to a particular stimulus and at what point is less clear. Some studies have suggested that m⁶A may be modulated through changes in methyltransferase and demethylase expression, producing consistent directions of change across transcripts^8,23,34, through alternative mechanisms involving microRNA, transcription factors, promoters, or histone marks^{21,22,65,69,79}, or through indeterminate mechanisms^{17,20,25,26,27,28,52}. However, based on our reanalysis of available MeRIP-seq data, there is still only meagre support for widespread changes in m⁶A across the transcriptome independent of changes in the expression of methylation machinery (e.g. increases or decreases in METTL3 expression).

In particular, replication of peaks and changes in peaks across studies is limited. As with other RNA IP-based methods, MeRIP-seq data contains noise, owing to technical and biological variation⁸⁰. In fact, while peak overlaps reach ~80% between replicates of the same study, they decrease to a median of 45% between studies, most of which use 2–3 replicates each (Fig. 1). Given that the detection of peaks is so variable and that peak heights differ among replicates, it is perhaps not surprising that peak changes have yet to be reproduced between multiple studies of similar conditions. Indeed, variability in MeRIP-seq could also mask differences in m⁶A regulation among cell types, which have been described in mouse brains³⁴ and in cell lines exposed to KSHV²⁸. To distinguish biological and technical variation, it will therefore be particularly important to test if multiple groups using the same cell line and conditions can better reproduce changes in m⁶A.

Disparities in the methods used to detect changes in m⁶A_(m) peaks also play a role in differing conclusions among studies. Here, we analyzed four statistical methods to detect changes in peaks and found that three of these methods showed uniform or conservatively shifted p-value distributions and were able to identify changes in m⁶A_(m) independent of changes in gene expression. We therefore suggest that these statistical methods, in combination with filters for input levels in both conditions and the difference in log₂ fold change between peaks and genes, can be used to identify candidate m⁶A_(m) sites from MeRIP-seq data for further analysis and validation (Fig. 6). Based on our results, while MeTDiff works for peak detection, we do not recommend MeTDiff for peak change detection as it does not control well for differences in gene expression (Fig. 3). Similar to others³³, we found that plotting predicted m⁶A changes was invaluable and that appropriate scaling for gene coverage could reveal changes proportional to gene expression. In addition, plotting the standard deviation in transcript coverage can help assess typical variation in peak height among replicates. We note that both differential methylation of a gene and methylation of a gene that is differentially expressed could be important, but they should not be conflated when considering the role of m⁶A in transcript regulation.

The extent to which m⁶A changes on particular transcripts and whether it changes in binary presence/absence or in degree is unclear. MeRIP-RT-qPCR could detect methylation differences in in vitro transcribed RNA. Further, we found that these changes correlated with differences in MeRIP-seq enrichment. However, neither MeRIP-seq nor MeRIP-RT-qPCR can reveal the precise fraction of transcript copies modified by m⁶A. In general, antibody-based methods are subject to biases, including from differences in binding efficiencies based on RNA structure and motif preferences⁸¹. There is an oft-cited but little-used method for quantification of m⁶A, site-specific cleavage and radioactive-labeling followed by ligation-assisted extraction and thin-layer chromatography (SCARLET)¹⁹. However, this method can be challenging, works only for highly abundant transcripts, and is impractical for transcriptome-wide analysis. A recently developed endoribonuclease-based, antibody-independent approach for m⁶A detection is promising in terms of quantification of m⁶A, but its use is limited to a subset of m⁶A sites within DRAC motifs ending in ACA (~16% of all sites)^44,45. So far, comparison to this data suggests that antibody-based approaches may underestimate the number of m⁶A sites⁴⁵. Alternative methods to detect m⁶A based on single-molecule sequencing (including direct RNA sequencing and real-time cDNA synthesis) are under development and may offer ways to detect, quantify, and phase m⁶A sites, but these have not yet been shown to accurately detect m⁶A across a cellular transcriptome^46,82,83. For now, site-specific SCARLET is the only option to biochemically validate proposed changes in m⁶A at most motifs.

Conclusions

Our work reveals the limits of MeRIP-seq reproducibility for the detection of m⁶A_(m) and in particular suggests caution when using MeRIP-seq for the detection of changes in m⁶A_(m). To increase confidence in predicted changes in m⁶A_(m), we propose statistical approaches that account for differences in gene expression between conditions and variability among replicates. These methods can be used to gain insight into the regulation and function of m⁶A_(m) and to predict specific sites for validation before the development of high-throughput alternatives to MeRIP-seq, and similar strategies may be applicable to other types of RNA sequencing assay.

Methods

No data was generated from animals or human participants as part of this study. All such data was previously published.

New MeRIP-seq data

- Huh7 data

Total RNA was extracted from Huh7 cells using TRIzol (Thermo-Fisher). mRNA was purified from 200 μg total RNA using the Dynabeads mRNA purification kit (Thermo-Fisher) and concentrated by ethanol precipitation. Purified mRNA was fragmented using the RNA Fragmentation Reagent (Thermo-Fisher) for 15 minutes followed by ethanol precipitation. Then, MeRIP was performed using the EpiMark N6-methyladenosine Enrichment kit (NEB). 25 μL Protein G Dynabeads (Thermo-Fisher) per sample were washed three times in MeRIP buffer (150 mM NaCl, 10 mM Tris-HCl, pH 7.5, 0.1% NP-40) and incubated with 1 μL anti-m⁶A antibody (NEB) for 2 hours at 4 °C with rotation. After washing three times, anti-m⁶A conjugated beads were incubated with purified mRNA with rotation at 4 °C overnight in 300 μL MeRIP buffer with 1 μL RNase inhibitor (recombinant RNasin; Promega). Beads were then washed twice with 500 μL MeRIP buffer, twice with low salt wash buffer (50 mM NaCl, 10 mM Tris-HCl, pH 7.5, 0.1% NP-40), twice with high salt wash buffer (500 mM NaCl, 10 mM Tris-HCl, pH 7.5, 0.1% NP-40), and once again with MeRIP buffer. m⁶A-modified RNA was eluted twice in 100 μL MeRIP buffer containing 5 mM m⁶A salt (Santa Cruz Biotechnology) for 30 minutes at 4 °C with rotation and concentrated by ethanol precipitation. RNA-seq libraries were prepared from eluate and the 10% of RNA set aside as input using the TruSeq mRNA library prep kit (Illumina) and checked for fragment length using the Agilent 2100 Bioanalyzer. Single-end 50 base pair reads were sequenced on an Illumina HiSeq 2500.

- Heat shock

Early passage OCI-Ly1 diffuse large B-cell lymphoma cells were grown in Iscove’s Modified Dulbecco’s Medium (IMDM) with 10% fetal bovine serum (FBS). OCI-Ly1 cells were obtained from the Ontario Cancer Institute and regularly tested for Mycoplasma contamination by PCR and identified by single nucleotide polymorphism. Cells were maintained with 1% penicillin/streptomycin in a 37 °C, 5% CO₂, humidified incubator. In these growing conditions, heat shocked cells were exposed to 43 °C for 1 hour, followed by 1 hour of recovery at 37 °C while control cells were maintained at 37 °C. Following treatment, cells were processed at 4 °C to obtain total cell lysates. Lysates were immunoprecipitated for m⁶A_(m) using Synaptic Systems antibody (SYSY 202 003) following the protocol described in Meyer et al. (2012)¹⁶ and sequenced on an Illumina HiSeq 2500.

Read processing

Reads were trimmed using Trimmomatic⁸⁴ and aligned to the human genome (hg38) or the mouse genome (mm10), as appropriate, using STAR, a splice-aware aligner for RNA-seq data⁷¹. We used the flag “--outFilterMultimapNmax 1” to keep only uniquely aligned reads. Scripts used for alignment are provided with the rest of the analysis scripts at https://github.com/al-mcintyre/merip_reanalysis_scripts.

Peak detection and comparison

IP over input peaks were called using MACS2 callpeak with the parameters “--nomodel --extsize 100 --gsize 100e6”⁴⁹. The --extsize value, which extends reads at their 3′ ends to a fixed length, was set to the approximate fragment size for a specific experiment where available and to 100 otherwise; the --gsize value is the approximate size of mouse and human transcriptomes based on gencode annotations. No filter for coverage was applied at the stage of peak detection. Transcript coverage was estimated using Kallisto⁸⁵ with an index constructed from 31-mers, except for the Schwartz et al. (2014) data set, where the reads were too short and an alternative index based on 29-mers was constructed³³. For Fig. 1b, the full union of unique peaks was taken and the percent of that set detected in single replicates calculated. Intersects between peaks that overlapped for transcripts with ≥10X mean coverage in both samples were taken using bedtools⁸⁶ for Fig. 2, allowing a generous minimum of 1 overlapping base. Heatmaps for peak overlaps were generated using the ComplexHeatmap package in R⁸⁷. MeRIP-seq data sets in Fig. 2b included those for human cell lines in Fig. 2a, other data sets from the same studies and any data sets that shared the same cell lines, and other data sets that looked at multiple human cell types. We considered only data sets from baseline conditions in Fig. 2 (untreated cells and knockdown controls).
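The overlap criterion (a minimum of 1 shared base) can be illustrated with a small Python sketch. The study used bedtools for this step; the function below is only a simplified stand-in that ignores strand and the ≥10X coverage filter, and the peak coordinates are invented.

```python
from collections import defaultdict


def percent_overlap(peaks_a, peaks_b, min_overlap=1):
    """Percent of peaks in peaks_a that overlap any peak in peaks_b.

    Peaks are (chrom, start, end) tuples with BED-style half-open coordinates.
    """
    if not peaks_a:
        return 0.0

    # Group the second peak set by chromosome for lookup.
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks_b:
        by_chrom[chrom].append((start, end))

    shared = 0
    for chrom, start, end in peaks_a:
        for b_start, b_end in by_chrom.get(chrom, ()):
            # Number of bases shared by the two half-open intervals.
            if min(end, b_end) - max(start, b_start) >= min_overlap:
                shared += 1
                break
    return 100.0 * shared / len(peaks_a)


# Invented peaks from two hypothetical studies.
study_1 = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
study_2 = [("chr1", 199, 300), ("chr1", 600, 700), ("chr2", 0, 40)]

print(f"{percent_overlap(study_1, study_2):.1f}% of study_1 peaks overlap study_2")
```

With half-open coordinates, peaks that merely touch (one ends at 600, the next starts at 600) share no bases and are not counted as overlapping.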

Poisson and negative binomial fits

Reads aligned to peaks were counted using featureCounts from the Rsubread package⁸⁸. Poisson and negative binomial models were fit to input and IP read counts at peaks using maximum likelihood estimation. Simulated read counts were generated with Poisson or negative binomial distributions based on estimated parameters from the sample, with 500 random generations per model. The log likelihood of seeing read counts from the sample and the simulations given the model parameters was then calculated and the mean taken across all peaks.
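As a rough illustration of the comparison between the two count models, the sketch below fits Poisson and negative binomial distributions to a vector of read counts and reports the log likelihood under each. It is an assumption-laden simplification: the negative binomial size parameter is found by a coarse grid search rather than a numerical optimizer, the simulation step (500 random generations per model) is omitted, the counts are invented, and the sample mean is assumed to be positive.

```python
import math


def poisson_loglik(counts, lam):
    """Log likelihood of counts under a Poisson distribution with mean lam."""
    return sum(k * math.log(lam) - lam - math.lgamma(k + 1) for k in counts)


def nbinom_loglik(counts, mu, size):
    """Log likelihood under a negative binomial with mean mu and
    variance mu + mu**2 / size."""
    p = size / (size + mu)
    return sum(
        math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
        + size * math.log(p) + k * math.log(1.0 - p)
        for k in counts
    )


def fit_models(counts):
    """Fit both models by maximum likelihood (grid search for the NB size)."""
    mu = sum(counts) / len(counts)  # MLE of the mean under both models
    # Log-spaced grid from 0.01 to 10^4; a crude substitute for an optimizer.
    grid = [10 ** (e / 20) for e in range(-40, 81)]
    size = max(grid, key=lambda s: nbinom_loglik(counts, mu, s))
    return {
        "mu": mu,
        "size": size,
        "poisson_loglik": poisson_loglik(counts, mu),
        "nbinom_loglik": nbinom_loglik(counts, mu, size),
    }


# Invented, overdispersed read counts at one peak across eight samples.
counts = [3, 15, 0, 42, 7, 1, 28, 9]
fit = fit_models(counts)
print(fit)
```

For overdispersed counts such as these, the negative binomial log likelihood is higher than the Poisson one, which is the pattern that motivates negative binomial models for replicate variation.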

Peak change detection and generalized linear models

Generalized linear models to detect changes in IP coverage while controlling for differences in input coverage were implemented based on a method previously applied to HITS-CLIP data⁵⁷. Full and reduced models were constructed as follows:

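$$\log \mu_{ij} = \beta_i^{0} + \beta_i^{\mathrm{IP}} X_j^{\mathrm{IP}} + \beta_i^{\mathrm{STIM}} X_j^{\mathrm{STIM}} + \beta_i^{\mathrm{STIM:IP}} X_j^{\mathrm{STIM:IP}}$$

$$\log \mu_{ij} = \beta_i^{0} + \beta_i^{\mathrm{IP}} X_j^{\mathrm{IP}} + \beta_i^{\mathrm{STIM}} X_j^{\mathrm{STIM}}$$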

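where μ_ij is the expected read count for peak i in sample j, with counts modelled using a negative binomial distribution; X_j^IP = 1 for IP samples and 0 for input samples; X_j^STIM = 1 for samples under the experimental intervention and 0 for control samples; and the interaction covariate X_j^STIM:IP is the product of the two, equal to 1 only for IP samples under the intervention.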

Statistical significance was then assessed using a chi-squared test (df = 1) for the difference in deviances between the full and reduced models, with the null hypothesis that the interaction term (β_i^STIM:IP) for differential antibody enrichment driven by the experimental intervention is zero. The likelihood ratio test was implemented through DESeq2⁵⁵ and edgeR⁵⁶, two programs developed for RNA-seq analysis that differ in how they filter data and in how they estimate dispersions for negative binomial distributions. Generalized linear models implemented through edgeR included a term for the normalized library size of sample j.
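The structure of the test can be made concrete with a short Python sketch. It shows only how the covariates for the full and reduced models are laid out and how a p-value follows from the difference in deviances; the model fitting itself was done in DESeq2 and edgeR and is not reproduced here, and the deviance values below are invented.

```python
import math


def design_row(is_ip, is_stim, full=True):
    """Covariates for one sample: intercept, IP, STIM and, for the full
    model only, the STIM:IP interaction."""
    row = [1, int(is_ip), int(is_stim)]
    if full:
        row.append(int(is_ip) * int(is_stim))  # interaction indicator
    return row


def lrt_pvalue(deviance_reduced, deviance_full):
    """P-value of the chi-squared test (df = 1) on the drop in deviance."""
    stat = max(deviance_reduced - deviance_full, 0.0)
    # With one degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


# Four sample types in a two-condition experiment: (is_ip, is_stim).
samples = [(False, False), (True, False), (False, True), (True, True)]
for is_ip, is_stim in samples:
    print(design_row(is_ip, is_stim, full=True))

# Invented deviances for one peak under the reduced and full models.
print(lrt_pvalue(deviance_reduced=25.3, deviance_full=18.1))
```

Only the stimulated IP samples carry a 1 in the last column, so the interaction coefficient captures a change in IP enrichment beyond what the input (gene expression) change explains.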

QNB was run as suggested for experiments with biological replicates, where each IP and input variable (“ip1”, etc.) consisted of a matrix of peak counts for either condition 1 or condition 2:

> qnbtest(ip1, ip2, input1, input2, mode = "per-condition")

We extracted functions from MeTDiff so that we could supply our own peaks and thus control for differences in peak detection among tools. The main post-peak calling function, diff.call.module, was run as follows using the same count matrices as for QNB:

> diff.call.module(ip1, input1, ip2, input2)

Gene and peak expression changes were estimated as log₂ fold changes from DESeq2 based on differences in input read counts aligned to genes and IP read counts aligned to peaks, respectively, and the change in peak relative to gene enrichment was calculated as the absolute difference in log₂ fold change between those values.

Comparison to published studies

The sources for published estimates of m⁶A peak changes included in our comparison are listed in Supplementary Table 4. Significant (FDR-adjusted p < 0.05) peaks were considered for DESeq2, edgeR, and QNB, run as described above. We also considered a filtered set of peaks derived from the union of significant peaks from the three tools with additional filters for location within exons, |log₂ fold change between peak IP and gene input| ≥ 1, and a minimum peak read count of 10 across replicates and conditions. We used gProfiler to calculate enrichment of functional categories⁸⁹.
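The filtering logic can be summarized in a few lines of Python. This is a sketch under stated assumptions, not the study's code: the dictionary keys are hypothetical names, the read-count filter is interpreted as a minimum over all samples, and the example values are invented.

```python
def passes_filters(peak, alpha=0.05, min_lfc_diff=1.0, min_count=10):
    """Apply the union-of-tools significance filter plus the extra filters.

    peak: dict with hypothetical keys (names are assumptions):
        'padj'           tool name -> FDR-adjusted p-value (None if untested)
        'peak_ip_lfc'    log2 fold change of IP reads at the peak
        'gene_input_lfc' log2 fold change of input reads for the gene
        'counts'         peak read counts, one per replicate and condition
        'in_exon'        True if the peak lies within an exon
    """
    padj = [p for p in peak["padj"].values() if p is not None]
    significant = any(p < alpha for p in padj)  # union across tools
    lfc_diff = abs(peak["peak_ip_lfc"] - peak["gene_input_lfc"])
    enough_reads = min(peak["counts"]) >= min_count
    return (significant and peak["in_exon"]
            and lfc_diff >= min_lfc_diff and enough_reads)


# Invented examples: peak_a changes beyond its gene, peak_b tracks its gene.
peak_a = {"padj": {"DESeq2": 0.01, "edgeR": 0.20, "QNB": None},
          "peak_ip_lfc": 1.8, "gene_input_lfc": 0.3,
          "counts": [25, 31, 18, 40], "in_exon": True}
peak_b = {"padj": {"DESeq2": 0.03, "edgeR": 0.04, "QNB": 0.02},
          "peak_ip_lfc": 1.2, "gene_input_lfc": 1.0,
          "counts": [25, 31, 18, 40], "in_exon": True}

print(passes_filters(peak_a), passes_filters(peak_b))
```

The second peak is significant in all three tools but is still excluded, because its IP change is almost entirely matched by the change in gene expression.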

In Fig. 4b-c, we selected Hspa1a/HSPA1A as our representative gene for heat shock because it was the primary example cited by Zhou et al. (2015) and Meyer et al. (2015)^4,20. For HIV, we selected PSIP1 because it was among the 56 genes reported by Lichinchi et al. (2016a)²⁵, it plays a known role in HIV infection, and we detected a peak in the gene using MACS2.

For KSHV, we compared significant results (adjusted p < 0.05) from QNB and GLMs (DESeq2 and edgeR), with additional filtering for |peak IP – gene input log₂ fold change| ≥ 1 (lowering this threshold to 0.5 did not change results), for data from Hesser et al. (2018)²⁷ in lytic vs. latent iSLK.219 cells and data from Tan et al. (2018)²⁸ in lytic vs. latent iSLK BAC16 cells. We used the same approach to compare data from Rubio et al. (2018) and Winkler et al. (2019)^73,74 for response to dsDNA. Data sets used for site-specific comparisons are summarized in Supplementary Table 5.

Gene coverage was plotted using CovFuzze (https://github.com/al-mcintyre/CovFuzze), which summarizes mean and standard deviation in coverage across available replicates⁹⁰. Pearson’s correlations were taken for Supplementary Fig. 7 for peaks expressed above a minimum input peak read count of 10 across replicates and conditions.

Spike-in controls and MeRIP-RT-qPCR

In vitro transcribed (IVT) controls were provided by the Jaffrey Lab and consisted of 1001-base-long RNA sequences with three adenines in GAC motifs (Supplementary Table 6) either fully methylated or unmethylated. m⁶A and A controls were mixed in various ratios (1:9, 3:7, and 9:1) that approximate the variation in m⁶A levels detected by SCARLET (m⁶A levels at specific sites have been reported to vary from 6–80% of transcripts¹⁹). Modified and unmodified standards were mixed at the indicated ratios to yield a final quantity of 0.1 fmol, 1 fmol, and 10 fmol. Mixed RNA standards were added to 30 μg total RNA from Huh7 cells, along with 0.1 fmol of positive (m⁶A-modified Gaussia luciferase RNA, “GLuc”) and negative control (unmodified Cypridina luciferase, “CLuc”) spike-in RNA provided with the N6-methyladenosine Enrichment kit (EpiMark). Following MeRIP as described above, cDNA was synthesized from eluate and input samples using the iScript cDNA synthesis kit (Bio-Rad), and RT-qPCR was performed on a QuantStudio 6 Flex instrument. Data was analyzed as a percent of input of the spike-in RNA in each condition relative to that of the provided positive control spike-in.
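One common way to compute percent of input from RT-qPCR data, and to express it relative to a positive control, is sketched below. The exact calculation used in the study is not given in the text, so this is an assumption throughout: it presumes Ct values as input, perfect amplification efficiency (a doubling per cycle), and an input sample representing 10% of the RNA used for the IP; all numbers are invented.

```python
def percent_input(ct_ip, ct_input, input_fraction=0.1):
    """Percent of input recovered in the IP, assuming a doubling per cycle.

    input_fraction: fraction of the RNA set aside as input (assumed 10%).
    """
    return 100.0 * input_fraction * 2.0 ** (ct_input - ct_ip)


def relative_to_control(target, control, input_fraction=0.1):
    """Percent input of a target divided by that of the positive control.

    target, control: (ct_ip, ct_input) tuples.
    """
    target_pct = percent_input(*target, input_fraction=input_fraction)
    control_pct = percent_input(*control, input_fraction=input_fraction)
    return target_pct / control_pct


# Invented Ct values: (ct_ip, ct_input) for a spike-in mix and for GLuc.
spike_in = (26.0, 25.0)
gluc = (24.0, 25.0)

print(percent_input(*spike_in))
print(relative_to_control(spike_in, gluc))
```

Normalizing to the positive control spike-in in each sample helps compare conditions even when overall immunoprecipitation efficiency differs between them.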

For MeRIP-RT-qPCR to test peak callers, Huh7 cells plated in 6-well plates were transfected with siRNAs against METTL3 and METTL14 (Qiagen; SI04317096 and SI00459942) or non-targeting control siRNA (SI03650318) using Lipofectamine RNAiMax (Thermo Fisher) twice, 24 hours apart. 48 hours following the second round of siRNA transfection, cells were harvested in TRIzol reagent and total RNA was extracted. 30 μg total RNA was fragmented for 3 minutes at 75 °C, concentrated by ethanol precipitation, and MeRIP-RT-qPCR was performed as described above. Primers used for RT-qPCR are provided in Supplementary Table 7 and siRNA sequences in Supplementary Table 8.

Cell culture and infection (data used for MeRIP-RT-qPCR comparisons)

Huh7 cells were grown in DMEM (Mediatech) supplemented with 10% fetal bovine serum (HyClone), 2.5 mM HEPES, and 1X non-essential amino acids (Thermo-Fisher). The identity of the Huh7 cell lines was verified using the Promega GenePrint STR kit (DNA Analysis Facility, Duke University), and cells were verified as mycoplasma free by the LookOut Mycoplasma PCR detection kit (Sigma). Infectious stocks of a cell culture-adapted strain of genotype 2A JFH1 HCV were generated and titered on Huh7.5 cells by focus-forming assay (FFA), as described⁹¹. Dengue virus (DENV2-NGC), West Nile virus (WNV-NY2000), and Zika virus (ZIKV-PRVABC59) viral stocks were generated in C6/36 cells and titered on Vero cells as described⁹¹. All viral infections were performed at a multiplicity of infection of 1 for 48 hours.

Data availability

MeRIP-seq data for the Huh7 negative controls is available in the GEO repository, under accession number GSE130891. MeRIP-seq data for heat shock in B-cell lymphoma is available under accession number GSE130892. Accession numbers for all other data sets reanalyzed in the study are included in Supplementary Tables 1–5. Scripts used for analysis are available at https://github.com/al-mcintyre/merip_reanalysis_scripts and a pipeline implementing generalized linear models through DESeq2 and edgeR, as well as QNB, is provided at https://github.com/al-mcintyre/deq. Additional responses to comments on the paper are available at https://www.biorxiv.org/content/biorxiv/early/2020/01/10/657130/DC5/embed/media-5.pdf (permanent link to bioRxiv: https://doi.org/10.1101/657130).

References

Balacco, D. L. & Soller, M. The m6A Writer: Rise of a Machine for Growing Tasks. Biochemistry 58, 363–378 (2018).
Ke, S. et al. A majority of m6A residues are in the last exons, allowing the potential for 3′ UTR regulation. Genes & development 29, 2037–2053 (2015).
Ke, S. et al. m6A mRNA modifications are deposited in nascent pre-mRNA and are not required for splicing but do specify cytoplasmic turnover. Genes & development 31, 990–1006 (2017).
Meyer, K. D. et al. 5′ UTR m6A promotes cap-independent translation. Cell 163, 999–1010 (2015).
Wang, X. et al. N6-methyladenosine modulates messenger RNA translation efficiency. Cell 161, 1388–1399 (2015).
Lin, S., Choe, J., Du, P., Triboulet, R. & Gregory, R. I. The m6A methyltransferase METTL3 promotes translation in human cancer cells. Molecular cell 62, 335–345 (2016).
Wang, Y. et al. N6-methyladenosine modification destabilizes developmental regulators in embryonic stem cells. Nature cell biology 16, 191 (2014).
Molinie, B. et al. m6A-LAIC-seq reveals the census and complexity of the m6A epitranscriptome. Nature methods 13, 692 (2016).
Yoon, K.-J. et al. Temporal control of mammalian cortical neurogenesis by m6A methylation. Cell 171, 877–889 (2017).
Meyer, K. D. & Jaffrey, S. R. Rethinking m6A readers, writers, and erasers. Annual review of cell and developmental biology 33, 319–342 (2017).
Roundtree, I. A., Evans, M. E., Pan, T. & He, C. Dynamic RNA modifications in gene expression regulation. Cell 169, 1187–1200 (2017).
Rosa-Mercado, N. A., Withers, J. B. & Steitz, J. A. Settling the m6A debate: methylation of mature mRNA is not dynamic but accelerates turnover. Genes & development 31, 957–958 (2017).
Darnell, R. B., Ke, S. & Darnell, J. E. Pre-mRNA processing includes N6 methylation of adenosine residues that are retained in mRNA exons and the fallacy of “RNA epigenetics”. RNA 24, 262–267 (2018).
Zhao, B. S., Nachtergaele, S., Roundtree, I. A. & He, C. Our views of dynamic N6-methyladenosine RNA methylation. RNA 24, 268–272 (2018).
Mauer, J. & Jaffrey, S. R. FTO, m6Am, and the hypothesis of reversible epitranscriptomic mRNA modifications. FEBS letters 592, 2012–2022 (2018).
Meyer, K. D. et al. Comprehensive analysis of mRNA methylation reveals enrichment in 3′ UTRs and near stop codons. Cell 149, 1635–1646 (2012).
Dominissini, D. et al. Topology of the human and mouse m6A RNA methylomes revealed by m6A-seq. Nature 485, 201 (2012).
Linder, B. et al. Single-nucleotide-resolution mapping of m6A and m6Am throughout the transcriptome. Nature methods 12, 767 (2015).
Liu, N. et al. Probing N6-methyladenosine RNA modification status at single nucleotide resolution in mRNA and long noncoding RNA. RNA 19, 1848–1856 (2013).
Zhou, J. et al. Dynamic m6A mRNA methylation directs translational control of heat shock response. Nature 526, 591 (2015).
Chen, T. et al. m6A RNA Methylation Is Regulated by MicroRNAs and Promotes Reprogramming to Pluripotency. Cell Stem Cell 16, 289–301 (2015).
Bertero, A. et al. The SMAD2/3 interactome reveals that TGFβ controls m6A mRNA methylation in pluripotency. Nature 555, 256 (2018).
Liu, J. et al. m6A mRNA methylation regulates AKT activity to promote the proliferation and tumorigenicity of endometrial cancer. Nature cell biology 20, 1074 (2018).
Anders, M. et al. Dynamic m6A methylation facilitates mRNA triaging to stress granules. Life science alliance 1, e201800113 (2018).
Lichinchi, G. et al. Dynamics of the human and viral m6A RNA methylomes during HIV-1 infection of T cells. Nature microbiology 1, 1–9 (2016).
Lichinchi, G. et al. Dynamics of human and viral RNA methylation during Zika virus infection. Cell host & microbe 20, 666–673 (2016).
Hesser, C. R., Karijolich, J., Dominissini, D., He, C. & Glaunsinger, B. A. N6-methyladenosine modification and the YTHDF2 reader protein play cell type specific roles in lytic viral gene expression during Kaposi’s sarcoma-associated herpesvirus infection. PLoS pathogens 14, e1006995 (2018).
Tan, B. et al. Viral and cellular N6-methyladenosine and N6,2′-O-dimethyladenosine epitranscriptomes in the KSHV life cycle. Nature microbiology 3, 108 (2018).
Vu, L. P. et al. The N6-methyladenosine (m6A)-forming enzyme METTL3 controls myeloid differentiation of normal hematopoietic and leukemia cells. Nature medicine 23, 1369 (2017).
Liu, L., Zhang, S.-W., Huang, Y. & Meng, J. QNB: differential RNA methylation analysis for count-based small-sample sequencing data with a quad-negative binomial model. BMC bioinformatics 18, 387 (2017).
Cui, X. et al. MeTDiff: a novel differential RNA methylation analysis for MeRIP-seq data. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB) 15, 526–534 (2018).
Zhao, X. et al. FTO-dependent demethylation of N6-methyladenosine regulates mRNA splicing and is required for adipogenesis. Cell research 24, 1403 (2014).
Schwartz, S. et al. Perturbation of m6A writers reveals two distinct classes of mRNA methylation at internal and 5′ sites. Cell reports 8, 284–296 (2014).
Engel, M. et al. The role of m6A/m-RNA methylation in stress response regulation. Neuron 99, 389–403 (2018).
Zhou, J., White, K. P. & Liu, Y. RNA-seq differential expression studies: more sequence or more replication? Bioinformatics 30, 301–304 (2013).
Schurch, N. J. et al. How many biological replicates are needed in an RNA-seq experiment and which differential expression tool should you use? RNA 22, 839–851 (2016).
Su, Z. et al. A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium. Nature biotechnology 32, 903 (2014).
Geula, S. et al. m6A mRNA methylation facilitates resolution of naïve pluripotency toward differentiation. Science 347, 1002–1006 (2015).
Cui, Q. et al. m6A RNA methylation regulates the self-renewal and tumorigenesis of glioblastoma stem cells. Cell reports 18, 2622–2634 (2017).
Ma, C. et al. RNA m6A methylation participates in regulation of postnatal development of the mouse cerebellum. Genome biology 19, 68 (2018).
Su, R. et al. R-2HG exhibits anti-tumor activity by targeting FTO/m6A/MYC/CEBPA signaling. Cell 172, 90–105 (2018).
Zeng, Y. et al. Refined RIP-seq protocol for epitranscriptome analysis with low input materials. PLoS biology 16, e2006092 (2018).
Zhou, J. et al. N6-methyladenosine guides mRNA alternative translation during integrated stress response. Molecular cell 69, 636–647 (2018).
Zhang, Z. et al. Single-base mapping of m6A by an antibody-independent method. bioRxiv 575555 (2019).
Garcia-Campos, M. A. et al. Deciphering the “m6A Code” via Antibody-Independent Quantitative Profiling. Cell 178, 731–747.e16 (2019).
Liu, H. et al. Accurate detection of m6A RNA modifications in native RNA sequences. Nature Communications 10, 4079 (2019).
Meng, J. et al. A protocol for RNA methylation differential analysis with MeRIP-Seq data and exomePeak R/Bioconductor package. Methods 69, 274–281 (2014).
Cui, X., Meng, J., Zhang, S., Chen, Y. & Huang, Y. A novel algorithm for calling mRNA m6A peaks by modeling biological variances in MeRIP-seq data. Bioinformatics 32, i378–i385 (2016).
Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome biology 9, R137 (2008).
Gokhale, N. S. et al. Altered m6A Modification of Specific Cellular Transcripts Affects Flaviviridae Infection. Molecular Cell 77, 542–555 (2020).
Antanaviciute, A. et al. m6aViewer: software for the detection, analysis, and visualization of N6-methyladenosine peaks from m6A-seq/ME-RIP sequencing data. RNA 23, 1493–1501 (2017).
He, S. et al. mRNA N6-methyladenosine methylation of postnatal liver development in pig. PloS one 12, e0173421 (2017).
Tao, X. et al. Transcriptome-wide N6-methyladenosine methylome profiling of porcine muscle and adipose tissues reveals a potential mechanism for transcriptional regulation and differential methylation pattern. BMC genomics 18, 336 (2017).
Xiao, S. et al. The RNA N6-methyladenosine modification landscape of human fetal tissues. Nature Cell Biology 21, 651–661 (2019).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome biology 15, 550 (2014).
Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
Park, S.-M. et al. Musashi-2 controls cell fate, lineage bias, and TGF-β signaling in HSCs. Journal of Experimental Medicine 211, 71–87 (2014).
Marioni, J. C., Mason, C. E., Mane, S. M., Stephens, M. & Gilad, Y. RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome research 18, 1509–1517 (2008).
Batista, P. J. et al. m6A RNA modification controls cell fate transition in mammalian embryonic stem cells. Cell stem cell 15, 707–719 (2014).
Fustin, J.-M. et al. RNA-methylation-dependent RNA processing controls the speed of the circadian clock. Cell 155, 793–806 (2013).
Hess, M. E. et al. The fat mass and obesity associated gene (Fto) regulates activity of the dopaminergic midbrain circuitry. Nature neuroscience 16, 1042 (2013).
Li, Z. et al. FTO plays an oncogenic role in acute myeloid leukemia as a N6-methyladenosine RNA demethylase. Cancer cell 31, 127–141 (2017).
Huang, H. et al. Recognition of RNA N6-methyladenosine by IGF2BP proteins enhances mRNA stability and translation. Nature cell biology 20, 285 (2018).
Zhong, X. et al. Circadian Clock Regulation of Hepatic Lipid Metabolism by Modulation of m6A mRNA Methylation. Cell reports 25, 1816–1828 (2018).
Barbieri, I. et al. Promoter-bound METTL3 maintains myeloid leukaemia by m6A-dependent translation control. Nature 552, 126 (2017).
SEQC/MAQC-III Consortium. et al. A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium. Nature Biotechnology 32, 903 (2014).
Li, M. et al. Ythdf2-mediated m6A mRNA clearance modulates neural development in mice. Genome biology 19, 69 (2018).
Wen, J. et al. Zc3h13 regulates nuclear RNA m6A methylation and mouse embryonic stem cell self-renewal. Molecular cell 69, 1028–1038 (2018).
Huang, H. et al. Histone H3 trimethylation at lysine 36 guides m6A RNA modification co-transcriptionally. Nature 567, 414–419 (2019).
Zheng, G. et al. ALKBH5 is a mammalian RNA demethylase that impacts RNA metabolism and mouse fertility. Molecular cell 49, 18–29 (2013).
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
Tirumuru, N. et al. N6-methyladenosine of HIV-1 RNA regulates viral infection and HIV-1 Gag protein expression. Elife 5, e15528 (2016).
Rubio, R. M., Depledge, D. P., Bianco, C., Thompson, L. & Mohr, I. RNA m6A modification enzymes shape innate responses to DNA by regulating interferon β. Genes & development 32, 1472–1484 (2018).
Winkler, R. et al. m6A modification controls the innate immune response to infection by targeting type I interferons. Nature immunology 20, 173 (2019).
Conesa, A. et al. A survey of best practices for RNA-seq data analysis. Genome Biology 17, 13 (2016).
Jia, G. et al. N6-methyladenosine in nuclear RNA is a major substrate of the obesity-associated FTO. Nature chemical biology 7, 885 (2011).
Wei, J. et al. Differential m6A, m6Am, and m1A demethylation mediated by FTO in the cell nucleus and cytoplasm. Molecular cell 71, 973–985 (2018).
Mauer, J. et al. Reversible methylation of m6Am in the 5′ cap controls mRNA stability. Nature 541, 371 (2017).
Aguilo, F. et al. Coordination of m6A mRNA methylation and gene transcription by ZFP217 regulates pluripotency and reprogramming. Cell stem cell 17, 689–704 (2015).
Chakrabarti, A. M., Haberman, N., Praznik, A., Luscombe, N. M. & Ule, J. Data science issues in studying Protein–RNA interactions with CLIP technologies. Annual Review of Biomedical Data Science 1, 235–261 (2018).
Liu, B. et al. A potentially abundant junctional RNA motif stabilized by m6A and Mg2+. Nature Communications 9, 2761 (2018).
Saletore, Y. et al. The birth of the Epitranscriptome: deciphering the function of RNA modifications. Genome Biology 13, 175 (2012).
Garalde, D. R. et al. Highly parallel direct RNA sequencing on an array of nanopores. Nature Methods 15, 201 (2018).
Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
Bray, N. L., Pimentel, H., Melsted, P. & Pachter, L. Near-optimal probabilistic RNA-seq quantification. Nature biotechnology 34, 525 (2016).
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Gu, Z., Eils, R. & Schlesner, M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics 32, 2847–2849 (2016).
Liao, Y., Smyth, G. K. & Shi, W. The R package Rsubread is easier, faster, cheaper and better for alignment and quantification of RNA sequencing reads. Nucleic Acids Research 47, e47–e47 (2019).
Reimand, J. et al. g: Profiler—a web server for functional interpretation of gene lists (2016 update). Nucleic acids research 44, W83–W89 (2016).
Imam, H. et al. N6-methyladenosine modification of hepatitis B virus RNA differentially regulates the viral life cycle. Proceedings of the National Academy of Sciences 115, 8829–8834 (2018).
Gokhale, N. S. et al. N6-methyladenosine in Flaviviridae viral RNA genomes regulates infection. Cell host & microbe 20, 654–665 (2016).


Acknowledgements

We would like to thank Christina Leslie and Jonathan Victor for statistical advice, Aashiq Mirza for the MeRIP-RT-qPCR controls, and Helen Lazear for WNV infection. We would also like to thank the Epigenomics Core at Weill Cornell for preparing MeRIP-seq libraries, the Scientific Computing Unit (SCU), and New England Biolabs for donating anti-m⁶A antibodies. We are grateful for funding from the Starr Cancer Consortium (I9-A9-071), the Bert L and N. Kuggie Vallee Foundation, the WorldQuant Foundation, The Pershing Square Sohn Cancer Research Alliance, NASA (NNX14AH50G), the National Institutes of Health (R01AI125416, R21AI129851, R01MH117406), the Leukemia and Lymphoma Society (LLS) grants (LLS 9238-16, Mak, LLS-MCL-982, Chen-Kiang), the Burroughs Wellcome Fund (S.M.H.), the Natural Sciences and Engineering Research Council of Canada (A.B.R.M. PGS-D funding), and the American Heart Association (N.S.G. Pre-doctoral Fellowship, 17PRE33670017).

Author information

Authors and Affiliations

Department of Physiology and Biophysics, Weill Cornell Medicine, New York City, NY, 10065, USA
Alexa B. R. McIntyre & Christopher E. Mason
Tri-Institutional Program in Computational Biology and Medicine, New York City, NY, 10065, USA
Alexa B. R. McIntyre
Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC, 27710, USA
Nandan S. Gokhale & Stacy M. Horner
Division of Hematology and Medical Oncology, Weill Cornell Medicine, New York City, NY, 10065, USA
Leandro Cerchietti
Department of Pharmacology, Weill Cornell Medicine, New York City, NY, 10065, USA
Samie R. Jaffrey
Department of Medicine, Duke University Medical Center, Durham, NC, 27710, USA
Stacy M. Horner
The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, 10021, USA
Christopher E. Mason
The Feil Family Brain and Mind Research Institute, Weill Cornell Medicine, New York, NY, 10065, USA
Christopher E. Mason
The WorldQuant Initiative for Quantitative Prediction, Weill Cornell Medicine, New York, NY, 10021, USA
Christopher E. Mason

Authors

Alexa B. R. McIntyre
Nandan S. Gokhale
Leandro Cerchietti
Samie R. Jaffrey
Stacy M. Horner
Christopher E. Mason

Contributions

A.B.R.M. and C.E.M. conceived the study. A.B.R.M. developed and ran the analyses and wrote the manuscript with N.S.G. and S.M.H. S.R.J. provided in vitro controls for MeRIP-RT-qPCR. N.S.G. prepared MeRIP-seq libraries for the Huh7 controls and ran MeRIP-RT-qPCR tests. L.C. contributed additional heat shock data. All authors read and edited the manuscript.

Corresponding authors

Correspondence to Alexa B. R. McIntyre, Stacy M. Horner or Christopher E. Mason.

Ethics declarations

Competing interests

C.E.M. is a cofounder and board member for Biotia and Onegevity Health, as well as an advisor or compensated speaker for Abbvie, Acuamark Diagnostics, ArcBio, BioRad, DNA Genotek, Genialis, Genpro, Karius, Illumina, New England Biolabs, QIAGEN, Whole Biome, and Zymo Research. The other authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary information.

Supplementary information 2.

Supplementary information 3.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

McIntyre, A.B.R., Gokhale, N.S., Cerchietti, L. et al. Limits in the detection of m⁶A changes using MeRIP/m⁶A-seq. Sci Rep 10, 6590 (2020). https://doi.org/10.1038/s41598-020-63355-3

Received: 08 March 2020
Accepted: 19 March 2020
Published: 20 April 2020
DOI: https://doi.org/10.1038/s41598-020-63355-3
