Abstract
Background & Aims The course of Crohn's disease (CD) is heterogeneous, confounding effective therapy. An analysis of differences in colonic gene expression between patients with vs without CD revealed 2 subsets of patients: a group characterized by genes more highly expressed in the colon (colon-like CD) and a group with increased expression of ileum marker genes (ileum-like CD). We compared differences in microRNAs between these groups.
Methods We performed genome-wide microRNA profile analyses of colon tissues from 18 adults with CD and 12 adults without CD (controls). We performed principal component analyses to associate levels of microRNAs with CD subtypes. Colonic epithelial cells and lamina propria immune cells were isolated from intestinal tissues and levels of microRNA 31 (MIR31 or miR-31) were measured by real-time quantitative PCR. We validated the differential expression of miR-31 between the subtypes by measuring miR-31 levels in an independent cohort of 32 adult patients with CD and 23 controls. We generated epithelial colonoid cultures from controls and patients with CD, and measured levels of miR-31 in crypts. We performed genome-wide microRNA profile analyses of formalin-fixed paraffin-embedded colon and ileum biopsies from 76 treatment-naïve pediatric patients with CD and 51 controls and collected data on disease features and outcomes.
Results In comparing miRNA expression profiles between 9 patients with colon-like CD and 9 patients with ileum-like CD, we identified 19 miRNAs with significant differences in levels. We observed a 13.5-fold difference in level of miR-31-5p between tissues from patients with colon-like vs ileum-like CD (Padj = 1.43 × 10⁻¹⁸). Principal component analysis found miR-31 to be the top contributor to the variance observed. Levels of miR-31 were increased 60-fold in tissues from patients with ileum-like CD compared with controls (Padj = 2.59 × 10⁻⁵¹). We validated the differential expression of miR-31 between the subtypes in the independent set of tissues. Colonoids derived from patients with CD had significantly higher levels of miR-31 than colonoids derived from control tissues (day 2 P = .041 and day 6 P = .0095). Levels of miR-31 were significantly increased in colon tissues from pediatric patients with CD compared with controls (~7.8-fold, P = 4.64 × 10⁻⁷) and in ileum tissues from patients with CD vs controls (~1.5-fold, P = 9.97 × 10⁻⁷). A low level of miR-31 in index biopsies from pediatric patients with only inflammation and no other complications at time of diagnosis was associated with development of fibrostenotic ileal CD.
Conclusions We identified differences in miR-31 levels in colon tissues from adult and pediatric patients with CD compared with controls, and in patients with ileum-like CD compared with colon-like CD. Further studies are needed to determine the mechanisms by which miR-31 might contribute to pathogenesis of this subtype of CD, or affect response to therapy.
Abbreviations used in this paper
- miRNA: microRNA
- qRT-PCR: quantitative reverse transcriptase PCR
- IEC: intestinal epithelial cell
- RPMMM: reads per million mapped to microRNAs
- FFPE: formalin-fixed paraffin-embedded
Author Contributions: BPK acquired, analyzed and interpreted data, prepared figures, drafted and revised the manuscript. JBB acquired, analyzed and interpreted data and revised the manuscript. NK acquired, analyzed and interpreted data; GRG, MSS, MH, SSS, OKT, PAC, TT, and NA acquired data; and WAP and MK analyzed data. NDS, EAB, NS, TSS, MJK, DGT and FS provided help with tissue acquisition and patient phenotyping. TSF & PS designed the study, analyzed and interpreted the data, drafted and revised the manuscript, and obtained funding. SZS conceptualized and designed the study, acquired the data, interpreted data, drafted and revised the manuscript, obtained funding, acted as study sponsor, and supervised the study. All authors uphold the integrity of the work, have had final approval of the manuscript in its entirety, and are accountable for all aspects of the work.
Introduction
Crohn’s disease (CD), one of the primary inflammatory bowel diseases (IBD), is a chronic inflammatory condition of the gastrointestinal tract resulting from an aberrant immune response to the enteric microbiota in a genetically susceptible host. CD is highly heterogeneous in disease location, behavior, and progression. Using gene expression and chromatin accessibility profiles in colon tissue, we previously identified two molecular subtypes in adult CD associated with unique phenotypes1. Recent studies validate the premise that specific genetic and molecular profiles are associated with, and may contribute to, disease heterogeneity and behavior. Over 200 genetic loci have been significantly associated with CD risk2. A study of 29,838 adult individuals did not identify DNA variants predictive of CD behavior over time, but did associate genetic variants in IBD with disease location3. Notably, a longitudinal inception cohort study of treatment-naïve pediatric CD patients revealed lipid metabolism and extracellular matrix gene expression signatures in the ileum as predictive of response to steroids and fibrostenotic ileal CD, respectively4,5. However, a more complete set of robust prognostic determinants for CD phenotypes, especially incorporating non-coding RNAs, is still lacking. As such, there remains active, substantive interest in the CD research community to identify specific genetic and molecular factors that mark disease subtypes, and more importantly, inform on disease progression and outcome.
Distinct disease outcomes of CD are likely due in large part to variability in cellular processes that underlie the natural history of CD. Disruption of the intestinal epithelial barrier and loss of tolerance by immune cells to the enteric microbiota are critical cellular events that lead to chronic inflammation seen in CD. Precise cell type-specific mechanisms leading to these dysfunctions are poorly understood. Recently, microRNAs (miRNAs) that confer post-transcriptional regulation of gene expression have emerged as key modulators of intestinal epithelial cell (IEC) biology6,7 and of pathways that underlie the pathogenesis of CD8,9. Mice deficient for miRNAs in the intestinal epithelium exhibit altered intestinal architecture and increased barrier permeability6, which leads to immune cell infiltration and severe intestinal inflammation.
In this study, we identified miRNA-31 (miR-31) as the primary contributor to our previously identified two major molecular subtypes of adult CD patients. We determined that the dramatic upregulation of miR-31 in colonic tissue of CD patients is driven in large part by increased expression specifically in IECs.
Importantly, we expanded our study to incorporate a large cohort of 234 formalin-fixed paraffin embedded (FFPE) index biopsies of colon and ileum tissue from 127 treatment-naïve pediatric patients and non-IBD (NIBD) controls. We found that a high level of colonic miR-31 expression in index biopsies is strongly associated with the presence of rectal inflammation, while lower, more typical colonic miR-31 expression in index biopsies is associated with progression to fibrostenotic ileal disease. Our study shows that miR-31 is a candidate prognostic determinant of CD behavior and highlights the potential role of miR-31 in the pathobiology of CD.
Materials and Methods
Patient populations and phenotyping
Adult and pediatric patients with CD and NIBD related illnesses diagnosed at The University of North Carolina hospitals (UNC) were included in this study. Both the adult and pediatric sections of this study received Institutional Review Board approval at UNC (protocol 10-0355 and 15-0024). Clinical phenotypes considered in this study include demographic and clinical variables such as age, sex, disease duration, age at diagnosis, age at sample acquisition, disease location, and disease behavior. Summarized (Supplementary Table 1) and detailed information of patient demographics and phenotypes for the adult (Supplementary Table 2) and pediatric (Supplementary Table 3) cohorts are provided. This study was not blinded, and all authors had access to the study data and reviewed and approved the final manuscript.
Tissue isolation and characterization
For our adult cohort, all CD and NIBD mucosal biopsies were obtained from macroscopically unaffected sections of the ascending colon at the time of surgery and flash-frozen. No samples showed signs of active microscopic inflammation or disease, as confirmed by an independent pathologist (DGT). Treatment-naïve pediatric patients were diagnosed at UNC. From formalin-fixed, paraffin-embedded (FFPE) tissue, mucosal sections from both macroscopically and microscopically non-inflamed sections of the ascending colon and terminal ileum from the time of initial diagnosis (index biopsy) were identified by a pathologist (DGT), and scrolls were obtained for small RNA isolation. Absence of acute (active) inflammation, including neutrophilic inflammation of crypt epithelium and crypt abscess formation, and chronic inflammation, including architectural distortion and basal lymphoplasmacytosis of the lamina propria, was determined after review of each H&E stained slide (Supplementary Figure 1).
RNA isolation, sequencing, and analysis
RNA was isolated from flash-frozen adult samples from surgical resections using the Qiagen RNeasy Mini Kit (Valencia, CA) following the manufacturer’s protocol. This kit uses column-based DNase treatment to eliminate DNA contamination, and allows the miRNA and mRNA content to be preserved. miRNA was enriched from FFPE tissue for pediatric samples using the Roche High Pure miRNA Isolation Kit (Penzberg, Germany). RNA purity and integrity were assessed with Thermo Scientific NanoDrop 2000 (Waltham, MA) and Agilent 2100 Bioanalyzer (Santa Clara, CA), respectively. For all clinical categories of flash frozen adult samples, we observed average RNA integrity (RIN) values above 7.
RNA-seq libraries were prepared using the Illumina TruSeq polyA+ Sample Prep Kit. Paired-end (50 bp) sequencing was performed on the Illumina HiSeq 2500 platform (GEO accession GSE85499). Reads were aligned to the GRCh38 genome assembly using STAR10 with default parameters. Transcript expression was quantified with Salmon11 using default parameters. Post-alignment normalization and differential analysis was performed using DESeq212 with GENCODE_V25 gene annotations requiring base mean expression >10 and an FDR <0.05.
Small RNA libraries were generated using Illumina TruSeq Small RNA Sample Preparation Kit (San Diego, CA). Single-end (50 bp) sequencing was performed on the Illumina HiSeq 2500 platform (GEO accession GSE101819). miRquant 2.013 was used for miRNA annotation and quantification. Samples with less than 3 million reads mapping to miRNAs were excluded. Differential analysis was performed using DESeq212.
PCA was performed using the prcomp function in R on DESeq2 normalized VST transformed counts for mRNAs (“protein_coding” in GENCODE_V25) and lncRNAs (“lincRNA” or “antisense” in GENCODE_V25) with an expression base mean > 10. For miRNA expression data, PCA was performed using reads per million miRNAs mapped (RPMMM) normalized log2 transformed counts for the 100 miRNAs with the highest standard deviation values across all samples and a normalized expression level of 500 RPMMM across at least 20% of samples. For pediatric samples, we eliminated 18 miRNAs not found in the adult samples to remove potential artifacts due to FFPE preservation. Candidate master regulator miRNAs were detected using miRHub14,15, using “non-network” mode and requiring a predicted target site to be conserved between human and at least two other species.
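The miRNA feature selection that precedes PCA can be sketched as follows. This is a minimal illustration in Python, not the R/miRquant workflow used in the study. The function names, the pseudocount of 1 added before the log2 transform, and the reading of the expression criterion as "at least 500 RPMMM in at least 20% of samples" are assumptions made for the example rather than details confirmed by the text.

```python
import math
from statistics import stdev


def to_rpmmm(counts):
    """Scale one sample's raw miRNA read counts to reads per million mapped to miRNAs."""
    total = sum(counts.values())
    return {mir: 1e6 * c / total for mir, c in counts.items()}


def select_variable_mirnas(samples, top_n=100, min_rpmmm=500.0, min_fraction=0.2):
    """Return up to top_n miRNA names ranked by SD of log2(RPMMM + 1).

    samples: list of dicts (one per sample, at least two samples),
             each mapping miRNA name -> RPMMM value.
    A miRNA is kept only if it reaches min_rpmmm in at least
    min_fraction of the samples (assumed reading of the filter).
    """
    names = set().union(*(s.keys() for s in samples))
    ranked = []
    for mir in names:
        values = [s.get(mir, 0.0) for s in samples]
        n_expressed = sum(v >= min_rpmmm for v in values)
        if n_expressed / len(samples) < min_fraction:
            continue  # too lowly expressed across the cohort
        logged = [math.log2(v + 1.0) for v in values]  # pseudocount is an assumption
        ranked.append((stdev(logged), mir))
    # Highest SD first; ties broken by name so the result is deterministic.
    ranked.sort(key=lambda item: (-item[0], item[1]))
    return [mir for _, mir in ranked[:top_n]]
```

The selected miRNAs would then form the columns of the matrix passed to a PCA routine (prcomp in the study).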
Quantitative reverse transcriptase PCR
For miR-31, total RNA was isolated from tissues using Norgen’s Total RNA Purification Kit (Thorold, ON, Canada). A total of 50 ng of RNA was used for reverse transcription with the Life Technologies TaqMan MicroRNA Reverse Transcription Kit (Grand Island, NY). miRNA qRT-PCR was performed using the TaqMan Universal PCR Master Mix per Life Technologies’ protocol, on a Bio-Rad Laboratories CFX96 Touch Real Time PCR Detection System (Richmond, CA). Reactions were performed in triplicate using RNU48 as the normalizer. For APOA1 and CEACAM7, total RNA was isolated as described above. cDNA was derived from 1 μg RNA by reverse transcription using the Bio-Rad iScript cDNA Synthesis kit. RT-qPCR was then performed on these cDNA samples using the BioLine Hi-ROX SYBR kit.
Lamina propria mononuclear cells (LPMCs) and IECs were isolated from intestinal specimens using modifications of previously described techniques16. LPMCs were isolated from human colon by an enzymatic method, followed by Percoll (GE Healthcare, Piscataway, NJ) density-gradient centrifugation. LPMCs were further separated into CD33+CD14+ peripheral macrophages, CD33+CD14− intestinal resident macrophages, CD20+ B cells, and CD3+ T cells using corresponding antibody-labeled microbeads (Miltenyi Biotec, Auburn, CA). IECs were isolated from human colon mucosa using ethylenediaminetetraacetic acid (EDTA) followed by magnetic bead sorting via CD326-labeled microbeads. Purity was >90% by flow cytometric analysis (Supplementary Figure 2).
Colonoid generation and analysis
Epithelial colonoid cultures were generated from non-inflamed regions of colon tissue from NIBD controls and CD patients. The intestinal tissues were washed and mucosectomy was performed with surgical scissors. Minced colonic mucosal fragments were incubated at 37°C in 5 mL of digestion media (1 mg/mL collagenase VIII in Advanced Dulbecco’s modified Eagle medium/F12 (ADF), 10% FBS, 15 mM HEPES buffer, penicillin/streptomycin, 2 mM Glutamax, 100 μg/mL Primocin (Invivogen, antibiotic/antimycotic), 10 μM Y-27632) for 30 minutes with mechanical disruption. The digested tissue/crypts were centrifuged at 200g for 5 minutes to separate crypts from single cells. Pelleted colonic crypts were resuspended in 5 mL of digestion media and centrifuged again at 200g for 5 minutes. The volume of crypts needed for 40-50 crypts per well of a 96-well plate was centrifuged in 1.5 mL tubes at 2500 RPM for 5 minutes. Crypts were embedded in an appropriate volume of Growth Factor Reduced Matrigel (Corning) on ice and seeded at 10 μL per well of a 96-well plate. Basal stem culture medium (50% WNT3a conditioned media, 50% R-spondin 2 conditioned media, supplemented with 1 mM HEPES, 2 mM Glutamax, 1X N2, 1X B27, and 1 mM N-acetylcysteine, 100 μg/mL Primocin, with growth factors 50 ng/mL murine EGF, 100 ng/mL murine noggin, 1 μg/mL gastrin, 0.01 μM PGE2, 10 mM nicotinamide, and small molecule inhibitors 500 nM LY2157299, 10 μM SB202190) with 10 μM Y-27632 was added at 100 μL per well. At selected timepoints, colonoids embedded in Matrigel were lifted from wells with cold ADF. For miRNA analysis on days 2 and 6, reverse transcription and quantitative real-time PCR for miR-31 and RNU48 (housekeeping) were performed using predesigned TaqMan miRNA assays (Life Technologies). The relative expression was calculated by the comparative CT method and normalized to the expression of RNU48.
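The comparative CT calculation mentioned above is commonly written as the 2^−ΔΔCT formula. The sketch below illustrates that arithmetic under the usual assumption of roughly 100% amplification efficiency for both assays. The function names and the CT values in the usage note are hypothetical and are not data from this study.

```python
from statistics import mean


def delta_ct(target_ct, reference_ct):
    """Delta-CT: mean CT of the target (e.g. miR-31) minus mean CT of the
    reference (e.g. RNU48), each averaged over technical replicates."""
    return mean(target_ct) - mean(reference_ct)


def relative_expression(sample_dct, calibrator_dct):
    """Fold change of a sample relative to a calibrator: 2 ** -(ddCT),
    where ddCT = sample delta-CT minus calibrator delta-CT.
    Assumes about 100% amplification efficiency."""
    return 2.0 ** -(sample_dct - calibrator_dct)
```

As an illustration with made-up triplicates, a sample with miR-31 CTs of 24.0, 24.5, and 23.5 and RNU48 CTs of 20.0 has a ΔCT of 4.0; against a calibrator with a ΔCT of 7.0, `relative_expression(4.0, 7.0)` gives an 8-fold higher relative level.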
Results
Cluster adult CD patients based on microRNA and lncRNA expression profiles
Previously, we demonstrated that medically refractory Crohn’s disease (CD) patients undergoing surgery clustered into two distinct groups using principal component analysis (PCA) of mRNA expression by RNA-seq on uninflamed colonic mucosa from 21 adult patients with CD and 11 adult control patients (NIBD)1. Analysis of genes differentially expressed between these two groups revealed that genes more highly expressed in the colon of one group were enriched for previously identified NIBD colonic marker genes, while genes more highly expressed in the second group were enriched for normal ileum marker genes. We labeled these groups colon-like (CL) and ileum-like (IL). We showed by a prospective analysis that these CL and IL CD subgroups exhibit rectal CD and ileal inflammation, respectively. To evaluate further whether this molecular stratification was evident within non-coding RNAs, we analyzed small RNA-seq data from most of the same CD and NIBD patients (18 CD, 12 NIBD) to quantify the expression of microRNAs (miRNAs), which we previously showed was able to distinguish CD patients from NIBD controls17. We also re-interrogated the RNA-seq data from the same patients to quantify long, non-coding RNAs (lncRNAs). PCA on each of the miRNA and lncRNA datasets (Figure 1A and 1B; Supplementary Tables 4 and 5) revealed that CD samples clustered into the same distinct CL and IL groups as initially defined with the mRNA data, which we also recapitulated in this study using updated gene annotations (Supplementary Figure 3 and Supplementary Table 6). These data demonstrate that the CL and IL CD subtypes are defined by expression profiles of several types of RNA molecules, which perform diverse functions within the cell.
Identify and investigate key miRNA drivers in adult CD subtype stratification
To identify the miRNAs that contribute most to the stratification of the two molecular CD subtypes, we compared miRNA expression profiles between the 9 CL and 9 IL CD patients. We found that 19 miRNAs were significantly differentially expressed between the two groups (|log2(FC)| > 1, FDR < 0.05). Strikingly, we observed a 13.5-fold change in miR-31-5p (miR-31; Padj = 1.43 × 10⁻¹⁸) between CL and IL samples. Analysis of PCA components revealed that miR-31 is the top contributor to the variance observed for principal component 2 (PC2), which separates the CL and IL patients (Supplementary Table 4). These findings suggest that miR-31 expression can stratify CD into two major molecular subtypes (Figure 2A).
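The significance criteria above combine an effect-size cutoff on the log2 scale with an FDR cutoff. A minimal sketch of that filter follows. The `MirnaResult` record and the helper names are hypothetical and do not reflect the structure of DESeq2 output. The only values taken from the text are the 13.5-fold change and adjusted P value reported for miR-31-5p.

```python
import math
from typing import NamedTuple


class MirnaResult(NamedTuple):
    """Hypothetical container for one miRNA's differential-expression result."""
    name: str
    log2_fold_change: float
    padj: float


def is_significant(result, min_abs_log2fc=1.0, max_fdr=0.05):
    """Apply the |log2(FC)| > 1 and FDR < 0.05 criteria stated in the text."""
    return abs(result.log2_fold_change) > min_abs_log2fc and result.padj < max_fdr


def fold_change(log2_fc):
    """Convert a log2 fold change to a linear fold-change magnitude."""
    return 2.0 ** abs(log2_fc)


# miR-31-5p as reported: 13.5-fold, i.e. log2(13.5), roughly 3.75 on the log2 scale.
mir31 = MirnaResult("miR-31-5p", math.log2(13.5), 1.43e-18)
```

A cutoff of |log2(FC)| > 1 corresponds to more than a 2-fold difference in either direction, so the reported miR-31 change sits well beyond the threshold.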
We and others have identified miR-31 as a discriminant more generally of CD and NIBD patients18,19. We hypothesized that this difference is driven primarily by CD patients in the IL group. To test this hypothesis, we compared the levels of miR-31 in each of the IL and CL groups relative to NIBD. We observed a dramatic and highly significant up-regulation (~60-fold) of miR-31 in IL patients compared with NIBD patients (Padj = 2.59 × 10⁻⁵¹; Figure 2B). We also detected a significant difference in expression between CL and NIBD (~4-fold, Padj = 7.66 × 10⁻⁶; Figure 2B); however, the magnitude of the difference is much lower. These findings support the above-stated hypothesis, indicating that while miR-31 is a strong marker of disease presence in all CD patients, this signal is driven predominantly by those patients of the IL subtype.
Expression levels of mature miRNAs can be altered in several ways, including changes to the rate of transcription, efficacy of the maturation (biogenesis) process, and RNA stability. To determine whether the miR-31 locus is subject to enhanced transcription in IL CD patients, we quantified the normalized density of RNA-seq reads mapping to the primary transcript of miR-31 (MIR31HG) across all samples. We observed that transcription levels of MIR31HG are indeed dramatically elevated in the IL subgroup relative to both the CL subgroup and NIBD patients (Figure 2C). These data indicate that increased level of transcription is one major contributor to the observed difference in miR-31 levels between IL patients and the NIBD and CL patients. Notably, RNA-seq data from the ileum of an NIBD patient (Figure 2C) revealed a signal at the MIR31HG locus that closely resembles the signal from the colon of IL CD patients.
MiRNAs regulate gene expression by binding to recognition elements in the 3’ untranslated regions of target mRNAs and marking the mRNAs for translational repression and degradation20. Therefore, we sought to determine, using our published tool miRhub14, whether genes that are down-regulated in IL relative to NIBD are enriched for predicted target sites of miR-31 or any other miRNA shown to be upregulated in the colon of IL patients. Notably, we found that miR-31 is the only upregulated miRNA whose target genes are significantly enriched among the genes downregulated in IL patients compared to both CL and NIBD patients (empirical P < 0.05; Supplementary Figure 4). This indicates that miR-31 is not only dramatically elevated in the IL subtype of CD, but also a candidate master regulator of genes that are downregulated in that subtype.
To validate the differential expression of miR-31 between the IL and CL subtypes of CD, we measured miR-31 levels in an independent cohort of 32 adult CD and 23 NIBD patients using qRT-PCR. We first recapitulated the finding that miR-31 levels are significantly up-regulated overall in CD relative to NIBD (P = 2.08 × 10⁻⁷, 2-tailed unpaired Student’s t test; Figure 3A). As expected, we also found that miR-31 expression levels stratify CD patients into two subgroups, “high” and “low”, which we hypothesized reflect the IL and CL molecular subtypes, respectively. To test this hypothesis, we measured mRNA levels of APOA1 (Figure 3B), a marker gene in ileum, and CEACAM7 (Figure 3C), a marker gene in colon, both of which we previously showed can stratify IL and CL patients21,22. We found that the patients with high colonic miR-31 expression also show high APOA1 expression and low CEACAM7 expression, and we observed the opposite trend for the patients with low colonic miR-31 expression. Altogether, these data confirm the finding that miR-31 expression levels stratify CD patients into two molecular subgroups. It is important to note that while APOA1 and CEACAM7 are valuable in stratifying CD patients into the two main subtypes, their expression levels in patients from the CL subgroup are similar to those of NIBD individuals, unlike miR-31, for which we observe significant differential expression even between CL and NIBD. This indicates that miR-31 uniquely distinguishes both presence of CD and molecular subtypes of CD.
Measure miR-31 expression in distinct colonic cell types and crypt derived colonoids established from adult patients with CD
Colon tissue is composed of several distinct cell types, and expression studies in tissue do not reveal from which particular cells transcripts originated. To measure miR-31 expression in specialized cell types of the colon, we isolated intestinal epithelial cells (IECs; CD326+) and matched lamina propria immune cells (CD3+ T cells, CD20+ B cells, CD33+CD14− resident intestinal macrophages, CD33+CD14+ infiltrating inflammatory intestinal macrophages) by flow cytometry (Supplementary Figure 2) from macroscopically uninflamed tissue from adult patients with CD (N=11-20) and NIBD controls (N=8-16). While relative miR-31 expression levels based on qRT-PCR were increased in B cells and resident macrophages isolated from CD patients compared to NIBD controls (P < 0.05, 2-tailed unpaired Student’s t test), these results were dwarfed in comparison to the increase seen in IECs (~52-fold difference, P = 1.28 × 10⁻⁸; Figure 3D).
To evaluate this finding further, we established three-dimensional epithelial colonoids from crypts isolated from both CD patients and NIBD individuals. These structures contain crypt-like domains reminiscent of the gut epithelium, and they continuously produce all cell types found normally within the intestinal epithelium23 (Figure 4A and 4B). We found that colonoids from CD patients express significantly higher levels of miR-31 compared to NIBD controls, similar to the primary tissue from which the colonoids were derived (Day 2 P = 0.041, 2-tailed unpaired Student’s t test; Day 6 P = 0.0095, 2-tailed unpaired Student’s t test; Figure 3E). These results suggest upregulated miR-31-5p is not a transient result due to external signalling but is a predisposing factor in IECs of CD patients. Disruption of the intestinal epithelial barrier is a critical determinant of the predisposition to chronic inflammation and fibrosis seen in CD. Going forward these data open up the potential to understand the impact of miR-31 on barrier function.
Measure miR-31 expression in formalin-fixed paraffin-embedded (FFPE) tissue from treatment-naïve pediatric CD samples
The molecular profiles we have generated and analyzed in fresh tissue and cells from adult CD represent a fundamental advance in understanding adult CD heterogeneity. At the time of this analysis, though, these adult patients had progressed to medically refractory disease, each with individual treatment histories that could potentially confound results. Therefore, as a next step, we performed smRNA-seq on microscopically uninflamed FFPE mucosal tissue from ascending colon and terminal ileal biopsies in age-matched treatment-naïve pediatric patients with CD (n=76) and NIBD controls (n=51) obtained at the time of diagnosis (index colonoscopies). It is important to note that this is not a validation cohort of the adult CD, but rather a completely independent analysis that offers at least five unique advantages. Firstly, as noted above, these samples are from treatment-naïve individuals, which greatly mitigates the potential confounding effects of treatment history that may be present in adults. Secondly, the samples are FFPE as opposed to fresh frozen tissue. Successful molecular subtyping of CD patients using FFPE tissue will greatly expand our ability in the future to analyze retrospectively the clinical characteristics associated with subtypes, given that most tissue biopsies are bioarchived as FFPE. Thirdly, the number of samples is substantially greater than in our adult CD study, affording additional power for molecular subtyping.
Fourthly, we have matched ileum and colon biopsies from the same patient allowing for the interrogation of site specific changes and impact on disease phenotype. Finally, these tissue samples are index biopsies, obtained at the time of diagnosis and prior to significant disease progression, which provides a unique opportunity to determine whether miR-31 expression is associated with the development of CD phenotypes.
As in the adult cohort, we found that the levels of miR-31 expression in the colon are significantly upregulated in CD patients relative to NIBD controls (~7.8-fold, P = 4.64 × 10⁻⁷, 2-tailed unpaired Student’s t test; Figure 4A and Supplementary Figure 5). We observed that miR-31 expression in the ileum is also significantly upregulated in CD patients (~1.5-fold, P = 9.97 × 10⁻⁷); however, the effect is not nearly as pronounced as in the colon (Figure 4B). This may be due in part to significantly higher baseline miR-31 expression levels in the ileum of unaffected (NIBD) individuals than in the colon (P = 5.71 × 10⁻²⁸; Supplementary Figure 6 and Supplementary Table 7).
Using miRNA expression data from the 100 most variable miRNAs, we independently performed PCA on the colon (Figure 4C) and ileum (Figure 4D) pediatric samples and observed a robust separation of NIBD and CD patients. Notably, miR-31 is the largest contributor to this stratification in the colon, but not in the ileum (Supplementary Tables 8 and 9). This indicates that specifically colonic miR-31 is a primary marker of disease presence.
Test for colonic miR-31 expression in index biopsies with the development of specific CD phenotypes
We investigated whether colonic miR-31 levels were associated with the eventual development of specific CD phenotypes, and tested for association with clinical features both at the time of diagnosis and across disease course (Table 1). We first analyzed pediatric NIBD samples and found that all colon samples but one had miR-31 levels < 150 RPMMM and all ileum samples had miR-31 levels > 150 RPMMM (Supplementary Figure 7). Using this threshold, we defined two distinct subgroups within our colonic pediatric CD samples as “miR-31-low” (n = 46) and “miR-31-high” (n = 30). MiR-31 expression was validated in our two subgroups through qRT-PCR of a subset of low-(n = 7) and high-miR-31 (n = 7) samples (r = 0.94, P = 3.89 × 10⁻⁷; Supplementary Figure 8).
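The threshold-based grouping described above amounts to a simple cutoff on colonic miR-31 RPMMM. A minimal sketch follows. The function names and patient identifiers are hypothetical. The text does not say how a value of exactly 150 RPMMM is handled, so assigning it to the low group here is an assumption.

```python
MIR31_THRESHOLD_RPMMM = 150.0  # cutoff derived from pediatric NIBD samples in the text


def classify_mir31(rpmmm, threshold=MIR31_THRESHOLD_RPMMM):
    """Label a colonic sample by its miR-31 level.
    Values exactly at the threshold are treated as low (assumption)."""
    return "miR-31-high" if rpmmm > threshold else "miR-31-low"


def split_by_mir31(levels):
    """Group patient IDs by miR-31 class.

    levels: dict mapping patient ID -> colonic miR-31 level in RPMMM.
    """
    groups = {"miR-31-low": [], "miR-31-high": []}
    for patient_id, rpmmm in levels.items():
        groups[classify_mir31(rpmmm)].append(patient_id)
    return groups
```

For example, `split_by_mir31({"P1": 40.0, "P2": 900.0, "P3": 120.0})` places the hypothetical patients P1 and P3 in the low group and P2 in the high group.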
We then studied prospectively the clinical characteristics of only pediatric patients that presented with inflammation at time of diagnosis (i.e., no initial stricturing or penetrating disease). Since all patients were treatment naïve, we defined stricturing CD as primary (not anastomotic) fibrostenotic stricture of the terminal ileum where medical treatment would be ineffective, and therefore, surgical resection was considered a reasonable treatment option. Strictures were diagnosed, according to physician preference, by standard endoscopy and/or computed tomography enterography (CTE) or magnetic resonance imaging (MRI), in correlation with patient symptoms. Low miR-31 expression is significantly associated with the eventual development of ileal stricturing (P = 0.001) and having surgery involving an anastomosis (P = 0.048). Remarkably, we found that no miR-31-high patients progressed to develop a stricturing phenotype. To our knowledge, these data provide the first evidence for the potential clinical utility of miRNA profiling to predict increased risk of the development of a stricturing phenotype in patients with Crohn’s disease. Lastly, we assessed disease location in all patients and found that significantly more miR-31-high patients exhibited rectal inflammation (P = 0.019).
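The text reports P values for the association between miR-31 group and clinical outcome but does not name the statistical test or give the underlying counts here. One common choice for a 2×2 table of group versus outcome is Fisher's exact test. The sketch below implements its two-sided form as an illustration of that assumption only and is not a reconstruction of the authors' analysis. The function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Rows could be miR-31-low / miR-31-high and columns
    outcome present / outcome absent (illustrative layout).
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )
```

With purely illustrative counts, `fisher_exact_two_sided(3, 1, 1, 3)` evaluates to 34/70, about 0.49. Applying the function to the study would require the counts from Table 1.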
Discussion
We identify colonic miR-31 expression as central to clinically-relevant molecular subtypes found in independent cohorts of adult and treatment-naïve pediatric patients. Notably, low levels of miR-31 in pediatric patients at the time of diagnosis are indicative of increased risk for development of ileal stricturing complications. Our study introduces small RNAs as potential predictors of disease phenotype and, with use of FFPE samples, offers distinct advantages over mRNA studies in the context of fresh tissue. These findings are reminiscent of early descriptions of transcriptomic signatures in breast cancer24. Further large-scale studies of gene expression profiles in breast tumors, including those of The Cancer Genome Atlas (TCGA) project25, eventually established four major molecular classes that vary in their aggressiveness and respond differently to therapies. Similarly, diffuse large B cell lymphoma26, glioblastoma27, endometrial cancer28, and lung cancer subtypes29 have been identified by genomic profiling, facilitating the development and application of targeted therapies (https://cancergenome.nih.gov).
Interestingly, in our original adult cohort, 8 of 10 IL CD patients (high miR-31 levels) showed stricturing and only 4 of 11 CL CD patients (low miR-31 levels) had this complication1, which is the opposite direction of association observed in the pediatric cohort. We also found that rectal involvement is strongly associated with opposite subtypes in the adult and pediatric cohorts; specifically, high miR-31 expression in pediatric patients is associated with rectal inflammation, whereas in adults it is low miR-31 expression that is associated with the presence of medically refractory rectal inflammation and need for a total colectomy. These results raise three important possibilities. First, pediatric and adult CD patients may represent two independent cohorts with distinct clinical disease implications of molecular miR-31 expression. If this is the case, we would expect to continue to find molecular characteristics unique to each of these populations that explain their distinct Crohn’s diseases. Alternatively, we might postulate that since molecular assays were performed at an early stage of disease development in pediatric patients, but at an advanced stage in the adult population, the differences we see may reflect aspects of disease evolution and treatment. This raises the possibility that the molecular programming of intestinal cells evolves over the course of the disease to reflect the changing intestinal environment. Thus, molecular levels at initial diagnosis that predict disease progression may not be maintained once disease has actually progressed30. Lastly, our study and other similar next generation sequencing studies highlight that genomic capture is a snapshot in time of a very dynamic human gene regulatory system.
To address these questions, long-term longitudinal studies will need to be conducted with serial quantification of genomic profiles over the course of disease evolution in pediatric patients transitioning into adulthood. It is also imperative that we understand the unique characteristics of disease presentation and evolution in adult patients, which are impacted by major lifestyle and environmental factors, each uniquely contributing to colonic miR-31 regulation and its impact on phenotype. Medical management of both fibrostenotic ileal disease and rectal disease is unpredictable and in many cases is not long lasting. Thus, results from these longitudinal studies may eventually impact treatment designs for these difficult disease phenotypes. The robust establishment of CD subtypes may also influence future design of clinical trials where subtypes can be considered during patient randomization, allowing for better evaluation of subtype identification when making therapeutic decisions.
Recent studies have started to unravel the molecular mechanisms associated with distinct disease phenotypes. Genetic variants in NOD2, MHC, and MST1 3p21 were shown to be associated with disease location (colonic CD, ileal CD and ulcerative colitis (UC)) but not disease behaviour3. However, the genetic contribution to CD pathogenesis has been shown to be disproportionate, ranging from most impactful in very early onset IBD31 (VEOIBD) to modest significance in older pediatric and adult IBD patients32-35. In rectal tissue from pediatric patients, expression patterns of IL-13, IL23A, and IL17 distinguished colonic CD from UC36. Also, a lipid metabolism-related gene expression signature in the ileum of pediatric CD patients accurately predicted 6-month steroid-free remission4. Follow-up studies of these same ileal samples showed a distinct collagen and extracellular matrix gene expression signature present at time of diagnosis in a subset of patients who developed fibrostenotic ileal disease5. Interestingly, our prior analysis of these patients identified an association between these same pathways in the ileum and the CL molecular phenotype1. Moving forward, the challenge is to define molecular subtypes while also uncovering the cell type-specific genetic, molecular, and environmental contributors to each subtype.
The current study, along with our previous study, has now shown that whole genome mRNA, miRNA, and lncRNA transcript levels, along with the open chromatin landscape, define two molecular adult CD subtypes. In addition, miRNA expression patterns can stratify pediatric CD. Together, these findings suggest that across CD patients, colonic tissue is altered in different ways at a cellular level, supporting the idea of multiple Crohn’s diseases. This also underscores the necessity for a more complete molecular characterization of CD across larger populations to uncover additional distinct subtypes. We advanced our work into FFPE tissue, which opens the possibility of increasing sample numbers, performing longitudinal follow-up studies, and facilitating the association of molecular markers with disease course.
We demonstrate miR-31 to be specifically dysregulated in colonic epithelial cells. Breakdown in the intestinal barrier is critical to chronic intestinal inflammation, a hallmark of CD. MiRNAs, including miR-31, are known to have significant contributions to gastrointestinal epithelial barrier function37. Dicer1-deficient mice display colonic barrier integrity dysfunction as evidenced by lymphocyte and neutrophil infiltration as well as mis-localization of the tight junction protein Claudin-76. In the esophagus, Hussey et al. found that miR-31 is one of only a few differentially expressed miRNAs in post-ablation epithelium with increased barrier permeability38,39. In the colon, Wu et al. postulate that lowly expressed miR-31 plays a protective role after hypothermic ischemia-induced barrier dysfunction, perhaps aiding in post-injury healing, specifically by targeting the hypoxia-inducible factor (HIF)-factor inhibiting HIF (FIH-1) pathway40. Using combinational computational methods to predict miR-31 target pathways, one group found a connection specifically between miR-31 and tight junctions in lung epithelium41. Most recently, Yu et al. demonstrated using in vivo knock-in and knock-out models that miR-31 plays a role in regulating intestinal stem cell behavior during regeneration after radiation injury42. We show that patient crypt-derived colonoids in a sterile environment retain the aberration in miR-31 expression present in the tissue of origin, which supports a cellular defect that is intrinsic and not secondary to inflammation or other external signals due to the presence of disease. The colonoid experimental system will enable future studies to interrogate the role(s) of specific factors in driving a rectal versus a fibrostenotic phenotype, especially in the context of coculture with lamina propria immune cells or mesenchymal cells, as well as stimulation with commensal and/or colitogenic bacteria.
In summary, we provide the most comprehensive molecular characterization of CD to date. We uncover miR-31 as an identifier of CD and, more importantly, as a molecular stratifier of both pediatric and adult patients, an indicator of established disease phenotype in adult patients, and a predictor of clinical phenotype at the time of diagnosis in pediatric patients. These findings represent significant progress in molecularly defining the Crohn’s disease(s), moving closer toward potential personalization of therapy and improved outcomes.
Footnotes
δ Co-corresponding authors
Disclosures: For all authors, there is no conflict of interest to disclose.
Transcript Profiling: Gene Expression Omnibus accession: GSE101819, GSE85499. dbGaP accession: phs001418.v1.p1
Grant Support: This work was supported by Crohn’s and Colitis Foundation (CCF) Career Development Award (SZS), R01-ES024983 from NIEHS (SZS and TSF), 1R01DK104828-01A1 from NIDDK (SZS and TSF), P01-DK094779-01A1 from NIDDK (SZS), P30-DK034987 from NIDDK (SZS), 1-16-ACE-47 ADA Pathway Award (PS), UNC Nutrition Obesity Research Center Pilot & Feasibility Grant P30DK056350 (PS), CCF PRO-KIIDS NETWORK (SZS and PS), UNC CGIBD T32 Training Grant from NIDDK (JBB), T32 Training Grant (5T32GM007092-42) from NIGMS (MH), SHARE from the Helmsley Trust (SZS). The UNC Translational Pathology Laboratory is supported in part by grants from the National Cancer Institute (3P30CA016086) and the UNC University Cancer Research Fund (UCRF) (PS).
Author names in bold designate shared co-first authorship
Author Contributions: BPK acquired, analyzed and interpreted data, prepared figures, drafted and revised the manuscript. JBB acquired, analyzed and interpreted data and revised the manuscript. NK acquired, analyzed and interpreted data; GRG, MSS, MH, SSS, OKT, PAC, TT, and NA acquired data; and WAP and MK analyzed data. NDS, EAB, NS, TSS, MJK, DGT and FS provided help with tissue acquisition and patient phenotyping. TSF & PS designed the study, analyzed and interpreted the data, drafted and revised the manuscript, and obtained funding. SZS conceptualized and designed the study, acquired the data, interpreted data, drafted and revised the manuscript, obtained funding, acted as study sponsor, and supervised the study. All authors uphold the integrity of the work, have had final approval of the manuscript in its entirety, and are accountable for all aspects of the work.