An epigenome-wide analysis of DNA methylation, racialized and economic inequities, and air pollution

Importance: DNA methylation (DNAm) provides a plausible mechanism by which adverse exposures become embodied and contribute to health inequities, due to its role in genome regulation and responsiveness to social and biophysical exposures tied to societal context. However, scant epigenome-wide association studies (EWAS) have included structural and lifecourse measures of exposure, especially in relation to structural discrimination. Objective: Our study tests the hypothesis that DNAm is a mechanism by which racial discrimination, economic adversity, and air pollution become biologically embodied. Design: A series of cross-sectional EWAS, conducted in My Body My Story (MBMS, biological specimens collected 2008–2010, DNAm assayed in 2021); and the Multi Ethnic Study of Atherosclerosis (MESA; biological specimens collected 2010–2012, DNAm assayed in 2012–2013); using new georeferenced social exposure data for both studies (generated in 2022). Setting: MBMS was recruited from four community health centers in Boston; MESA was recruited from four field sites in: Baltimore, MD; Forsyth County, NC; New York City, NY; and St. Paul, MN. Participants: Two population-based samples of US-born Black non-Hispanic (Black NH), white non-Hispanic (white NH), and Hispanic individuals (MBMS; n=224 Black NH and 69 white NH) and (MESA; n=229 Black NH, n=555 white NH and n=191 Hispanic). Exposures: Eight social exposures encompassing racial discrimination, economic adversity, and air pollution. Main outcome: Genome-wide changes in DNAm, as measured using the Illumina EPIC BeadChip (MBMS; using frozen blood spots) and Illumina 450k BeadChip (MESA; using purified monocytes). Our hypothesis was formulated after data collection. Results: We observed the strongest associations with traffic-related air pollution (measured via black carbon and nitrogen oxides exposure), with evidence from both studies suggesting that air pollution exposure may induce epigenetic changes related to inflammatory processes. We also found suggestive associations of DNAm variation with measures of structural racial discrimination (e.g., for Black NH participants, born in a Jim Crow state; adult exposure to racialized economic residential segregation) situated in genes with plausible links to effects on health. Conclusions and Relevance: Overall, this work suggests that DNAm is a biological mechanism through which structural racism and air pollution become embodied and may lead to health inequities.


Introduction
Recent advances enabling large population-based epigenetic studies are permitting researchers to test hypotheses linking socially-patterned exposures, gene regulation, and health inequities [1][2][3] .DNA methylation (DNAm) is a plausible biological mechanism by which adverse social exposures may become embodied 4,5 , because 1) it plays an active role in genome regulation [6][7][8] , 2) it changes in response to environmental exposures 1,9 and internal human physiology like ageing 2 and inflammation 10 , and 3) induced changes can be long-lasting [11][12][13][14] .There is a growing literature reporting associations between DNAm and environmental factors to which social groups are unequally exposed; a recent review 15 found associations between DNAm and measures of socioeconomic position (SEP), including income, education, occupation, and neighbourhood measures; and illustrated timing and duration of exposure is important.Exposure to toxins, including air pollution, is often inequitable between social groups 16,17   .A number of EWAS have identified associations with particulate matter [18][19][20][21] and oxides of nitrogen (NOx) 21,22   ; although there is little replication between studies, and some studies have failed to find effects of particulate matter 22,23   , NOx 23 , and residential proximity to roadways 24 .Two EWAS have each found two (non-overlapping) DNAm sites associated with experience of racial discrimination, one in first generation Ghanaian migrants living in Europe 3 , and one in African American women 25 ; given population and migration differences the lack of replication is perhaps not surprising.
However, no EWAS has yet examined associations between DNAm and exposure to racial discrimination and economic adversity, both at individual and structural levels, and measured at different points in the lifecourse, in the same group of people.This is important because it is not clear if the different timing, duration, and levels of these adverse exposures are embodied in different ways involving differing biological pathways.Supporting attention to these issues is a growing body of research documenting how exposure to health-affecting factors such as toxins, quality healthcare, education, fresh food, and green spaces are determined by the way dominant social groups have structured society, which in turn results in health inequities between dominant social groups and groups they have minoritized 4,26   .Structural racism (the totality of ways in which society discriminates against racialized groups 27 ), for example, results in people of colour often disproportionately bearing the burden of adverse exposures and economic hardship

Associations
have also been shown for diabetes outcomes in the US 34 and globally 35 .
Guided by the ecosocial theory of disease distribution 4,5 , we tested the hypothesis that DNAm is a biological mechanism by which embodiment of structural racial discrimination, economic hardship, and air pollution may occur.We tested our study hypothesis using data from US-born participants in two US population based studies with similar exposure data: our primary study, the My Body My Story study (MBMS), and the Multi-Ethnic Study of Atherosclerosis study (MESA), which we use for evidence triangulation 36 due to differences between the two study populations.

Social exposures
We tested the relationship between DNAm and eight variables relating to exposure to racial discrimination (both structural and self-reported), economic hardship, and air pollution; these are described in detail in Supplementary Table 2.

DNA methylation
For detailed description of DNA extraction and DNAm data generation, please see the Supplementary materials.Briefly, for MBMS DNA was extracted from frozen blood spots in 2021, and data were generated using the Illumina Infinium MethylationEPIC Beadchip.For MESA, DNA was extracted from purified monocytes in 2012-2013 and data were generated using the Illumina Infinium HumanMethylation450 BeadChip.We used DNAm beta values for both studies, which measure DNAm on a scale of 0 (0% methylation) to 1 (100% methylation).

EWAS
All EWAS were conducted using linear regression models implemented using the R package meffil 38 .
Many exposures had low levels of missing data, complete case numbers for each EWAS can be found in Table 3. EWAS details can be found in the Supplementary materials; briefly, we adjusted for age, reported gender (MBMS)/sex (MESA), smoking status, blood cell count proportions, and batch effects.

Sensitivity analysis
In MESA we conducted a sensitivity analysis to test whether our results were influenced by population stratification; details are in the Supplementary materials.Additional sensitivity analysis restricted the MESA analysis to participants recruited from the Baltimore and New York sites, because these cities bear the greatest similarity to the Boston area in terms of geographical location, city environment, and social histories.

Meta-analysis
We meta-analysed associations with air pollution within MBMS and within MESA because air pollution is the only exposure we tested that we would hypothesize to have the same meaning, and therefore biological effect, for all individuals.We used METAL 39 to meta-analyse effect sizes and standard errors of the EWAS summary statistics of each racialized group, for black carbon/LAC and NOx.

Functional relevance of sites passing the genome-wide threshold
For DNAm sites associated with an exposure, we used the UCSC genome browser to identify genomic regions.For sites within known genes, we used GeneCards (https://www.genecards.org/) and literature searches to identify putative gene functions.We used the EWAS Catalog to determine if associations between DNAm sites and other traits had been reported in previous studies.Where multiple DNAm sites were associated with an exposure we performed gene set enrichment analysis using the missMethyl R package 40 .

Biological enrichments of top sites
Following each EWAS we performed analyses to ascertain whether DNAm sites associated with our exposures indicate effects on particular biological pathways, processes or functions.Details are in the Supplementary materials; briefly, we conducted gene set enrichment analyses, and for enrichments of tissue-specific chromatin states, genomic regions and transcription factor binding sites (TFBS).

Lookup of associations in a priori specified genomic locations
We hypothesised a priori that our EWAS would detect DNAm sites that have been robustly associated with our study exposures, or factors that might relate to our exposures, in previous studies.See Supplementary materials for details.

Participant characteristics
Both cohorts include racialized groups that are underrepresented in epigenetic studies.Beyond this, substantial differences existed between the racialized groups within and across MBMS and MESA.
Overall, MBMS participants were on average 21 years younger than MESA participants, had less variability in exposure to air pollution, and far more were current smokers.In both studies, Black NH compared to white NH participants had higher BMI, rates of smoking, impoverishment, lower 2 Participants' ratio of household income in 2010 dollars to the US 2010 poverty line given household composition.
3 Census tract measure of economic and racialized segregation, scored from -1 to 1 4 NOx measurements were used to construct a weighted score of roadway pollution 5 Validated self-report questionnaire measuring the number of domains of exposure to racial discrimination.Score range 0-9, categorised into 0, 1-2, 3 6 Validated self-report questionnaire measuring the number of domains of exposure to racial discrimination; combined with the attribution aspect from EDS (everyday discrimination scale) to enable comparability between EOD and MDS.Score range 0-5, categorised into 0, 1-2, 3+

EWAS results and biological interpretation
In MBMS, among the Black NH participants one DNAm site, in ZNF286B, was associated with being born in a Jim Crow state.Another DNAm site, PLXND1, was associated with participants having less than high school education.Among white NH participants, no associations passed the genome-wide threshold.See Table 2 for details of gene functions; Table 3  FOS is an early-response gene, and is a subunit of the AP-1 transcription factor complex, which regulates gene expression involved in lung injury, repair and transformation 45 , as well as regulating many cytokine genes and T-cell differentiation 46,47    2 See text; these 53 sites were driven by air pollution differences between individuals born and not born in a Jim Crow state. 3The two EWAS for parental education were not run for Hispanic 185 participants in MESA, due to small cell numbers. 4The EWAS was not run for MDS (score of 1-2 vs 3+) for white NH participants in MESA, as no participants had a score of 3 or more. 5 Table 4: Summary of EWAS results for MESA subgroup analysis. 1The EWAS for Jim Crow birth state was not run for Hispanic participants due to small cell numbers. 2 The EWAS for parental and participant education were not run for Hispanic participants, due to small cell numbers. 3The EWAS was not run for MDS (score of 1-2 vs 3+) for white NH and Hispanic participants in MESA, as no participants had a score of 3 or more.

Meta-analysis
Meta-analysis in MBMS did not yield any sites passing the genome-wide threshold.In MESA we see approximately similar numbers of associations as with the Black NH subgroup (17 for LAC and 18 for NOx); see Supplementary Table 3.When we restricted to participants recruited at the Baltimore and New York sites, a much larger number of DNAm sites passed the genome-wide threshold (51 for LAC and 79 for NOx); this may be because Minnesota and Forsyth County sites had very low variance in pollution levels.The MESA sensitivity meta-analysis identified multiple associations linked to DUSP1, FOS, KLF6, MCL1, and VIM; genes that have putative roles in inflammation and immunity.

Biological enrichments of exposure associations Gene ontology
We observed no evidence for gene set enrichments for any Gene Ontology terms among the top 100 sites of the main EWAS we conducted.However, we did observe that the 22 sites associated with NOx above the genome-wide threshold among MESA Black NH participants were enriched for the gene ontology terms 'response to glucocorticoid' and 'response to corticosteroid' (FDR>0.05).We also observed that the MESA meta-analysis of NOx among all participants was associated with 13 Gene Ontology terms (FDR>0.05)related to blood-based immune response.

EWAS catalog
We observed a number of relevant enrichments among sites identified in our EWAS.Details of the associations (p < 0.05, Fisher's exact test) can be found in the Supplementary Materials and Supplementary figures 6-16.Briefly, in MBMS, we see enrichment for inflammation for both NOx and LAC EWAS among Black NH participants, and in the NOx meta-analysis.In MESA, we observed consistent enrichment for infection and cancer among Black NH and white NH participants, and also in the meta-analyses.We observed enrichment for inflammation among Hispanic participants.We also found among both MESA Black NH and Hispanic participants, the racialized economic segregation EWAS was enriched for neurological traits.Among both the white NH and Hispanic participants, household poverty to income ratio EWAS was enriched for SEP and education.In the MESA subgroup analysis, enrichment for prenatal exposures was observed for the Jim Crow birth state EWAS among the Black NH and Hispanic participants.

Enrichment for genomic features
When we looked at enrichment of genomic locations of the top 100 sites (p < 0.05, Fisher's exact test), we found that among MBMS Black NH participants, NOx was the only exposure with associated CpGs being located in active genomic regions (please see Supplementary materials for details).In the MBMS meta-analyses, NOx was enriched for regions related to gene promoters.
Among MESA Black NH participants, we observe enrichment for regions related to transcription and genome regulation in the LAC and NOx EWAS.We also observed enrichment relating to transcription regulation for the birth in a Jim Crow state EWAS.Among MESA white NH participants, we observed enrichment for transcription regulation for both measures of air pollution.Among MESA Hispanic participants, LAC exposure shows some associations with active genomic regions.When we restrict MESA to the New York and Baltimore sites, we see a similar set of enrichments; and in the MESA meta-analyses we see consistent enrichment related to transcription regulation and promotors.
Notably, genomic feature enrichments for NOx among both MBMS and MESA Black NH participants involved similar genomic locations (CpG islands and shores) and chromatin states (related to promotors), as well as 6 of a possible 9 TFBS.

Lookup of associations in a priori specified genomic locations
We did not observe any associations in our EWAS results for sites identified in previous EWAS of related exposures.

Discussion
The series of EWAS we conducted on a range of adverse exposures at different levels and at different points in the lifecourse, drawing on two different population-based studies with similar exposure data, provide evidence that DNAm may be a biological pathway by which societal context shapes health inequities.This work has shown for the first time associations between DNAm and multiple levels of structural discrimination, in genes that are biologically plausible routes of embodiment involving gene regulation, including inflammation.Additionally, our EWAS and metaanalyses of air pollution showed clear association between two road traffic-related measures of air pollution, and DNAm of multiple CpGs in multiple genes that have been consistently associated with inflammation and infection, suggesting that the material environment people live may induce inflammatory changes.Our study has added to the existing literature on air pollution; there are few EWAS studies looking at NOx (n=3), and none so far looking at black carbon.In total, this work Notably, inflammation was the predominant pathway indicated in the air pollution analyses, both via putative gene functions and enrichment analyses.These findings underscore that while there is a large psychosocial literature on inflammation being a mechanism by which discrimination harms health 28,55,56 , it is also critical to consider inequities in biophysical exposures in the material world as an important driver of this inflammation.Overall, air pollution sites tend to be enriched for inflammation in MBMS and infection in MESA; this could represent different mechanisms of the same process due to the different blood cell types sampled in the two cohorts; with monocytes being specialised in infection prevention, and neutrophils (the highest proportion cell in whole blood) being specialised in inflammatory responses.
Our study identified a greater number of associations with air pollution measures than previous work in MESA 20,57 ; this is likely due to the fact that we do not adjust for recruitment site (which would reduce variation in the exposure because exposure is location-dependent); and previous analyses have adjusted for racialized group membership, which is also associated with air pollution exposure; this may have masked the effects that we have detected.This joins other research that has demonstrated the importance of considering spatial effects of air pollution 58 .
A limitation of our study is that we cannot infer causality.Although it would be possible to conduct Mendelian randomization instrumenting cis-mQTLs, we did not conduct this analysis because we think the results would be highly speculative.Additionally, the MESA sample we used may have been subject to selection bias, because (1) individuals who had experienced prior cardiovascular events were excluded from recruitment, and (2) a number of participants died between Exam 1 and Exam 5.
If adversity and discrimination are associated with these cardiovascular events and mortality, associations could be biased in MESA.

Conclusions
We think this work provides direction for future epigenetic studies to consider the role of inequitable adverse social and biophysical exposures across the lifecourse, including but not limited to structural discrimination.Our results suggest inflammation may be a key biological pathway by which inequities become embodied, in our case driven primarily by exposure to air pollution, and not self-reported racial discrimination.These findings accordingly suggest that attention to how social inequities shape biophysical as well as social exposures is crucial for understanding how societal inequities can become embodied, via pathways involving DNAm.

Declaration statements Ethics approval
The study protocol, involving use of both the MBMS and MESA data, was approved by the Harvard

Data sharing
This study (NIH Grant number R01MD014304) relied on three sources of data, each of which is subject to distinct data sharing stipulations: (1) the non-public data from the "My Body, My Story" (MBMS) study; (2) the non-public data from the Multi-Ethnic Study of Atherosclerosis (MESA; data use agreement G638); and (3) the public de-identified data from the US Census, the American Community Survey, and the State Policy Liberalism Index.We provide descriptions of these data • CR co-led obtaining funds for the research project, co-designed the analyses, and contributed to interpreting the results.
• NK led conceptualization of the study, contributed to designing the analyses, and co-led obtaining funds for the research project.
• CT accessed the electronic public use data and generated the study variables derived from these data, contributed to designing the analyses and interpreting the results, and prepared the study materials to be shared via the data repository (data dictionary; software code).
• JTC contributed to designing the analyses and interpreting results.
• PDW facilitated finalizing all human subject approvals and data use agreements and also the data transfer of the MBMS epigenetic data from HSPH to Bristol, geocoded the place of birth data, extracted the historical census data from PDF files, and contributed to interpreting results.
• AS, BC, KT, and GDS contributed to designing the analyses and interpreting the results.
• IDV led and supervised the assays to extract epigenetic data from the MBMS blood spots and contributed to designing the analyses and interpreting the results.
• ADR facilitated interpretation of the MESA data and contributed to designing the analyses and interpreting the study results.
• All co-authors provided critical intellectual content to and approved the submitted manuscript.

Funding
The Affymetrix (Santa Clara, California, USA) using the Affymetrix Genome-Wide Human SNP Array 6.0.

Figure 1 :
Figure 1: MESA air pollution meta-analysis miami plots.A: MESA full cohort LAC meta-analysis.B: MESA full cohort NOx meta-analysis.C: MESA subgroup LAC meta-analysis.D: MESA subgroup NOx meta-analysis.
highlights the need for researchers to consider multiple levels of discrimination and adversity across the lifecourse, especially structural inequities in the material world in which people live, to fully elucidate drivers and biological mechanisms of inequitable health.Associations detected at the genome-wide level in MBMS related more closely to early-life exposures (being born in a Jim Crow state, and low educational attainment); in MESA they related more to current experiences and exposures (air pollution, racialized economic segregation, and experiences of discrimination), possibly reflecting the relatively older age of the MESA participants.The much stronger associations with air pollution in MESA compared to MBMS could potentially be due to: (1) the use of purified monocytes in MESA, with a single cell type making associations easier to detect; (2) less variation in exposure to air pollution in MBMS compared MESA; (3) longer duration of air pollution exposure in MESA (due to older age of the participants); or (4) reduced statistical power in MBMS, due to lower quantities of DNA.
work for this study was supported by the National Institutes of Health, National Institute of Minority and Health Disparities [grant number grant R01MD014304 to N.K. and C.R., as MPIs].Additionally, GDS, CR, KT, MS, SHW work within the MRC Integrative Epidemiology Unit at the University of Bristol, which is supported by the Medical Research Council (MC_UU_00011/1,3, & 5 and MC_UU-00032/1).The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.Prior funding supporting our work: a) The prior MBMS study was funded by the National Institutes of Health/National Institute on Aging (R01AG027122) and generation of the black carbon data for this study was supported by a pilot grant from the Harvard NIEHS Center (P30 ES000002).b) No additional grant funding to MESA was obtained for the conduct of this study whose Data Use Agreement (G638) was approved by MESA (on 11/22/2019) to use pre-existing MESA data.Support for generating these pre-existing MESA data is provided by contracts 75N92020D00001, HHSN268201500003I, N01-HC-95159, 75N92020D00005, N01-HC-95160, 75N92020D00002, N01-HC-95161, 75N92020D00003, N01-HC-95162, 75N92020D00006, N01-HC-95163, 75N92020D00004, N01-HC-95164, 75N92020D00007, N01-HC-95165, N01-HC-95166, N01-HC-95167, N01-HC-95168, N01-HC-95169, UL1-TR-000040, UL1-TR-001079, UL1-TR-001420, UL1-TR-001881, and DK063491.The MESA Epigenomics & Transcriptomics Studies were funded by NIH grants 1R01HL101250, 1RF1AG054474, R01HL126477, R01DK101921, and R01HL135009.This publication was developed under the Science to Achieve Results (STAR) research assistance agreements, No. RD831697 (MESA Air) and RD-83830001 (MESA Air Next Stage), awarded by the U.S Environmental Protection Agency (EPA).It has not been formally reviewed by the EPA.The views expressed in this document are solely those of the authors and the EPA does not endorse any products or commercial services mentioned in this publication.Funding for MESA SHARe genotyping was provided by NHLBI Contract N02-HL-6-4278.Genotyping was performed at the Broad Institute of Harvard and MIT (Boston, Massachusetts, USA) and at

Table 1 :
education, rates of self-reported exposure to racial discrimination, and were more likely to be born in a Jim Crow state and live in a neighbourhood with extreme concentrations of low-income persons of colour.In MESA, Hispanic participants reported the lowest levels of personal and parental education.Characteristics of MBMS and MESA participants.
for numbers of associated EWAS sites; and Supplementary figures 1 and 2 for Miami plots.In MESA, two DNAm sites were associated with racialized economic segregation -one in Black NH participants (in FUT6) and one in white NH participants (a CpG previously associated with BMI); and in Black NH participants one DNAm site (in PDE4D) was associated with an MDS score of 0. The majority of associations in MESA were related to air pollution exposure -among Black NH participants, 12 sites with LAC and 22 sites with NOx.
Notably, many of these sites are clustered in genes with putative roles in immune responses and are known to interact with one another, including KLF6, MIR23A, FOS, FOSB, ZFP36 and DUSP1.Among the MESA white NH participants, four DNAm sites were associated with both LAC and NOx, and an additional 3 uniquely associated with LAC.Associations of 53 DNAm sites with birth in a Jim Crow state were the result of confounding by air pollution (see Supplementary Materials).Among Hispanic participants, one site was associated with LAC (NPNT) and one with NOx (ADPRHL1).See Table3for numbers of associations for all EWAS performed and Supplementary Figures3-5for corresponding Miami plots. .

Table 2 :
Putative functions of genes in which the top exposure-associated DNAm sites sit; with details of how many sites within that gene were identified, and in which main analysis EWAS

Table 3 :
Summary 1f the number of DNAm sites passing the genome-wide threshold in each individual EWAS in MBMS (threshold 2.4e-7) and MESA (threshold 9e-8).The list of specific DNAm 183 sites passing the genome-wide threshold can be found in supplementary table4.1TheEWAS was not run for Jim Crow birth state for white NH participants in MBMS, due to small cell numbers.
The main impact of removing the Minnesota and Forsyth County sites (which both had very low levels of air pollution) was to remove the confounding structure between air pollution and Jim Crow birth state among white NH participants.It also increased the similarity of air pollution associations between the Black NH and white NH participants; for example, of the 19 DNAm sites associated with NOx among white NH participants, 12 passed the genome-wide threshold in the Black NH participant EWAS.Numbers of associated sites are in Table4.Miami plots for this MESA subgroup can be found in Supplementary figures 6, 7 and 8.