
Epigenome-wide association study of kidney function identifies trans-ethnic and ethnic-specific loci



DNA methylation (DNAm) is associated with gene regulation and estimated glomerular filtration rate (eGFR), a measure of kidney function. Decreased eGFR is more common among US Hispanics and African Americans. The causes for this are poorly understood. We aimed to identify trans-ethnic and ethnic-specific differentially methylated positions (DMPs) associated with eGFR using an agnostic, genome-wide approach.


The study included up to 5428 participants from multi-ethnic studies for discovery and 8109 participants for replication. We tested the associations between whole blood DNAm and eGFR using beta values from Illumina 450K or EPIC arrays. Ethnicity-stratified analyses were performed using linear mixed models adjusting for age, sex, smoking, and study-specific and technical variables. Summary results were meta-analyzed within and across ethnicities. Findings were assessed using integrative epigenomics methods and pathway analyses.


We identified 93 DMPs associated with eGFR at an FDR of 0.05 and replicated 13 and 1 DMPs across independent samples in trans-ethnic and African American meta-analyses, respectively. The study also validated 6 previously published DMPs. Identified DMPs showed significant overlap enrichment with DNase I hypersensitive sites in kidney tissue, sites associated with the expression of proximal genes, and transcription factor motifs and pathways associated with kidney tissue and kidney development.


We uncovered trans-ethnic and ethnic-specific DMPs associated with eGFR, including DMPs enriched in regulatory elements in kidney tissue and pathways related to kidney development. These findings shed light on epigenetic mechanisms associated with kidney function, bridging the gap between population-specific eGFR-associated DNAm and tissue-specific regulatory context.


The kidney has a central role in body homeostasis through the regulation of blood pressure, fluid, and electrolytes and by removing endogenous and exogenous toxins. Reduced kidney function measured using estimated glomerular filtration rate (eGFR) defines chronic kidney disease (CKD). CKD affects 14.5% of the adult US population and is a leading cause of death and disability [1, 2]. CKD has a high burden among non-European US ethnic groups, but mechanisms for this health disparity are poorly understood [3]. A better understanding of the mechanisms influencing kidney function may provide insights into CKD occurrence and risk.

Complex interactions between genetic, lifestyle, and environmental exposures likely contribute to the observed eGFR variation across populations. DNA sequence variation accounts for 7.6% of the estimated heritability of eGFR in trans-ethnic genome-wide association studies (GWAS) [4]. Epigenetic modifications of the genome such as DNA methylation (DNAm) are heritable and contribute to gene regulation. DNAm consists of the addition of a methyl group to cytosines, typically at cytosine-guanine dinucleotides (CpG sites). DNAm is influenced by lifetime exposures and may provide clues on ethnic-specific differences influencing eGFR. Differential DNAm at CpG sites can be studied using microarrays with reasonable genome-wide coverage through epigenome-wide association studies (EWAS) [5, 6].

Recent EWAS in whole blood have identified differentially methylated positions (DMPs) associated with blood pressure and eGFR, and disease states such as CKD and rapid decline in eGFR [7,8,9,10,11]. Early studies had modest sample sizes (40 to 407 individuals) or were limited to a single ethnic group [7,8,9]. A large EWAS performed separate discovery analyses within the Atherosclerosis Risk in Communities (ARIC, 2264 African Americans) and the Framingham Heart Study (FHS, 2395 white participants) followed by cross-replication of findings [11]. The study identified 19 DMPs for eGFR or CKD. Overall, these studies support a role for DNAm in CKD-related traits. However, previous studies did not account for important potential confounders such as smoking status and cumulative exposure in the discovery group, which have widespread effects on DNAm patterns [12] and are risk factors for CKD, nor did they assess DNAm at CpG sites across multiple ethnic groups during discovery.

The main aim of this study is to identify DNAm patterns associated with eGFR in multi-ethnic studies using data from European/European American (EA), African American (AA), and Hispanic/Latino (H/L) participants. We performed both trans-ethnic and ethnic-specific EWAS using whole blood-based Illumina DNAm data assayed in participants of the Women’s Health Initiative (WHI), the Multi-Ethnic Study of Atherosclerosis (MESA), and the Jackson Heart Study (JHS). We replicated our findings in the HyperGEN, Generation Scotland, and CATHGEN studies, and additionally looked up our results in a published study [11]. We identified DMPs associated with eGFR in trans-ethnic and ethnic-specific analyses, and provided supporting evidence for the contribution of identified DMPs to kidney function and development using in silico approaches and human kidney tissue-specific data.


Study design and populations

Our study design included a discovery step comprising three population-based studies (WHI, MESA, and JHS) and three replication studies (HyperGEN, Generation Scotland, and CATHGEN) (Additional file 1: Fig. S1). WHI is a study of postmenopausal women (aged 50–79 years), comprising 161,808 women recruited from 40 US clinical centers who participated in an observational study or in clinical trials during 1993–1998 as previously described [13,14,15,16]. MESA is a multi-ethnic study of subclinical cardiovascular disease and risk factors for cardiovascular disease [17], consisting of 6814 asymptomatic men and women aged 45–84 (38% EA, 28% AA, 22% H/L, and 12% Asian) recruited from six field centers across the USA and examined in 2000–2002, followed by four subsequent examination periods. JHS is a study of cardiovascular disease and its risk factors in AA, comprising 5306 African Americans aged 21 to 94 years recruited from the Jackson, MS, metropolitan area from 2000 to 2004, with four follow-up exams [18]. HyperGEN is a family-based study with a sib-pair design. Hypertensive African American sibships were recruited from Forsyth County, NC, and from the community-at-large in Birmingham, AL, from 1995 to 2000 [19]. Generation Scotland is a family-based and population-based study consisting of 23,690 European participants recruited via general medical practices across Scotland between the years 2006 and 2011 [20]. CATHGEN is a biorepository of clinical samples from a prospectively collected clinical cohort of individuals undergoing cardiac catheterization at Duke University [21]. Both discovery and replication included multi-ethnic studies. Study descriptions are shown in Additional file 2: Supplementary Methods.


Serum creatinine-based eGFR was estimated using the Chronic Kidney Disease Epidemiology equation which includes age, sex, and a constant for AA [22].
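As an illustration of the estimating equation described above, the following is a minimal sketch that assumes reference [22] is the 2009 CKD-EPI creatinine equation, with serum creatinine in mg/dL. The function name and argument names are hypothetical and are not taken from the study's code.

```python
def ckd_epi_2009(scr_mg_dl: float, age_years: float,
                 female: bool, african_american: bool) -> float:
    """Estimate GFR (mL/min/1.73 m^2) with the 2009 CKD-EPI creatinine equation.

    Assumption: this is the equation meant by reference [22]; inputs are
    serum creatinine in mg/dL and age in years.
    """
    kappa = 0.7 if female else 0.9          # sex-specific creatinine scaling
    alpha = -0.329 if female else -0.411    # sex-specific exponent below kappa
    ratio = scr_mg_dl / kappa
    egfr = (141.0
            * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.209
            * 0.993 ** age_years)
    if female:
        egfr *= 1.018
    if african_american:
        egfr *= 1.159                        # the "constant for AA" in the text
    return egfr
```

For example, `ckd_epi_2009(0.9, 50, female=False, african_american=False)` evaluates the equation for a 50-year-old man whose creatinine equals the male scaling constant, so only the age term contributes.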

Epigenetic data and quality control

Briefly, preprocessing included removal of probes with detection p values > 0.01 in > 10% of samples and samples with detection p values > 0.01 in > 1% of probes. Beta values were normalized using beta-mixture quantile (BMIQ) normalization method (WHI-BAA and WHI-EMPC) [23], the normal-exponential out-of-band (NOOB) preprocessing method (JHS, MESA, CATHGEN) [24], Subset Quantile Normalization (SQN) (CATHGEN) [25], or dasen (Generation Scotland) [26]. Batch effect correction was performed using ComBat [27] or adjusting batch as a covariate. CpGs overlapping with the list of potentially polymorphic sites in the relevant ethnic group and cross-reactive probes were removed [28]. Cell proportions were estimated using the reference-based Houseman method for whole blood [29]. To adjust for population structure, principal components (PCs) or ethnic-informative markers were obtained from the genome-wide genotype data available using standard methods [30]. DNAm sites were annotated to include chromosome, position, UCSC gene names, relationship to CpG islands, location in gene enhancer regions, and DNase I hypersensitive sites (DHSs) using Illumina’s annotation file [31]. Detailed methods for MESA, which have not been previously published, are included in Additional file 2: Supplementary Methods.
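The detection p value filters described above can be sketched as follows. This is a simplified illustration, not the pipeline used by the studies: the function name is hypothetical, the matrix is a plain list of lists indexed as probe by sample, and both filters are evaluated on the unfiltered matrix (the order in which the studies applied them is an assumption).

```python
def qc_filter(detp, p_cut=0.01, max_probe_fail=0.10, max_sample_fail=0.01):
    """Return indices of probes and samples that pass detection p value QC.

    detp[i][j] is the detection p value of probe i in sample j.
    A probe is dropped if it fails (p > p_cut) in more than 10% of samples;
    a sample is dropped if more than 1% of its probes fail.
    """
    n_probes = len(detp)
    n_samples = len(detp[0])
    keep_probes = [
        i for i, row in enumerate(detp)
        if sum(p > p_cut for p in row) / n_samples <= max_probe_fail
    ]
    keep_samples = [
        j for j in range(n_samples)
        if sum(detp[i][j] > p_cut for i in range(n_probes)) / n_probes
        <= max_sample_fail
    ]
    return keep_probes, keep_samples
```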


Methylation betas were used as predictors for eGFR in ethnic-stratified analyses, adjusting for age, sex, smoking history (current, past, and no smoking, and pack-years), 4 to 10 principal components, cell type composition, and study-specific covariates. DMPs were modeled by fitting robust linear models (or linear mixed models in JHS to account for family relationships) and performing robust standard error calculations via the 'sandwich' package. These analyses were performed using R version 3.5.3. Because of the small samples within each ethnicity in MESA, we fitted linear models in that study without robust estimation. EWAS results were meta-analyzed across all samples and within each ethnicity using fixed-effect inverse-variance weighted methods implemented in METAL. We required a minimum of two studies for meta-analyses. We used an FDR < 0.05 (Benjamini-Hochberg).
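The meta-analysis and multiple-testing steps above follow standard formulas, which can be sketched as below. This is an illustrative reimplementation in Python, not METAL or the R code used in the study; function names are hypothetical. The fixed-effect estimate weights each study's coefficient by the inverse of its variance, and the Benjamini-Hochberg procedure converts p values into FDR-adjusted q values.

```python
import math

def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance weighted meta-analysis for one CpG.

    betas, ses: per-study effect estimates and standard errors.
    Returns the pooled estimate, its standard error, and a two-sided p value.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p value
    return beta, se, p

def benjamini_hochberg(pvals):
    """Return BH-adjusted q values in the same order as the input p values."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        q[i] = running_min
    return q
```

A CpG would then be called a DMP when its q value is below 0.05, matching the FDR threshold stated above.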

EWAS analyses were subsequently performed in HyperGEN (AA), Generation Scotland (EA), and CATHGEN (AA and EA) using the same statistical protocols as in discovery analyses. For trans-ethnic replication, we combined all the replication samples and used a Bonferroni-adjusted p-value cutoff for the 78 tests that were performed (p = 6.4E−04). We also considered if the direction of effects between discovery and replication samples was concordant. For ethnic-specific replication, we used EA replication samples for EA or H/L discovery meta-analyses, while for AA replication, we used AA replication samples, a Bonferroni-corrected p < 2.1E−03, and consistency in direction of effects. We also attempted to replicate DMPs from a published study [11]. This published study used eGFR instead of DMPs as a predictor in models and therefore the estimates were not comparable to our study.
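The trans-ethnic replication rule above combines a Bonferroni threshold with a check on the direction of effect. A minimal sketch, with hypothetical function names, is:

```python
def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests

def replicates(discovery_beta: float, replication_beta: float,
               replication_p: float, n_tests: int, alpha: float = 0.05) -> bool:
    """True if a DMP passes the threshold with a concordant effect direction."""
    same_direction = (discovery_beta > 0) == (replication_beta > 0)
    return replication_p < bonferroni_threshold(n_tests, alpha) and same_direction
```

With 78 tests, `bonferroni_threshold(78)` is 0.05/78, which rounds to the 6.4E−04 cutoff quoted above.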

In silico annotation using eFORGE and eFORGE-TF, and pathway analyses

We performed functional overlap analysis of DMPs with eFORGE version 2.0, analyzing the default top 1000 probes from EA, AA, H/L, and all-ethnic probe sets for overlap enrichment across DNase I hotspots from the Roadmap Epigenomics Consortium [32, 33]. To ascertain whether the observed enrichment was robust and associated with top probe sets below the EWAS significance threshold, we applied additional eFORGE analyses across the 5 top EA CpG sets (comprising CpGs 1–1000, 1000–2000, 2000–3000, 3000–4000, 4000–5000, ordered by p-value), to detect overlap enrichment across DNase I hotspots from the Roadmap Epigenomics Consortium [32, 33]. We thus performed integrative epigenomics analyses on data from the Roadmap Epigenomics Consortium [34] using the eFORGE framework [32, 33]. To further understand eFORGE enrichment results, we performed TF motif analysis on the probes underlying the eFORGE tissue-specific enrichment signal for kidney. We used the eFORGE-TF module, seeking to uncover the main TF motifs associated with our DNase I hotspot enrichment for top sites [33]. We used PANTHER analysis via the AmiGO framework to uncover pathways associated with the identified TF motifs [35].

meQTL data in the whole blood

We used publicly available data from the biobank-based integrative omics study (BIOS) QTL database [36, 37] and cis-meQTL data from the FHS (n = 4170 participants, 450K DNAm panel, 1000G imputed data, defined within 2-Mb window) [38]. We also used meQTL data from mQTLdb [39].

meQTL in normal kidney tissue

We analyzed a total of 211 individuals with matching kidney genome-epigenome information from TRANScriptome of renaL humAn TissuE Study (TRANSLATE), an extension of the TRANSLATE study (TRANSLATE-T), Renal gEne expreSsion and PredispOsition to cardiovascular and kidNey Disease (RESPOND), and molecular analysis of mechanisms regulating gene expression in post-ischemic injury to renal allograft (REPAIR) (n = 192), in addition to normal samples with available genotype and kidney DNAm profiles from The Cancer Genome Atlas (TCGA) of the National Institutes of Health (NIH) (n = 19) [40,41,42,43]. Kidney tissue samples from TRANSLATE, RESPOND, and TCGA were taken from the healthy part of the organ, unaffected by cancer, after elective nephrectomies, and renal specimens in the TRANSLATE-T and REPAIR studies were collected as pre-implantation biopsies from deceased donors’ kidneys before transplantation [40, 42, 43]. Information on local recruitment teams, genotyping, and DNAm methods is included in Additional file 2: Supplementary Methods. In total, 374,826 CpG sites were available for further analyses after quality control filters. For post-EWAS DMP analysis, we conducted kidney cis-meQTL analysis on 62 DMPs (out of 374,826) from the study. For GWAS SNP analysis, we conducted kidney cis-meQTL analysis on a set of published eGFR GWAS SNPs [44]. The cis-meQTL analysis was conducted on 195 kidney DNA samples that passed all quality control criteria. For analysis, we used the FastQTL pipeline [45]. We used normalized M-values and genotype information for all genotyped and imputed variants passing the quality control filters under an additive mode of inheritance. Regression models included age, sex, genotyping array, source of tissue indicator (nephrectomy/kidney biopsy), the top three PCs derived from genotyped autosomal variants (genotype PCs), and six PCs derived from methylation array control probes (methylation PCs).
The FastQTL cis-region was defined as ±1 Mb from each tested CpG/SNP position. For EWAS DMP blood-kidney meQTL comparison analysis, we compared DMP+kidney meQTL CpGs to blood meQTL CpGs from mQTLdb and calculated the percentage of overlap for data from both sources.
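Two small pieces of the kidney cis-meQTL setup can be illustrated in code: the conversion of methylation beta values to M-values and the ±1 Mb cis-window test. This is a sketch under stated assumptions, not FastQTL itself. The M-value is the standard logit (base 2) of beta; the clamping constant `eps` and the function names are hypothetical, and the normalization applied to the M-values in the study is not reproduced here.

```python
import math

def beta_to_m(beta: float, eps: float = 1e-6) -> float:
    """Convert a methylation beta value in [0, 1] to an M-value.

    eps (an assumption of this sketch) keeps beta away from 0 and 1 so the
    log ratio stays finite.
    """
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))

def in_cis_window(cpg_chrom: str, cpg_pos: int,
                  snp_chrom: str, snp_pos: int,
                  window_bp: int = 1_000_000) -> bool:
    """True if the SNP lies within +/- window_bp of the CpG on the same chromosome."""
    return cpg_chrom == snp_chrom and abs(cpg_pos - snp_pos) <= window_bp
```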

eFORGE analysis for kidney meQTL CpGs

Standard eFORGE analyses were performed using default settings. For kidney meQTL CpGs, we analyzed eGFR SNPs overlapping kidney DNase I hotspots (the top category). We searched for these SNPs in the kidney meQTL file, obtaining preliminary associated meQTL CpGs (nominal p-value < 0.05) and preliminary non-associated meQTL CpGs (nominal p-value > 0.95) for the same set of SNPs. Standard eFORGE analyses were performed on both of these sets.


Overview of EWAS results

The study design is shown in Fig. 1 and Additional file 1: Fig. S1. Discovery EWAS meta-analyses included up to 5428 individuals, with 2879 AA, 1737 EA, and 812 H/L participants (Additional file 3: Table S1). Quality control of DNAm data and protocol analyses for each study are shown in Additional file 3: Table S2. We performed analyses within each study and ethnic group using standard statistical protocols followed by meta-analyses of the overall samples and within each ethnicity. The quantile-quantile and Manhattan plots for meta-analyses are shown in Additional file 1: Fig. S2. Lambdas for meta-analyses were 0.987 (AA), 1.001 (EA), 1.065 (H/L), and 1.194 (trans-ethnic meta-analysis). Across our discovery analyses, we identified a total of 93 DMPs associated with eGFR at an FDR of 0.05. Of these, 78 DMPs were identified in trans-ethnic meta-analyses, 23 DMPs in the meta-analysis of AA, 5 in the meta-analysis of H/L, and 5 in the meta-analysis of EA, with some overlap in DMP findings between trans-ethnic and ethnic-specific results.
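The lambdas reported above are genomic inflation factors. The study does not state how they were computed; one common definition, sketched here as an assumption, is the median of the observed 1-degree-of-freedom chi-square statistics divided by the median of the null chi-square distribution (about 0.4549). The function name is hypothetical and input p values are assumed to lie in (0, 1].

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def genomic_inflation(pvals):
    """Genomic inflation factor (lambda) from two-sided p values in (0, 1]."""
    nd = NormalDist()
    # A two-sided p value maps to a z-score via the lower tail p/2;
    # squaring the z-score gives a 1-df chi-square statistic.
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in pvals]
    return median(chi2) / CHI2_1DF_MEDIAN
```

Values near 1 indicate little inflation of test statistics, while values above 1, such as the 1.194 reported for the trans-ethnic meta-analysis, indicate that observed statistics are larger than expected under the null.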

Fig. 1

Overview of trans-ethnic and ethnic-specific CpGs associated with kidney function. a Venn diagram showing trans-ethnic and unique CpGs across the top 1000 sites for European Americans (EA), African Americans (AA), Hispanic/Latino (H/L), and trans-ethnic groups. b Euler diagram showing the number of overlapping CpGs (1) between trans-ethnic replicated DMPs (13) and discovery ethnic-specific DMPs for African Americans -AA- (5). c Study design, consortium information, sample size, and the number of significant DMPs for both trans-ethnic and ethnic-specific EWAS analyses. Details shown both for discovery (top) and replication analyses (bottom). For these analyses, consortia include the Women’s Health Initiative (WHI), the Jackson Heart Study (JHS), MESA, HyperGEN, Generation Scotland, and CATHGEN. In addition, we used kidney DNAm data from the TRANSLATE, TRANSLATE-T, RESPOND, and REPAIR studies for cis-meQTL analyses and Roadmap Epigenomics data for eFORGE DHS analyses

Indeed, most of the DMPs identified in trans-ethnic analyses were also present in one or more ethnic groups (Fig. 1a). However, the overlap between DMPs of AA and EA was small. Trans-ethnic replication included up to 8109 participants from three studies (Generation Scotland, CATHGEN, and HyperGEN), composed of participants of EA and AA (9% of replication samples) (Additional file 1: Fig. S1) [11]. Replication of EA or H/L findings included Generation Scotland and CATHGEN EA samples (n = 7349) and AA included CATHGEN and HyperGEN samples (n = 760). Among the significantly identified trans-ethnic DMPs (Bonferroni corrected p-value and consistent direction of effects), 13 of 78 replicated (Table 1), and among the significantly identified DMPs in ethnic-specific meta-analyses, 1 of 5 AA DMPs replicated (cg14871770 at CYP2C9) (Additional file 3: Table S3). Despite independent replication in AA, cg14871770 overlapped with trans-ethnic replicated DMPs (Fig. 1b). Twelve additional DMPs had replication (with consistent direction of effects) in trans-ethnic meta-analyses (Additional file 3: Table S3a). The DMP cg14871770 overlapped between trans-ethnic and AA meta-analyses, and cg17944885 at the ZNF20-ZNF788P locus was previously described [11]. Several of the replicated DMPs were expression quantitative trait methylation loci (eQTM) or cis-meQTL CpGs in whole blood in FHS and BIOS QTL (Table 1) [36,37,38]. Replication results for all DMPs are shown in Additional file 3: Table S3a (trans-ethnic analyses) and Additional file 3: Table S3b (ethnic-specific analyses). Forest plots for each of the replicated DMPs are shown in Additional file 1: Fig. S3. We also replicated 6 DMPs identified in a prior publication (DMPs located at genes DAZAP1, KIAA1549L, TUBGCP4/ZSCAN29, JAZF1, ZNF20-ZNF788P, LDB2) (Additional file 3: Table S4) [11].

Table 1 Main findings from trans-ethnic EWAS meta-analyses for 13 replicated DMPs

Overlap of eGFR-associated DMPs with genes and regulatory elements

To understand the regulatory context of our DMPs, we annotated the 13 replicated significant DMPs with the closest gene and other information including epigenomic peaks, tissue-specific gene expression via RNA-seq, and chromatin interaction annotations. One of our DMPs (cg11789371) is located in an intron of HSP90AA1, a gene involved in protecting kidney tissue from inflammation, ischemia, and oxidative damage, and assisting with cellular repair (Fig. 2) [46]. HSP90 (heat shock protein 90) has a physiological role in eGFR regulation through the nitric oxide pathway and is a drug target candidate for kidney diseases [46]. Indeed, inhibition of HSP90 has been shown to reduce eGFR in animal models [46]. DMP cg11789371 overlaps epigenomic annotations from the ENCODE consortium including DNase I hypersensitive sites in kidney cells, among other cell types. This DMP is a cis-meQTL CpG regulated by rs11621083, a variant located upstream in an intron of HSP90AA1. The DMP also forms part of a GeneHancer interacting site, contacting the promoter of WDR20 and an alternative promoter of HSP90AA1 which is located 50 kb away. The entire region surrounding this DMP contains a number of genes expressed in the kidney, several of which seem to interact with these two promoters. Genes from this locus presenting RNA-seq expression in the kidney include DYNC1H1, MOK, ZNF839, WDR20, and HSP90AA1. Taken together, annotations for our DMP cg11789371, which overlaps an intron of HSP90AA1, a gene associated with eGFR regulation, suggest a potential link between DNA methylation and eGFR regulation through the nitric oxide pathway.

Fig. 2

eGFR-associated differentially methylated position cg11789371. a HSP90AA1 gene browser shot showing (from top to bottom) genome coordinates, local genes, NHGRI/EBI GWAS catalog SNPs, GTEx gene expression quantified via RNA-seq across different tissues, H3K27ac peaks across 7 ENCODE cell lines, GeneHancer regulatory elements, Genecards TSSs, GeneHancer chromatin interactions, ENCODE chromatin accessibility and chromatin interaction tracks, and location for eGFR-associated DMP cg11789371. b Expanded browser shot showing genome coordinates, local genes, NHGRI/EBI GWAS catalog SNPs, H3K27ac peaks across 7 ENCODE cell lines, and a boxplot indicating DNAm values at cg11789371 for bottom and top quartiles of eGFR, respectively. These data indicate our DMP overlaps an intron of HSP90AA1, a gene expressed in kidney tissue, and a DHS from ENCODE, which was detected in kidney tissue. Our DMP is also proximal to an H3K27ac peak, an RNA Polymerase 2 region determined by ENCODE ChIA-PET across several cell lines, and the promoter of HSP90AA1. All browser shots were generated using the UCSC genome browser on human genome build hg19

DMPs for eGFR are enriched for kidney regulatory function related to kidney development

To further understand the regulatory potential and chromatin context of our EWAS findings across different tissues, we performed integrative epigenomics analyses on data from the Roadmap Epigenomics Consortium [34] using the eFORGE framework [32, 33]. We found enrichment for kidney-specific DNase I hotspots, which has also been described in GWAS of eGFR (Fig. 3a, d) [4, 47]. eFORGE showed consistent enrichment results for kidney-specific DNase I hotspots when applied to the top EA discovery probes, showing a corresponding trend with study p-value (analyses performed using the “EPIC” setting with 1000 repetitions, BY correction). In addition, analysis with eFORGE-TF uncovered significant enrichment for several transcription factor (TF) motifs (including motifs for OSR1, OSR2, TBX1, and PAX2). OSR1, OSR2, TBX1, and PAX2 have all been shown to have roles in kidney development (Fig. 3c) [48,49,50]. To further understand the pathways underlying TF motif enrichment, we performed PANTHER pathway analysis using significant TF motifs [35]. This analysis uncovered pathways associated with kidney development, including metanephros development (8.1E−03), mesonephros development (9.7E−03), and retinoic acid receptor signaling pathway (6.4E−03) (Additional file 3: Table S5). Overall, these findings suggest that epigenetic changes related to eGFR are enriched in kidney regulatory regions and pathways related to kidney development.

Fig. 3

Tissue-specific integrative analysis indicates potential effect on kidney and relation with eGFR GWAS loci. a eFORGE analysis for top 1000 eGFR CpGs: the x axis indicates tissues/cell type samples used in the analysis; the y axis shows eFORGE enrichment (−log10 p-value) of the CpG set with DNase I hotspots for a range of tissue samples (significant samples in black). The highest ranked sample set (highest black points) shows the most significant enrichment is for kidney samples, which are highly ranked for the top 1000 CpGs associated with eGFR. b FORGE2 analysis for eGFR SNPs from GWAS catalog: the x axis indicates tissues/cell type samples used in the analysis; the y axis shows FORGE2 enrichment (−log10 p-value) of the SNP set with DNase I hotspots for a range of tissue samples (significant samples in black). The highest ranked sample set (highest black points) shows the most significant enrichment also is for kidney samples, which are highly ranked for the top 249 SNPs associated with eGFR (taken from the GWAS catalog, downloaded 10 April 2020). c TF motif enrichment results for EA probes driving eFORGE tissue-specific enrichment signal: the x axis indicates TF motifs from TRANSFAC, JASPAR, Taipale/SELEX, and Uniprobe databases; the y axis shows eFORGE-TF enrichment (−log10 hypergeometric p-value) of the input DMP set with TF motifs overlapping open chromatin sites for fetal kidney samples. Enrichment values for each TF motif are colored according to BY FDR-corrected q-value. A number of TF motifs involved in kidney development overlap top EA probes including OSR1, OSR2, TBX1, and PAX2. d Aggregated eFORGE results for EA probes: the x axis indicates sets of the top ranked DMPs used in the analysis (each set contains 1000 DMPs); the y axis shows eFORGE enrichment (−log10 p-value) of each of the DMP sets with open chromatin sites for kidney (red) and other tissue samples (gray). The highest ranked probe set (set 1, left) shows the most significant enrichment for kidney samples, which remain highly ranked for probe sets 2–5, in decreasing order of study p-value

eGFR GWAS variants, meQTL CpGs, and tissue-specific DNase I hotspots

Previous reports and our analyses confirm that eGFR GWAS variants are enriched for kidney DNase I hotspots (Fig. 3b, Additional file 1: Fig. S4) [4, 47], indicating that genotypes might function through kidney-specific regulatory pathways, potentially converging with identified EWAS DMPs through mechanisms which are not fully understood. To explore this further, we used mQTLdb [39] to identify meQTL CpG targets of eGFR-associated GWAS SNPs obtained from the GWAS catalog [44, 51]. These significant meQTL CpGs linked to GWAS SNPs revealed significant overlap with eGFR EWAS sites in our study (p < 0.002, Fig. 4a) and presented significant enrichment for kidney, renal cortex, and renal pelvis, among other tissues (Fig. 4b, d, Additional file 1: Fig. S5). These results support a model in which meQTL CpGs linked to eGFR GWAS SNPs overlap with EWAS DMPs and tend to localize to kidney DNase I hotspots, with potential involvement in their regulatory action (Fig. 4c). To assess these findings in human kidney tissue, we analyzed meQTL CpGs of eGFR SNPs localizing to kidney DNase I hotspots. The meQTL CpGs were identified from DNAm data from 195 normal kidney tissue samples (acquired from the non-cancer-affected segment of elective nephrectomies or from kidney donors) as reported previously [40,41,42]. While kidney meQTL CpGs not associated with eGFR SNPs (nominal p-value > 0.95) showed no enrichment in kidney DNase I hotspots (Additional file 1: Fig. S6), the kidney meQTL CpGs with a nominal p-value < 0.05 were enriched in kidney DNase I hotspots (Additional file 1: Fig. S7) for the same GWAS SNPs. These results support the aforementioned findings using eGFR SNP-associated meQTL CpGs from mQTLdb [39]. Additionally, we evaluated the overlap between kidney meQTL CpGs and blood meQTL CpGs (mQTLdb) for our top EWAS DMPs. The majority (58.3%) of these kidney meQTL CpGs were also meQTL CpGs in blood.
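The overlap significance reported above rests on an empirical test against randomly drawn background SNP sets. The sketch below illustrates that kind of test in general terms; it is not the authors' code. The function names, the sampling universe, and the add-one convention for the empirical p value are assumptions, since the exact procedure is not specified in the text.

```python
import random

def simulate_background(universe, targets, set_size, n_sim, seed=0):
    """Count overlaps with `targets` for n_sim random SNP sets.

    universe: list of all candidate SNP identifiers to sample from.
    targets:  identifiers that count as a hit (e.g. SNPs whose meQTL CpG
              is also an EWAS site).
    """
    rng = random.Random(seed)
    target_set = set(targets)
    return [
        sum(1 for snp in rng.sample(universe, set_size) if snp in target_set)
        for _ in range(n_sim)
    ]

def empirical_p(observed, background_counts):
    """One-sided empirical p value with the add-one convention (an assumption)."""
    n_ge = sum(1 for c in background_counts if c >= observed)
    return (n_ge + 1) / (len(background_counts) + 1)
```

In the setting described above, `set_size` would be 249 and `n_sim` would be 1000, with the observed count compared against the resulting background distribution.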

Fig. 4

eGFR EWAS CpGs present a significant overlap with eGFR GWAS-driven meQTL effects. a Histogram of 1000 random background simulations (249 random SNPs each), for EWAS-meQTL overlap across the ARIES blood meQTL dataset. Two hundred forty-nine unique significant SNPs from the eGFR GWAS by Hellwege et al. yield 13 SNP-meQTL-EWAS DMP sites in the ARIES cohort (p = 2.0E−03, empirical test, red dot and arrow), while background SNP sets overlap a mean of 0.912 SNP-meQTL-EWAS sites. b Histogram of 1000 random background simulations (249 random SNPs each), for meQTL-kidney DNase I hotspot overlap across Roadmap Epigenomics “Kidney” sample datasets. Two thousand seven hundred thirty-three meQTL targets of 249 unique significant SNPs from the eGFR GWAS by Hellwege et al. overlap Roadmap kidney DNase I hotspots 519 times (p < 0.001, empirical test, red dot and arrow), while background SNP sets overlap Roadmap kidney DNase I hotspots a mean of 67.021 times (SD = 24.754). c Schematic showing the association of eGFR GWAS SNPs with meQTL target CpGs and eGFR EWAS CpGs (both in red text), some of which overlap kidney-specific DNase I hotspots (shown in blue; arrows indicate statistical association, not genomic contact). For comparison, a representation of a background SNP is shown. d Results from eFORGE analysis of significant ARIES meQTL CpGs associated with eGFR GWAS SNPs, indicating a higher-than-expected overlap with the kidney, renal cortex, and renal pelvis DNase-seq hotspots (for additional results, see Additional file 1: Fig. S5)


Main findings

This study used trans-ethnic and ethnic-specific analyses of multi-ethnic cohorts to identify and replicate DMPs associated with eGFR. Our main findings include replicated DMPs at 13 sites from trans-ethnic analyses and 1 DMP from AA-specific analyses, which may reflect the larger AA discovery sample compared to other ethnic groups. All associations were newly identified in this study, except for the ZNF20-ZNF788P locus, which was previously described in two separate eGFR EWAS, including a population-based cohort and an HIV-infected cohort [11, 52]. Several DMPs were associated with accessible chromatin sites in kidney tissue, suggesting a regulatory role for these sites. Overall, our study identified 12 previously unreported DMPs that replicated, in addition to 6 previously published DMPs for eGFR among our 93 discovery DMPs [11]. These findings support an association between DNAm and eGFR, reflecting a convergence of lifetime influences of genetic effects, lifestyle, behaviors, and environmental exposures [53,54,55].

Differences between ethnicities

Our approach of studying multi-ethnic groups identified DMPs from trans-ethnic and ethnic-specific analyses. This approach contrasts with a prior study that performed separate EA and AA analyses with cross-replication across these groups [11]. Across our two largest ethnic-specific samples (AA and EA), there was little overlap between DMPs from discovery analysis, although several discovery ethnic-identified DMPs did overlap with trans-ethnic findings (Fig. 1a). The DMP cg14871770, which is located between CYP2C9 and CYP2C19 (cytochrome P450 family 2 subfamily C members 9 and 19), was identified both in ethnic-specific and trans-ethnic analysis. The closest GWAS SNP to this DMP is rs4110517, a SNP associated with blood pressure identified in a multi-ethnic cohort [56]. CYP2C9 and CYP2C19 encode members of the cytochrome P450 superfamily of enzymes. Cytochrome P450 proteins are monooxygenases that catalyze many reactions involved in the synthesis of cholesterol, other steroids, and other lipids, and in drug metabolism. CYP2C19 genotype is associated with the metabolism of compounds influencing both renal function and hypertension [57]. The relevance of ethnic-specific findings to clinical phenotypes will need further evaluation in studies with larger samples inclusive of multiple ethnicities to define improved DNAm signatures for eGFR that may be unique to one single ancestry. Our findings suggest reduced utility of DNAm biomarkers for eGFR in diverse populations if discovery EWAS is performed in a single homogenous population.

Relevance of HSP90AA1 locus

Among other findings, we report a replicated trans-ethnic DMP (cg11789371) associated with eGFR that localizes to an intron of HSP90AA1 (heat shock protein 90 alpha family class A member 1), a gene expressed in podocytes, parietal epithelial cells, proximal tubular cells, endothelium, and mesangial cells in normal kidney tissue, with gene expression increasing in glomerulonephritis and acute kidney injury [46]. The gene product, HSP90, regulates renal blood flow and eGFR through nitric oxide metabolism and plays a role in protein folding [46]. HSP90 and other heat shock proteins are candidate drug targets for a variety of kidney diseases [46, 58]. Treatment with radicicol, an inhibitor of HSP90, has been shown to reduce eGFR in animal models [58]. DMP cg11789371 overlaps an accessible chromatin region in kidney cells, and a GeneHancer interacting site contacting the promoters of WDR20 and HSP90AA1. These and other different annotations point to a potential regulatory role in kidney tissue for this eGFR-associated DMP.

Integrative epigenomics and pathway analysis

We detected a significant overlap of eGFR GWAS SNP-associated meQTL CpGs with kidney DNase I hotspots. Additionally, our eGFR EWAS DMPs were enriched for kidney DNase I hotspots. These findings suggest potential links between the regulatory action of both genotypes and epigenetic DNAm elements. Importantly, these integrative epigenomic analyses considered all tissues available instead of only kidney tissue.
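As a rough illustration of what an overlap-enrichment test involves, the sketch below computes a one-sided hypergeometric p-value for the number of DMPs falling in a set of DNase I hotspots. This is a simplified stand-in and not the eFORGE procedure, which uses its own background-matched probe sets [32, 33]. The function name and all counts are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_background_in_hotspot,
                           n_dmps, n_dmps_in_hotspot):
    """Upper-tail probability P(X >= k) under a hypergeometric null.

    n_background            -- all array probes considered (N)
    n_background_in_hotspot -- probes overlapping a hotspot (K)
    n_dmps                  -- number of DMPs tested (n)
    n_dmps_in_hotspot       -- DMPs overlapping a hotspot (k)
    """
    N, K = n_background, n_background_in_hotspot
    n, k = n_dmps, n_dmps_in_hotspot
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k <= min(K, n)):
        raise ValueError("inconsistent counts")
    total = comb(N, n)
    # Sum the probability mass for k, k+1, ..., min(K, n) overlaps.
    tail = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1))
    return tail / total


if __name__ == "__main__":
    # Hypothetical toy numbers chosen only to exercise the function.
    p = hypergeom_enrichment_p(n_background=1000,
                               n_background_in_hotspot=100,
                               n_dmps=20,
                               n_dmps_in_hotspot=8)
    print(f"one-sided enrichment p-value: {p:.3g}")
```

A test of this kind treats probes as exchangeable, which ignores features such as CpG island context and gene annotation that dedicated tools account for when building their background sets.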

While some regions of the methylome show tissue specificity [59], EWAS have also shown that some DMPs are shared across different tissues, e.g., the AHRR locus DNAm patterns in response to smoking are shared across multiple tissues [60]. Pan-tissue findings for AHRR suggest similar underlying pathways in response to the same environmental stimulus [60]. Regarding discrepancies in trans-ethnic results, both genetic and environmental differences could be at play, potentially interacting with each other. In this context, our findings of both a kidney-specific DNase I hotspot and GWAS meQTL enrichment for a whole blood-based EWAS could be due to both genetic and environmental origins and warrant further research of kidney tissue DNAm in association with eGFR. Indeed, such results raise the intriguing hypothesis that tissue-specific enrichments observed separately in GWAS and EWAS might be related by the same genomic variants, thus aiding the integration of both approaches (Fig. 4a). It is important to highlight that both whole blood and kidney tissue eQTLs were obtained from individuals of European ethnicity [40,41,42,43].

The identification of eGFR DNAm signature-associated pathways is an important step towards characterizing epigenetic mechanisms for this physiologic trait and may provide clues to underlying mechanisms for CKD. Identified DMPs highlight an association with pathways of kidney development, which can influence nephron endowment at birth and subsequent CKD risk [48,49,50, 61]. Our in silico results were influenced by our DNAm findings in healthy adult kidney. Therefore, pathway results from this study support epigenetic effects during developmental windows with long-term influence on eGFR, which warrant further investigation.


This study is limited by the use of whole blood as the main tissue (chosen due to its availability). The discovery datasets included both the Illumina 450K and the EPIC 850K arrays, which contributed to differences in sample sizes and power to detect associations for some CpGs (Table 1). Post hoc power analyses suggest adequate power to detect DNAm differences of the range observed in the study (Additional file 2: Supplementary Methods). The methods for normalization of the DNAm beta values and study-specific quality control varied (Additional file 3: Table S2). However, we applied standardized protocols for data harmonization and statistical analyses in addition to stringent quality control as part of our meta-analyses. Our findings showed no heterogeneity of effects across studies (Additional file 1: Fig. S3), suggesting that results are robust to study-specific quality control and normalization procedures. Additionally, our reported DMPs replicated in independent samples, further validating our results. It is important to consider whether these DNAm sites associated with eGFR in blood are also applicable to effects in kidney tissue. We attempted to answer this by examining our identified meQTLs in normal kidney tissue in the TRANSLATE study. However, kidney tissue studies still have small sample sizes and lack ethnic diversity. While epigenomic mapping consortia such as Roadmap Epigenomics and ENCODE have made important steps to increase the free availability of a wide range of tissue and cell type-specific datasets, the important issue of including additional epigenomics mapping data sets for other ancestries remains to be addressed. It is important to highlight that we only observe eFORGE kidney enrichment for top EA probes. While this kidney-specific DNase I hotspot enrichment has been further confirmed by analyzing ranked DMPs in study p-value order (Fig. 3d), it is not apparent for DMPs from other ethnicities, or for DMPs for analysis comprising all ethnicities. DNase-seq datasets for these samples originate from the Roadmap Epigenomics consortium [34], which focused mainly on tissue samples obtained from EA individuals. Without datasets from diverse ethnic groups, it will be difficult to conclusively study inter-ethnic epigenomic variability or perform tissue-specific analyses for loci from GWAS and EWAS performed on individuals of non-European origin.
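The statement that effects showed no heterogeneity across studies can be made concrete with a standard check. The sketch below computes an inverse-variance pooled effect, Cochran's Q, and the I² statistic for one CpG from per-study effect estimates and standard errors. It assumes a fixed-effect framework; the text does not specify which heterogeneity statistic underlies Fig. S3, so this is an illustration only, and the function name and input values are hypothetical.

```python
def cochran_q_i2(betas, ses):
    """Return (pooled_beta, Q, I2) for per-study effects and standard errors.

    pooled_beta -- inverse-variance weighted mean effect
    Q           -- Cochran's Q, the weighted sum of squared deviations
    I2          -- share of variation attributed to heterogeneity (0 to 1)
    """
    if len(betas) != len(ses) or len(betas) < 2:
        raise ValueError("need matching inputs from at least two studies")
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I2 is truncated at zero when Q is below its degrees of freedom.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, q, i2


if __name__ == "__main__":
    # Hypothetical per-study estimates for a single CpG.
    pooled, q, i2 = cochran_q_i2(betas=[0.012, 0.015, 0.010],
                                 ses=[0.004, 0.006, 0.005])
    print(f"pooled={pooled:.4f}  Q={q:.3f}  I2={i2:.1%}")
```

Under the null hypothesis of homogeneous effects, Q follows approximately a chi-square distribution with one fewer degrees of freedom than the number of studies, so a small Q relative to that number is consistent with the absence of heterogeneity.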


We identified trans-ethnic and ethnic-specific differential DNAm positions, validated prior published associations, and showed that several eGFR DMPs identified in this study replicated in independent samples. We have also shown that some of the DMPs are meQTL CpGs, many of which are associated with pathways relevant for kidney tissue regulation and development. Our findings include a DMP at HSP90AA1, a gene involved in the regulation of eGFR in kidney tissue. Identification of trans-ethnic and ethnic-specific DMPs and elucidation of their potential functional impact are preliminary steps towards identifying disease-associated epigenetic mechanisms that are specific to a particular population or shared across different populations.

Availability of data and materials

Investigators interested in retrieving the controlled-access data at dbGap for WHI-BAA23 and MESA should apply using identifiers phs000200.v10.p3 and phs001416.v1.p, respectively (available online at and The summary results from this study are placed in dbGaP with accession number phs000930.v8.p1. While awaiting data release via dbGaP, data are available at Access to the WHI-EMPC DNA methylation dataset is available upon request to DNA methylation datasets from JHS and HyperGEN have been recently generated and upload to dbGap is ongoing. JHS data are available on request from and HyperGEN data are available from the corresponding author from Ammous et al. [62]. Currently, JHS and WHI datasets are available through a scientific review application process directed to each respective study publication and presentation committee. Data can be obtained from the coordinating center of WHI and JHS after signing a data use agreement with the study. For research projects that meet the rules for access, CATHGEN data are available from the CATHGEN Steering Committee. Relevant requests are to be sent to the contact person at the CATHGEN Steering Committee, Melissa Hurdle ( According to the terms of consent for GS participants, access to individual-level data (omics and phenotypes) must be reviewed by the GS Access Committee. Applications should be made to The source code used for the eFORGE analyses is publicly available at [32] and [33].


  1.

    GBD 2016 Causes of Death Collaborators. Global, regional, and national age-sex specific mortality for 264 causes of death, 1980–2016: a systematic analysis for the Global Burden of Disease Study 2016. Lancet. 2017;390:1151–210.

  2.

    Saran R, Robinson B, Abbott KC, Bragg-Gresham J, Chen X, Gipson D, Gu H, Hirth RA, Hutton D, Jin Y, Kapke A, Kurtz V, Li Y, McCullough K, Modi Z, Morgenstern H, Mukhopadhyay P, Pearson J, Pisoni R, Repeck K, Schaubel DE, Shamraj R, Steffick D, Turf M, Woodside KJ, Xiang J, Yin M, Zhang X, Shahinian V. US renal data system 2019 annual data report: epidemiology of kidney disease in the United States. Am J Kidney Dis. 2020;75(1):A6–7.

  3.

    Saran R, Robinson B, Abbott KC, Agodoa LYC, Bhave N, Bragg-Gresham J, Balkrishnan R, Dietrich X, Eckard A, Eggers PW, Gaipov A, Gillen D, Gipson D, Hailpern SM, Hall YN, Han Y, He K, Herman W, Heung M, Hirth RA, Hutton D, Jacobsen SJ, Jin Y, Kalantar-Zadeh K, Kapke A, Kovesdy CP, Lavallee D, Leslie J, McCullough K, Modi Z, Molnar MZ, Montez-Rath M, Moradi H, Morgenstern H, Mukhopadhyay P, Nallamothu B, Nguyen DV, Norris KC, O’Hare AM, Obi Y, Park C, Pearson J, Pisoni R, Potukuchi PK, Rao P, Repeck K, Rhee CM, Schrager J, Schaubel DE, Selewski DT, Shaw SF, Shi JM, Shieu M, Sim JJ, Soohoo M, Steffick D, Streja E, Sumida K, Tamura MK, Tilea A, Tong L, Wang D, Wang M, Woodside KJ, Xin X, Yin M, You AS, Zhou H, Shahinian V. US renal data system 2017 annual data report: epidemiology of kidney disease in the United States. Am J Kidney Dis. 2018;71(3):A7.

  4.

    Morris AP, Le TH, Wu H, Akbarov A, van der Most PJ, Hemani G, et al. Trans-ethnic kidney function association study reveals putative causal genes and effects on kidney-specific disease aetiologies. Nat Commun. 2019;10(1):29.

  5.

    Rakyan VK, Down TA, Balding DJ, Beck S. Epigenome-wide association studies for common human diseases. Nat Rev Genet. 2011;12(8):529–41.

  6.

    Taudt A, Colomé-Tatché M, Johannes F. Genetic sources of population epigenomic variation. Nat Rev Genet. 2016;17(6):319–32.

  7.

    Qiu C, Hanson RL, Fufaa G, Kobes S, Gluck C, Huang J, Chen Y, Raj D, Nelson RG, Knowler WC, Susztak K. Cytosine methylation predicts renal function decline in American Indians. Kidney Int. 2018;93(6):1417–31.

  8.

    Smyth LJ, McKay GJ, Maxwell AP, McKnight AJ. DNA hypermethylation and DNA hypomethylation is present at different loci in chronic kidney disease. Epigenetics. 2014;9(3):366–76.

  9.

    Wing MR, Devaney JM, Joffe MM, Xie D, Feldman HI, Dominic EA, Guzman NJ, Ramezani A, Susztak K, Herman JG, Cope L, Harmon B, Kwabi-Addo B, Gordish-Dressman H, Go AS, He J, Lash JP, Kusek JW, Raj DS, for the Chronic Renal Insufficiency Cohort (CRIC) Study. DNA methylation profile associated with rapid decline in kidney function: findings from the CRIC study. Nephrol Dial Transplant. 2014;29(4):864–72.

  10.

    Richard MA, Huan T, Ligthart S, Gondalia R, Jhun MA, Brody JA, Irvin MR, Marioni R, Shen J, Tsai PC, Montasser ME, Jia Y, Syme C, Salfati EL, Boerwinkle E, Guan W, Mosley TH Jr, Bressler J, Morrison AC, Liu C, Mendelson MM, Uitterlinden AG, van Meurs JB, Franco OH, Zhang G, Li Y, Stewart JD, Bis JC, Psaty BM, Chen YDI, Kardia SLR, Zhao W, Turner ST, Absher D, Aslibekyan S, Starr JM, McRae AF, Hou L, Just AC, Schwartz JD, Vokonas PS, Menni C, Spector TD, Shuldiner A, Damcott CM, Rotter JI, Palmas W, Liu Y, Paus T, Horvath S, O’Connell JR, Guo X, Pausova Z, Assimes TL, Sotoodehnia N, Smith JA, Arnett DK, Deary IJ, Baccarelli AA, Bell JT, Whitsel E, Dehghan A, Levy D, Fornage M, Heijmans BT, ’t Hoen PAC, van Meurs J, Isaacs A, Jansen R, Franke L, Boomsma DI, Pool R, van Dongen J, Hottenga JJ, van Greevenbroek MMJ, Stehouwer CDA, van der Kallen CJH, Schalkwijk CG, Wijmenga C, Zhernakova A, Tigchelaar EF, Slagboom PE, Beekman M, Deelen J, van Heemst D, Veldink JH, van den Berg LH, van Duijn CM, Hofman A, Uitterlinden AG, Jhamai PM, Verbiest M, Suchiman HED, Verkerk M, van der Breggen R, van Rooij J, Lakenberg N, Mei H, van Iterson M, van Galen M, Bot J, van ’t Hof P, Deelen P, Nooren I, Moed M, Vermaat M, Zhernakova DV, Luijk R, Bonder MJ, van Dijk F, Arindrarto W, Kielbasa SM, Swertz MA, van Zwet EW. DNA methylation analysis identifies loci for blood pressure regulation. Am J Hum Genet. 2017;101(6):888–902.

  11.

    Chu AY, Tin A, Schlosser P, Ko Y-A, Qiu C, Yao C, Joehanes R, Grams ME, Liang L, Gluck CA, Liu C, Coresh J, Hwang SJ, Levy D, Boerwinkle E, Pankow JS, Yang Q, Fornage M, Fox CS, Susztak K, Köttgen A. Epigenome-wide association studies identify DNA methylation associated with kidney function. Nat Commun. 2017;8(1):1286.

  12.

    Sikdar S, Joehanes R, Joubert BR, Xu C-J, Vives-Usano M, Rezwan FI, Felix JF, Ward JM, Guan W, Richmond RC, Brody JA, Küpers LK, Baïz N, Håberg SE, Smith JA, Reese SE, Aslibekyan S, Hoyo C, Dhingra R, Markunas CA, Xu T, Reynolds LM, Just AC, Mandaviya PR, Ghantous A, Bennett BD, Wang T, Consortium TBIOS, Bakulski KM, Melen E, Zhao S, Jin J, Herceg Z, Meurs J, Taylor JA, Baccarelli AA, Murphy SK, Liu Y, Munthe-Kaas MC, Deary IJ, Nystad W, Waldenberger M, Annesi-Maesano I, Conneely K, Jaddoe VWV, Arnett D, Snieder H, Kardia SLR, Relton CL, Ong KK, Ewart S, Moreno-Macias H, Romieu I, Sotoodehnia N, Fornage M, Motsinger-Reif A, Koppelman GH, Bustamante M, Levy D, London SJ. Comparison of smoking-related DNA methylation between newborns from prenatal exposure and adults from personal smoking. Epigenomics. 2019;11(13):1487–500.

  13.

    Design of the Women’s Health Initiative clinical trial and observational study. The Women’s Health Initiative study group. Control Clin Trials. 1998;19:61–109.

  14.

    Anderson GL, Manson J, Wallace R, Lund B, Hall D, Davis S, Shumaker S, Wang CY, Stein E, Prentice RL. Implementation of the Women’s Health Initiative study design. Ann Epidemiol. 2003;13(9):S5–17.

  15.

    Howard BV, Van Horn L, Hsia J, Manson JE, Stefanick ML, Wassertheil-Smoller S, et al. Low-fat dietary pattern and risk of cardiovascular disease: the Women’s Health Initiative Randomized Controlled Dietary Modification Trial. JAMA. 2006;295(6):655–66.

  16.

    Jackson RD, LaCroix AZ, Gass M, Wallace RB, Robbins J, Lewis CE, et al. Calcium plus vitamin D supplementation and the risk of fractures. N Engl J Med. 2006;354(7):669–83.

  17.

    Bild DE, Bluemke DA, Burke GL, Detrano R, Diez Roux AV, Folsom AR, Greenland P, Jacob DR Jr, Kronmal R, Liu K, Nelson JC, O'Leary D, Saad MF, Shea S, Szklo M, Tracy RP. Multi-ethnic study of atherosclerosis: objectives and design. Am J Epidemiol. 2002;156(9):871–81.

  18.

    Taylor HA, Wilson JG, Jones DW, Sarpong DF, Srinivasan A, Garrison RJ, et al. Toward resolution of cardiovascular health disparities in African Americans: design and methods of the Jackson Heart Study. Ethn Dis. 2005;15:S6–4–17.

  19.

    Akinyemiju T, Do AN, Patki A, Aslibekyan S, Zhi D, Hidalgo B, Tiwari HK, Absher D, Geng X, Arnett DK, Irvin MR. Epigenome-wide association study of metabolic syndrome in African-American adults. Clin Epigenetics. 2018;10(1):49.

  20.

    Smith BH, Campbell A, Linksted P, Fitzpatrick B, Jackson C, Kerr SM, Deary IJ, MacIntyre DJ, Campbell H, McGilchrist M, Hocking LJ, Wisely L, Ford I, Lindsay RS, Morton R, Palmer CNA, Dominiczak AF, Porteous DJ, Morris AD. Cohort Profile: Generation Scotland: Scottish Family Health Study (GS:SFHS). The study, its participants and their potential for genetic research on health and illness. Int J Epidemiol. 2013;42(3):689–700.

  21.

    Kraus WE, Granger CB, Sketch MH, Donahue MP, Ginsburg GS, Hauser ER, et al. A guide for a cardiovascular genomics biorepository: the CATHGEN experience. J Cardiovasc Transl Res. 2015;8(8):449–57.

  22.

    Inker LA, Schmid CH, Tighiouart H, Eckfeldt JH, Feldman HI, Greene T, Kusek JW, Manzi J, van Lente F, Zhang YL, Coresh J, Levey AS, CKD-EPI Investigators. Estimating glomerular filtration rate from serum creatinine and cystatin C. N Engl J Med. 2012;367(1):20–9.

  23.

    Teschendorff AE, Marabita F, Lechner M, Bartlett T, Tegner J, Gomez-Cabrero D, Beck S. A beta-mixture quantile normalization method for correcting probe design bias in Illumina Infinium 450 k DNA methylation data. Bioinformatics. 2013;29(2):189–96.

  24.

    Fortin J-P, Triche TJ, Hansen KD. Preprocessing, normalization and integration of the Illumina HumanMethylationEPIC array with minfi. Bioinformatics. 2017;33(4):558–60.

  25.

    Touleimat N, Tost J. Complete pipeline for Infinium(®) Human Methylation 450K BeadChip data processing using subset quantile normalization for accurate DNA methylation estimation. Epigenomics. 2012;4(3):325–41.

  26.

    Pidsley R, Y Wong CC, Volta M, Lunnon K, Mill J, Schalkwyk LC. A data-driven approach to preprocessing Illumina 450K methylation array data. BMC Genomics. 2013;14(1):293.

  27.

    Johnson WE, Li C, Rabinovic A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics. 2007;8(1):118–27.

  28.

    Chen Y, Lemire M, Choufani S, Butcher DT, Grafodatskaya D, Zanke BW, Gallinger S, Hudson TJ, Weksberg R. Discovery of cross-reactive probes and polymorphic CpGs in the Illumina Infinium HumanMethylation450 microarray. Epigenetics. 2013;8(2):203–9.

  29.

    Houseman EA, Accomando WP, Koestler DC, Christensen BC, Marsit CJ, Nelson HH, Wiencke JK, Kelsey KT. DNA methylation arrays as surrogate measures of cell mixture distribution. BMC Bioinformatics. 2012;13(1):86.

  30.

    Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 2006;38(8):904–9.

  31.

    Bibikova M, Barnes B, Tsan C, Ho V, Klotzle B, Le JM, et al. High density DNA methylation array with single CpG site resolution. Genomics. 2011;98(4):288–95.

  32.

    Breeze CE, Paul DS, van Dongen J, Butcher LM, Ambrose JC, Barrett JE, Lowe R, Rakyan VK, Iotchkova V, Frontini M, Downes K, Ouwehand WH, Laperle J, Jacques PÉ, Bourque G, Bergmann AK, Siebert R, Vellenga E, Saeed S, Matarese F, Martens JHA, Stunnenberg HG, Teschendorff AE, Herrero J, Birney E, Dunham I, Beck S. eFORGE: a tool for identifying cell type-specific signal in epigenomic data. Cell Rep. 2016;17(8):2137–50.

  33.

    Breeze CE, Reynolds AP, van Dongen J, Dunham I, Lazar J, Neph S, et al. eFORGE v2.0: updated analysis of cell type-specific signal in epigenomic data. Bioinformatics. 2019;35:4767–9.

  34.

    Roadmap Epigenomics Consortium, Kundaje A, Meuleman W, Ernst J, Bilenky M, Yen A, et al. Integrative analysis of 111 reference human epigenomes. Nature. 2015;518:317–30.

  35.

    Mi H, Muruganujan A, Casagrande JT, Thomas PD. Large-scale gene function analysis with the PANTHER classification system. Nat Protoc. 2013;8(8):1551–66.

  36.

    Bonder MJ, Luijk R, Zhernakova DV, Moed M, Deelen P, Vermaat M, et al. Disease variants alter transcription factor levels and methylation of their binding sites. Nat Genet. 2017;49(1):131–8.

  37.

    Zhernakova DV, Deelen P, Vermaat M, van Iterson M, van Galen M, Arindrarto W, van 't Hof P, Mei H, van Dijk F, Westra HJ, Bonder MJ, van Rooij J, Verkerk M, Jhamai PM, Moed M, Kielbasa SM, Bot J, Nooren I, Pool R, van Dongen J, Hottenga JJ, Stehouwer CDA, van der Kallen CJH, Schalkwijk CG, Zhernakova A, Li Y, Tigchelaar EF, de Klein N, Beekman M, Deelen J, van Heemst D, van den Berg LH, Hofman A, Uitterlinden AG, van Greevenbroek MMJ, Veldink JH, Boomsma DI, van Duijn CM, Wijmenga C, Slagboom PE, Swertz MA, Isaacs A, van Meurs JBJ, Jansen R, Heijmans BT, 't Hoen PAC, Franke L. Identification of context-dependent expression quantitative trait loci in whole blood. Nat Genet. 2017;49(1):139–45.

  38.

    Huan T, Joehanes R, Song C, Peng F, Guo Y, Mendelson M, Yao C, Liu C, Ma J, Richard M, Agha G, Guan W, Almli LM, Conneely KN, Keefe J, Hwang SJ, Johnson AD, Fornage M, Liang L, Levy D. Genome-wide identification of DNA methylation QTLs in whole blood highlights pathways for cardiovascular disease. Nat Commun. 2019;10(1):4267.

  39.

    Gaunt TR, Shihab HA, Hemani G, Min JL, Woodward G, Lyttleton O, Zheng J, Duggirala A, McArdle WL, Ho K, Ring SM, Evans DM, Davey Smith G, Relton CL. Systematic identification of genetic influences on methylation across the human life course. Genome Biol. 2016;17(1):61.

  40.

    Xu X, Eales JM, Akbarov A, Guo H, Becker L, Talavera D, Ashraf F, Nawaz J, Pramanik S, Bowes J, Jiang X, Dormer J, Denniff M, Antczak A, Szulinska M, Wise I, Prestes PR, Glyda M, Bogdanski P, Zukowska-Szczechowska E, Berzuini C, Woolf AS, Samani NJ, Charchar FJ, Tomaszewski M. Molecular insights into genome-wide association studies of chronic kidney disease-defining traits. Nat Commun. 2018;9(1):4800.

  41.

    Cancer Genome Atlas Research Network. Comprehensive molecular characterization of clear cell renal cell carcinoma. Nature. 2013;499(7456):43–9.

  42.

    Rowland J, Akbarov A, Eales J, Xu X, Dormer JP, Guo H, Denniff M, Jiang X, Ranjzad P, Nazgiewicz A, Prestes PR, Antczak A, Szulinska M, Wise IA, Zukowska-Szczechowska E, Bogdanski P, Woolf AS, Samani NJ, Charchar FJ, Tomaszewski M. Uncovering genetic mechanisms of kidney aging through transcriptomics, genomics, and epigenomics. Kidney Int. 2019;95(3):624–35.

  43.

    Tomaszewski M, Eales J, Denniff M, Myers S, Chew GS, Nelson CP, Christofidou P, Desai A, Büsst C, Wojnar L, Musialik K, Jozwiak J, Debiec R, Dominiczak AF, Navis G, van Gilst WH, van der Harst P, Samani NJ, Harrap S, Bogdanski P, Zukowska-Szczechowska E, Charchar FJ. Renal mechanisms of association between fibroblast growth factor 1 and blood pressure. J Am Soc Nephrol. 2015;26(12):3151–60.

  44.

    Hellwege JN, Velez Edwards DR, Giri A, Qiu C, Park J, Torstenson ES, Keaton JM, Wilson OD, Robinson-Cohen C, Chung CP, Roumie CL, Klarin D, Damrauer SM, DuVall SL, Siew E, Akwo EA, Wuttke M, Gorski M, Li M, Li Y, Gaziano JM, Wilson PWF, Tsao PS, O’Donnell CJ, Kovesdy CP, Pattaro C, Köttgen A, Susztak K, Edwards TL, Hung AM. Mapping eGFR loci to the renal transcriptome and phenome in the VA million veteran program. Nat Commun. 2019;10(1):3842.

  45.

    Ongen H, Buil A, Brown AA, Dermitzakis ET, Delaneau O. Fast and efficient QTL mapper for thousands of molecular phenotypes. Bioinformatics. 2016;32(10):1479–85.

  46.

    Chebotareva N, Bobkova I, Shilov E. Heat shock proteins and kidney disease: perspectives of HSP therapy. Cell Stress Chaperones. 2017;22(3):319–43.

  47.

    Wuttke M, Li Y, Li M, Sieber KB, Feitosa MF, Gorski M, et al. A catalog of genetic loci associated with kidney function from analyses of a million individuals. Nat Genet. 2019;51(6):957–72.

  48.

    Xu J, Liu H, Chai OH, Lan Y, Jiang R. Osr1 interacts synergistically with Wt1 to regulate kidney organogenesis. PLoS One. 2016;11(7):e0159597.

  49.

    Jiang H, Li L, Yang H, Bai Y, Jiang H, Li Y. Pax2 may play a role in kidney development by regulating the expression of TBX1. Mol Biol Rep. 2014;41(11):7491–8.

  50.

    Alam-Faruque Y, Hill DP, Dimmer EC, Harris MA, Foulger RE, Tweedie S, Attrill H, Howe DG, Thomas SR, Davidson D, Woolf AS, Blake JA, Mungall CJ, O’Donovan C, Apweiler R, Huntley RP. Representing kidney development using the gene ontology. Plos One. 2014;9(6):e99864.

  51.

    Welter D, MacArthur J, Morales J, Burdett T, Hall P, Junkins H, Klemm A, Flicek P, Manolio T, Hindorff L, Parkinson H. The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Res. 2014;42(D1):D1001–6.

  52.

    Chen J, Huang Y, Hui Q, Mathur R, Gwinn M, So-Armah K, Freiberg MS, Justice AC, Xu K, Marconi VC, Sun YV. Epigenetic associations with estimated glomerular filtration rate among men with human immunodeficiency virus infection. Clin Infect Dis. 2020;70(4):667–73.

  53.

    Collins AJ, Foley RN, Gilbertson DT, Chen S-C. United States Renal Data System public health surveillance of chronic kidney disease and end-stage renal disease. Kidney Int Suppl (2011). 2015;5:2–7.

  54.

    Xue JL, Eggers PW, Agodoa LY, Foley RN, Collins AJ. Longitudinal study of racial and ethnic differences in developing end-stage renal disease among aged medicare beneficiaries. J Am Soc Nephrol. 2007;18(4):1299–306.

  55.

    Collins AJ, Foley RN, Herzog C, Chavers B, Gilbertson D, Herzog C, et al. US Renal Data System 2012 annual data report. Am J Kidney Dis. 2013;61(A7):e1–476.

  56.

    Hoffmann TJ, Ehret GB, Nandakumar P, Ranatunga D, Schaefer C, Kwok P-Y, Iribarren C, Chakravarti A, Risch N. Genome-wide association analyses using electronic health records identify new loci influencing blood pressure variation. Nat Genet. 2017;49(1):54–64.

  57.

    Imamura CK, Furihata K, Okamoto S, Tanigawara Y. Impact of cytochrome P450 2C19 polymorphisms on the pharmacokinetics of tacrolimus when coadministered with voriconazole. J Clin Pharmacol. 2016;56(4):408–13.

  58.

    Ramírez V, Mejía-Vilet JM, Hernández D, Gamba G, Bobadilla NA. Radicicol, a heat shock protein 90 inhibitor, reduces glomerular filtration rate. Am J Physiol Renal Physiol. 2008;295(4):F1044–51.

  59.

    Lowe R, Slodkowicz G, Goldman N, Rakyan VK. The human blood DNA methylome displays a highly distinctive profile compared with other somatic tissues. Epigenetics. 2015;10(4):274–81.

  60.

    Tsai P-C, Glastonbury CA, Eliot MN, Bollepalli S, Yet I, Castillo-Fernandez JE, Carnero-Montoro E, Hardiman T, Martin TC, Vickers A, Mangino M, Ward K, Pietiläinen KH, Deloukas P, Spector TD, Viñuela A, Loucks EB, Ollikainen M, Kelsey KT, Small KS, Bell JT. Smoking induces coordinated DNA methylation and gene expression changes in adipose tissue with consequences for metabolic health. Clin Epigenetics. 2018;10(1):126.

  61.

    Barker DJP, Bagby SP, Hanson MA. Mechanisms of disease: in utero programming in the pathogenesis of hypertension. Nat Clin Pract Nephrol. 2006;2(12):700–7.

  62.

    Ammous F, Zhao W, Ratliff SM, Kho M, Shang L, Jones AC, Chaudhary NS, Tiwari HK, Irvin MR, Arnett DK, Mosley TH, Bielak LF, Kardia SLR, Zhou X, Smith J. Epigenome-wide association study identifies DNA methylation sites associated with target organ damage in older African Americans. Epigenetics. 2020:1–14.



The authors thank the CATHGEN Steering Committee, the faculty and staff of the Duke cardiac catheterization lab, and the CATHGEN participants for making this work possible. We are grateful to all the families who took part, the general practitioners and the Scottish School of Primary Care for their help in recruiting them, and the whole Generation Scotland team that includes interviewers, computer and laboratory technicians, clerical workers, research scientists, volunteers, managers, receptionists, healthcare assistants, and nurses. The authors also wish to thank the staff and participants of the JHS. The authors thank the WHI investigators and staff for their dedication and the study participants for making the program possible.


This study was supported by the National Institutes of Health R01-MD012765, R01-DK117445, and R21-HL140385 to NF. The TRANSLATE studies and MT are supported by British Heart Foundation project grants PG/17/35/33001 and PG/19/16/34270, Kidney Research UK grant RP_017_20180302, and Medical University of Silesia grants KNW-1-152/N/7/K and KNW-1-171/N/6/K. WT206194. DLM and REM are supported by Alzheimer’s Research UK major project grant ARUK-PG2017B-10. CH is supported by an MRC University Unit Programme Grant MC_UU_00007/10 (QTL in Health and Disease). SJL and MKL are supported by the Intramural Program of the NIH, National Institute of Environmental Health Sciences (ZO1 ES043012). SB acknowledges funding from the Wellcome Trust (218274/Z/19/Z). CEB and SIB are supported by the Intramural Research Program of the Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH.

The CATHGEN study was supported by NIH grants HL095987 and HL036587.

Generation Scotland (GS) received core support from the Chief Scientist Office of the Scottish Government Health Directorates (CZD/16/6) and the Scottish Funding Council (HR03006). Genotyping and DNA methylation profiling of the GS samples was carried out by the Genetics Core Laboratory at the Clinical Research Facility, University of Edinburgh, Edinburgh, Scotland, and was funded by the Medical Research Council UK and the Wellcome Trust (Wellcome Trust Strategic Award “STratifying Resilience and Depression Longitudinally” (STRADL; Reference 104036/Z/14/Z)).

The Multi-Ethnic Study of Atherosclerosis (MESA) projects are conducted and supported by the National Heart, Lung, and Blood Institute (NHLBI) in collaboration with MESA investigators. Support for MESA is provided by contracts 75N92020D00001, HHSN268201500003I, N01-HC-95159, 75N92020D00005, N01-HC-95160, 75N92020D00002, N01-HC-95161, 75N92020D00003, N01-HC-95162, 75N92020D00006, N01-HC-95163, 75N92020D00004, N01-HC-95164, 75N92020D00007, N01-HC-95165, N01-HC-95166, N01-HC-95167, N01-HC-95168, N01-HC-95169, UL1-TR-000040, UL1-TR-001079, and UL1-TR-001420. This work was also supported in part by the National Center for Advancing Translational Sciences, CTSI grant UL1TR001881, and the National Institute of Diabetes and Digestive and Kidney Disease Diabetes Research Center (DRC) grant DK063491 to the Southern California Diabetes Endocrinology Research Center. The TOPMed MESA Multi-Omics project was conducted by the University of Washington and LABioMed (HHSN2682015000031/HHSN26800004).

The Jackson Heart Study (JHS) is supported and conducted in collaboration with Jackson State University (HHSN268201800013I), Tougaloo College (HHSN268201800014I), the Mississippi State Department of Health (HHSN268201800015I), and the University of Mississippi Medical Center (HHSN268201800010I, HHSN268201800011I, and HHSN268201800012I) contracts from the National Heart, Lung, and Blood Institute (NHLBI) and the National Institute for Minority Health and Health Disparities (NIMHD). The views expressed in this manuscript are those of the authors and do not necessarily represent the views of the National Heart, Lung, and Blood Institute; the National Institutes of Health; or the US Department of Health and Human Services. The funders had no role in the design and conduct of the study; in the collection, analysis, and interpretation of the data; and in the preparation, review, or approval of the manuscript.

The WHI program is funded by the National Heart, Lung, and Blood Institute, National Institutes of Health, US Department of Health and Human Services, through contracts HHSN268201600018C, HHSN268201600001C, HHSN268201600002C, HHSN268201600003C, and HHSN268201600004C. A full listing of WHI investigators is available online.

The Hypertension Genetic Epidemiology Network (HyperGEN) Study is part of the National Heart, Lung, and Blood Institute (NHLBI) Family Blood Pressure Program; collection of the data represented here was supported by grants U01 HL054472, U01 HL054473, U01 HL054495, and U01 HL054509. The HyperGEN: Genetics of Left Ventricular Hypertrophy Study was supported by NHLBI grant R01 HL055673 with whole-genome sequencing made possible by supplement -18S1. The epigenetic data in HyperGEN was funded by an American Heart Association Cardiovascular Genome-Phenome Study Pathway Grant Award 15GPSPG23890000.

Author information





Advised or performed statistical analyses (CEB, AB, MKL, JCM, MDS, RJ, DLM, XX, AP, JME, APM, REM, SB, SIB, SJL). Study design (CEB, SJL, NF).

Generated data (YL, RPT, DVDB, EAW, LH, HJK, LR, LL, EL, PD, KE, WEK, SS, HKT, XJ, FJC, AAB, SSR, MI, DKA, ERH, JIR, AC, CH, SH, MT). CEB and NF wrote the paper with contributions from all other authors. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Charles E. Breeze or Nora Franceschini.

Ethics declarations

Ethics approval and consent to participate

All participants provided informed written consent to participate in these studies, and studies were approved by the Institutional Review Boards at each recruiting site.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Supplementary Figures. Figures S1 to S7. Study design, QQ plots, Manhattan plots, forest plots, and eFORGE/FORGE2 analyses.

Additional file 2: Supplementary Methods. Description of the constituent cohorts and studies, in addition to power analyses.

Additional file 3: Supplementary Tables. Tables S1 to S5. Descriptive characteristics, quality control and processing, and results.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. A copy of this licence is available from Creative Commons. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Breeze, C.E., Batorsky, A., Lee, M.K. et al. Epigenome-wide association study of kidney function identifies trans-ethnic and ethnic-specific loci. Genome Med 13, 74 (2021).

  • Epigenetic
  • Kidney function
  • Gene regulation
  • Kidney development
  • DNA methylation