- Open Access
Elevated polygenic burden for autism is associated with differential DNA methylation at birth
- Eilis Hannon1,
- Diana Schendel2,
- Christine Ladd-Acosta3, 4,
- Jakob Grove5, 6, 7, 8,
- iPSYCH-Broad ASD Group,
- Christine Søholm Hansen6, 9, 10,
- Shan V. Andrews3, 4,
- David Michael Hougaard6, 9,
- Michaeline Bresnahan11,
- Ole Mors6, 13,
- Mads Vilhelm Hollegaard6, 9,
- Marie Bækvad-Hansen6, 9,
- Mady Hornig11, 12,
- Preben Bo Mortensen6, 14, 15, 16,
- Anders D. Børglum5, 6, 7,
- Thomas Werge6, 10, 17,
- Marianne Giørtz Pedersen6, 13, 16,
- Merete Nordentoft6, 18,
- Joseph Buxbaum19,
- M. Daniele Fallin4, 11, 20, 21,
- Jonas Bybjerg-Grauholm6, 9,
- Abraham Reichenberg19 and
- Jonathan Mill1
© The Author(s). 2018
- Received: 13 November 2017
- Accepted: 20 February 2018
- Published: 28 March 2018
Autism spectrum disorder (ASD) is a severe neurodevelopmental disorder characterized by deficits in social communication and restricted, repetitive behaviors, interests, or activities. The etiology of ASD involves both inherited and environmental risk factors, with epigenetic processes hypothesized as one mechanism by which both genetic and non-genetic variation influence gene regulation and pathogenesis. The aim of this study was to identify DNA methylation biomarkers of ASD detectable at birth.
We quantified neonatal methylomic variation in 1263 infants—of whom ~ 50% went on to subsequently develop ASD—using DNA isolated from archived blood spots taken shortly after birth. We used matched genotype data from the same individuals to examine the molecular consequences of ASD-associated genetic risk variants, identifying methylomic variation associated with elevated polygenic burden for ASD. In addition, we performed DNA methylation quantitative trait loci (mQTL) mapping to prioritize target genes from ASD GWAS findings.
We identified robust epigenetic signatures of gestational age and prenatal tobacco exposure, confirming the utility of DNA methylation data generated from neonatal blood spots. Although we did not identify specific loci showing robust differences in neonatal DNA methylation associated with later ASD, there was a significant association between increased polygenic burden for autism and methylomic variation at specific loci. Each unit of elevated ASD polygenic risk score was associated with a mean increase in DNA methylation of 0.14% at two CpG sites located proximal to a robust GWAS signal for ASD on chromosome 8.
This study is the largest analysis of DNA methylation in ASD undertaken and the first to integrate genetic and epigenetic variation at birth. We demonstrate the utility of using a polygenic risk score to identify molecular variation associated with disease, and of using mQTL to refine the functional and regulatory variation associated with ASD risk variants.
- DNA methylation
- Genome-wide association study (GWAS)
- Epigenome-wide association study (EWAS)
- DNA methylation quantitative trait loci (mQTL)
- Polygenic risk score
- Prenatal smoking
Autism spectrum disorder (ASD) defines a group of complex neurodevelopmental disorders marked by deficits in social communication and restricted, repetitive behaviors, interests, or activities. ASD affects ~ 1–2% of the population, and confers severe lifelong disability [2–4]. Quantitative genetic studies indicate that ASD is highly heritable [5, 6], although population-based epidemiologic studies of environmental risks and ASD liability modeling using family designs also indicate that environmental factors are important. Genetic studies have shown that autism risk is strongly associated with both rare inherited and de novo DNA sequence variants [8–11]. In contrast, the identification of common genetic variants associated with ASD using genome-wide association studies (GWAS) has proven harder than for other complex neuropsychiatric traits such as schizophrenia, at least in part due to a lack of large sample datasets. Recent collaboration between the Psychiatric Genomics Consortium autism workgroup (PGC-AUT) and the Lundbeck Foundation Initiative for Integrative Psychiatric Research (iPSYCH) has greatly expanded the number of ASD cases with GWAS data, enabling the identification of three genome-wide significant associations for ASD and evidence for a substantial polygenic component in signals falling below the stringent genome-wide significance threshold. None of the three ASD-associated loci are predicted to result in coding changes or altered protein structure; instead they are hypothesized to influence gene regulation. Previous studies of other neurodevelopmental disorders have reported an enrichment of disease-associated variation in regulatory domains, including enhancers and regions of open chromatin.
Epigenetic variation induced by non-genetic exposures has been hypothesized to be one mechanism by which environmental factors can affect risk for ASD [15, 16]. Recent studies have provided initial evidence for autism-associated epigenetic variation in both brain and peripheral tissues [17–22], although these analyses have been undertaken on relatively small numbers of samples with limited statistical power. Existing analyses have assessed epigenetic variation in samples collected after a diagnosis of ASD has been assigned and are likely to be confounded by factors such as smoking [23–25], medication [26, 27], other environmental toxins , and reverse causation . Furthermore, they have not investigated the role of genetic variation in mediating associations between epigenetic variation and ASD. The integration of genetic and epigenetic data will facilitate a better understanding of the molecular mechanisms involved in autism, especially given the high heritability of ASD and recent data showing how the epigenome can be directly influenced by genetic variation [30–33]. For example, we have previously demonstrated the potential for using polygenic risk scores (PRS)—defined as the sum of trait-associated alleles across many genetic loci, weighted by GWAS effect sizes—as disease biomarkers with utility for exploring the molecular genomic mechanisms involved in disease pathogenesis . Of note, PRS-associated epigenetic variation is potentially less affected by factors associated with the disease itself, which can confound case–control analyses.
In this study, we quantified DNA methylation in 1316 individuals (comprising equal numbers of ASD cases and matched controls, 50% male/female) using DNA samples isolated from neonatal blood spots collected proximal to birth (mean = 6.08 days; standard deviation (sd) = 3.24 days; Additional file 1: Figure S1). Known epigenetic signatures for gestational and chronological age [35, 36], and exposure to maternal smoking during pregnancy, were used to confirm the robust nature of genome-wide DNA methylation data generated from neonatal blood spots. Matched genome-wide single nucleotide polymorphism (SNP) genotyping data from the same individuals enabled us to undertake an integrated genetic–epigenetic analysis of ASD, exploring the extent to which methylomic variation at birth is associated with elevated polygenic burden for ASD. Finally, we generated an extensive database of DNA methylation quantitative trait loci (mQTL) in neonatal blood samples, which were used to characterize the molecular consequences of genetic variants associated with ASD.
Overview of the MINERvA cohort
Denmark has a comprehensive neonatal screening program which is used to test for inborn errors of metabolism, hypothyroidism, and other treatable disorders. Neonatal blood is collected on standard Guthrie cards and residual material is stored within the Danish Neonatal Screening Biobank. The reasons for storing the samples, in order of priority, are: (1) diagnosis and treatment of congenital disorders, (2) diagnostic use later in infancy after informed consent, (3) legal use after court order, (4) research projects pending approval by the Scientific Ethical Committee System in Denmark, The Danish Data Protection Agency, and the NBS-Biobank Steering Committee. Thus, research is possible assuming sufficient material remains for the preceding priorities. Cases and controls were selected from the iPSYCH case–control sample, which has been recently described. Briefly, the iPSYCH study population comprises all singletons born in Denmark between May 1st 1981 and December 31st 2005, who were alive and residing in Denmark at their first birthday and had a known mother. iPSYCH ASD cases comprise all children in the study population with an ASD diagnosis reported before December 31st 2012. iPSYCH controls comprise 30,000 persons randomly selected from the study population (about 2% of the total study population).
The MINERvA study profiled a subsample of 1316 iPSYCH samples, including an equal number of ASD cases and controls that were selected using the following criteria. Cases were born between 1998 and 2002, with both parents born in Denmark themselves. We selected a 1:1 male to female ratio (i.e., by “oversampling” ASD females). Cases and controls were excluded if they had a reported diagnosis (before December 31st 2012) of select known genetic disorders: Down syndrome, Fragile X, Angelman, Prader-Willi, Zellweger, Williams, tuberous sclerosis, Rett, Tourette, neurofibromatosis, Duchenne, Cornelia de Lange, DiGeorge, Smith-Lemli-Opitz, Klinefelter. In addition, controls were excluded if they had died or emigrated from Denmark before December 31st 2012, or had any reported psychiatric diagnosis. Eligible controls were individually matched to cases on sex, month of birth (month before, same month, or month after case month), and year of birth. Among the controls fulfilling these criteria, additional matching criteria were applied as closely as possible with regard to gestational age (in weeks) and the same urbanicity level of maternal residence at time of birth as cases. All perinatal data used for case–control matching, plus additional information on birth weight and maternal smoking, were obtained from the Danish Medical Birth Register or the Central Person Register. Detailed maternal smoking data were used to generate a binary variable indicating whether the mother smoked during pregnancy or not. All diagnoses used for ASD case identification and case/control exclusions were obtained from the Danish Psychiatric Central Research Register (DPCRR) and Danish National Patient Register (DNPR).
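The matching logic described above can be sketched roughly as follows. This is an illustration only, not the study's actual selection code: the `Person` record, the function names, the choice to rank gestational age ahead of urbanicity, and sampling without replacement are all assumptions, and the one-month window is not wrapped across calendar years because the text also requires the same birth year.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Person:
    """Hypothetical record holding only the fields used for matching."""
    pid: str
    sex: str
    birth_year: int
    birth_month: int  # 1-12
    gest_weeks: int
    urbanicity: int   # categorical level of maternal residence


def is_eligible(case: Person, control: Person) -> bool:
    """Hard criteria: same sex, same birth year, birth month within one month."""
    return (
        control.sex == case.sex
        and control.birth_year == case.birth_year
        and abs(control.birth_month - case.birth_month) <= 1
    )


def pick_control(case: Person, pool: List[Person]) -> Optional[Person]:
    """Pick the eligible control closest on the soft criteria and remove it from the pool."""
    eligible = [c for c in pool if is_eligible(case, c)]
    if not eligible:
        return None
    # Assumed ordering: closest gestational age first, then same urbanicity level.
    best = min(
        eligible,
        key=lambda c: (abs(c.gest_weeks - case.gest_weeks), c.urbanicity != case.urbanicity),
    )
    pool.remove(best)  # assumed: each control is matched to one case only
    return best
```

Calling `pick_control` once per case against a shared pool gives 1:1 matching; a case with no eligible control returns `None` and would need separate handling.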
In Denmark, children and adolescents suspected of ASD or other mental or behavioral disorders are referred by general practitioners or school psychologists to a child and adolescent psychiatric department for a multidisciplinary evaluation, and their conditions are diagnosed by a child and adolescent psychiatrist. Registry reporting is done only by psychiatrists following mandatory training in the use of the World Health Organization International Classification of Diseases (ICD). The following ICD-10 diagnosis codes were used: ASD, F84.0, F84.1, F84.5, F84.8, F84.9; any psychiatric disorder, F00–F99. Reported diagnoses for the conditions used as exclusions were obtained from the DNPR, which holds all data on in- and out-patient diagnoses given at discharge from somatic wards in all hospitals and clinics since 1995. Additional file 2: Table S1 gives a full overview of relevant diagnosis codes. The MINERvA study was approved by the Regional Scientific Ethics Committee in Denmark and the Danish Data Protection Agency.
DNA methylation profiling in MINERvA
Neonatal dried blood spot samples were retrieved from the Danish Neonatal Screening Biobank, within the Danish National Biobank, as part of the iPSYCH study. Neonatal DNA extractions and DNA methylation quantification were performed at the Statens Serum Institut (SSI, Copenhagen, Capital Region, Denmark), building on a previously described protocol. Briefly, from each dried blood spot sample two disks of 3.2 mm were used with the Extract-N-Amp Blood PCR kit (Sigma-Aldrich, St. Louis, USA) and eluted in 200 μL buffer. The isolated genomic DNA (160 μL) was converted with sodium bisulfite using the EZ-96 DNA Methylation Kit (Zymo Research, California, USA). DNA methylation was quantified across the genome using the Infinium HumanMethylation450k array (“450 K array”; Illumina, California, USA) and a modified protocol as previously described. Fully methylated and unmethylated control samples were included on each plate throughout each stage of processing.
MINERvA Illumina 450 K array data pre-processing and quality control
Signal intensities for 1316 neonatal blood samples, 14 fully methylated control samples, and 14 fully unmethylated control samples were imported into the R programming environment using the methylumIDAT() function in the methylumi package. Our stringent quality control (QC) pipeline included the following steps: 1) checking methylated and unmethylated signal intensities and excluding samples where either the median methylated or unmethylated intensity values were < 2500; 2) using the ten control probes to ensure the sodium bisulfite conversion was successful, excluding any samples with a median score < 80; 3) confirming that the fully methylated and fully unmethylated control samples were in the correct location on each plate; 4) using the 65 SNP genotyping probes on the array to confirm no duplicate samples; 5) multidimensional scaling of data from probes on the X and Y chromosomes separately to confirm reported gender; 6) comparing genotype data for up to 65 SNP probes on the 450 K array with SNP array data; 7) using the pfilter() function in wateRmelon to exclude samples with more than 1% of probes characterized by a detection P value > 0.05, in addition to probes characterized by > 1% of samples having a detection P value > 0.05. In total, 1263 samples (96.0%) passed all QC steps and were included in subsequent analyses. Normalization of the DNA methylation data was performed using the dasen() function in the wateRmelon package.
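As an illustration of the sample-level exclusions in steps 1, 2, and 7, the thresholds quoted above can be expressed as a simple filter. This is a minimal sketch in Python rather than the R pipeline actually used; the function name and argument layout are hypothetical, and it does not reproduce methylumi or wateRmelon behavior (for example, the probe-level filtering in step 7 is omitted).

```python
from statistics import median
from typing import Sequence


def sample_passes_qc(
    meth: Sequence[float],              # methylated signal intensities for one sample
    unmeth: Sequence[float],            # unmethylated signal intensities for one sample
    bisulfite_scores: Sequence[float],  # conversion scores from the control probes
    detection_p: Sequence[float],       # per-probe detection P values
    min_intensity: float = 2500.0,
    min_bisulfite: float = 80.0,
    p_cutoff: float = 0.05,
    max_failed_fraction: float = 0.01,
) -> bool:
    """Return True if the sample survives the three sample-level thresholds."""
    # Step 1: median methylated and unmethylated intensities must both reach 2500.
    if median(meth) < min_intensity or median(unmeth) < min_intensity:
        return False
    # Step 2: median bisulfite conversion score must reach 80.
    if median(bisulfite_scores) < min_bisulfite:
        return False
    # Step 7 (sample part): no more than 1% of probes with detection P > 0.05.
    failed = sum(1 for p in detection_p if p > p_cutoff)
    return failed / len(detection_p) <= max_failed_fraction
```

Steps 3 to 6 (control placement, duplicate detection, sex checks, and genotype concordance) depend on plate layouts and external genotype data and are not covered by this sketch.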
SNP genotyping and derivation of ASD polygenic risk scores
DNA was extracted at SSI as above and whole genome amplified in triplicate using the REPLI-g kit (Qiagen, Hilden, Germany). The triplicates were pooled and then quantified using Quant-iT picogreen (Invitrogen, California, USA). Samples were genotyped at the Broad Institute (Boston, Massachusetts, USA) using the Infinium PsychChip v1.0 array (Illumina, San Diego, California, USA) using a standard protocol. Phasing and imputation were done using SHAPEIT and IMPUTE2 with haplotypes from the 1000 Genomes Project, phase 3 [45, 46] as described previously. ASD polygenic risk scores (PRSs) were generated as a weighted sum of associated variants as previously described. Briefly, results from the largest autism GWAS available, from a combined effort by the Psychiatric Genomics Consortium (PGC) and iPSYCH, were used to select genetic variants and provide weights. As the MINERvA cohort is a subset of the broader iPSYCH cohort we used GWAS results excluding MINERvA samples, so that there was no overlap between the training cohort and the test cohort. Ten different significance thresholds (pT) from 5 × 10^-8 to 1 were used to select sets of genetic variants, which were linkage disequilibrium (LD) clumped using plink with the settings --clump-p1 1 --clump-p2 1 --clump-r2 0.1 --clump-kb 500 to generate PRSs.
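The weighted-sum definition of a PRS can be made concrete with a short sketch. It assumes LD clumping has already been applied, that genotypes are coded as risk-allele counts or dosages, and that missing genotypes are simply skipped; the function name and SNP identifiers are invented for illustration, and this is not the scoring code used in the study.

```python
from typing import Dict, Optional, Tuple


def polygenic_risk_score(
    genotypes: Dict[str, Optional[float]],  # SNP id -> allele count/dosage, None if missing
    weights: Dict[str, float],              # SNP id -> GWAS effect size for the counted allele
    p_values: Dict[str, float],             # SNP id -> GWAS P value
    p_threshold: float,                     # the pT used to select variants
) -> Tuple[float, int]:
    """Return (score, number of non-missing genotypes contributing to the score)."""
    score = 0.0
    n_used = 0
    for snp, beta in weights.items():
        if p_values[snp] > p_threshold:
            continue  # variant not selected at this threshold
        dosage = genotypes.get(snp)
        if dosage is None:
            continue  # assumed handling: missing genotypes contribute nothing
        score += beta * dosage
        n_used += 1
    return score, n_used


# Toy data with made-up SNP names and effect sizes.
weights = {"snpA": 0.10, "snpB": -0.05, "snpC": 0.20}
p_values = {"snpA": 1e-9, "snpB": 0.01, "snpC": 1e-9}
person = {"snpA": 2, "snpB": 1, "snpC": None}

strict_score, strict_n = polygenic_risk_score(person, weights, p_values, 5e-8)
relaxed_score, relaxed_n = polygenic_risk_score(person, weights, p_values, 1.0)
```

Returning the count of contributing genotypes alongside the score is convenient because that count can then be carried forward as a covariate in downstream models.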
All statistical analyses were performed using the R statistical environment version 3.2.2. To test the validity and robustness of our blood spot DNA methylation measures, we implemented two DNA methylation clock algorithms to derive estimates for both age in years and gestational age in weeks for each sample. In addition, for each sample, we computed a score for prenatal exposure to maternal smoking using DNA methylation data as previously described by Elliott et al. To identify DNA methylation sites associated with ASD status in the MINERvA discovery dataset, a linear model was fitted for each DNA methylation site with DNA methylation as the dependent variable, case/control status as an independent variable, and a set of possible confounders as covariates: sex, experimental array number, urbanicity level, birth month, birth year, gestational age, smoking, and cell composition variables estimated using the Houseman algorithm with a reference dataset for whole blood [49, 50]. Regional analysis to identify differentially methylated regions (DMRs) spanning multiple DNA methylation sites was performed using a sliding-window approach as previously described. Subsequent replication and meta-analysis were performed using summary statistics available from two US-based studies: the Study to Explore Early Development (SEED) and the Simons Simplex Collection (SSC). Meta-analysis to combine the epigenome-wide association study (EWAS) results from the MINERvA, SEED, and SSC studies was performed for DNA methylation loci present in at least two of the three studies. Data quality control, normalization, and ASD EWAS analysis were performed separately for each of the replication cohorts. A complete description of the SEED and SSC datasets can be found elsewhere. The P values from the three independent EWAS analyses were combined using Fisher’s method, focusing on DMPs where the direction of effect was consistent across all studies.
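For readers unfamiliar with Fisher’s method, the combination step can be sketched as below. Fisher’s statistic is −2 Σ ln(p), which follows a chi-square distribution with 2k degrees of freedom for k independent P values; because the degrees of freedom are even, the tail probability has a closed form and no statistics library is needed. The function names are hypothetical and the direction check is one plausible reading of "consistent across all studies", not the authors' code.

```python
import math
from typing import Sequence


def fisher_combined_p(p_values: Sequence[float]) -> float:
    """Combine k independent P values using Fisher's method."""
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # half of X = -2 * sum(ln p)
    # Chi-square survival function with 2k df: exp(-x/2) * sum_{i<k} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def consistent_direction(effects: Sequence[float]) -> bool:
    """True when every study reports an effect with the same sign."""
    return all(e > 0 for e in effects) or all(e < 0 for e in effects)
```

A site measured in only two of the three studies would simply be passed in with two P values, matching the requirement that loci be present in at least two studies.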
To identify DNA methylation sites associated with elevated autism polygenic risk burden, a linear model was used with DNA methylation as the dependent variable and ASD PRS, the number of non-missing genotypes contributing to the PRS, the first five genetic principal components, sex, experimental array number, six cell composition variables, smoking score, gestational age, and birth weight included as independent variables as described above. DNA methylation sites significantly associated with either ASD case–control status or ASD PRS were identified at an experiment-wide significance threshold of P < 1 × 10^-7, which is corrected for the number of DNA methylation sites profiled on the 450 K array.
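Written out, the per-site PRS model described above might take the following form. The notation is introduced here purely for illustration and does not appear in the original analysis: M_ij is DNA methylation for individual i at site j, N_i is the number of non-missing genotypes contributing to the PRS, PC_ik are the genetic principal components, and c_i collects the remaining covariates (sex, experimental array number, six cell composition variables, smoking score, gestational age, and birth weight).

```latex
\[
M_{ij} = \alpha_j
       + \beta_j\,\mathrm{PRS}_i
       + \gamma_j\,N_i
       + \sum_{k=1}^{5} \delta_{jk}\,\mathrm{PC}_{ik}
       + \boldsymbol{\theta}_j^{\top}\mathbf{c}_i
       + \varepsilon_{ij}
\]
```

Under this reading, a site j is reported as PRS-associated when the test of β_j = 0 gives P < 1 × 10^-7; the case–control model has the same structure with case/control status in place of the PRS and the covariate set listed for that analysis.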
DNA mQTL and co-localization analyses
All DNA methylation sites located within 250 kb of the three genome-wide significant genetic variants identified in the PGC-AUT GWAS were identified, and cis (defined as a 500-kb window) mQTL analysis was performed using the 1257 samples within MINERvA that had both DNA methylation and imputed genotype data. mQTL were identified using an additive linear model to test if the number of alleles (coded 0, 1, or 2) predicted DNA methylation at each site, including covariates for sex and the first five principal components from the genotype data, fitted using the MatrixEQTL package. Co-localization analysis was performed for each DNA methylation site as previously described using the R coloc package (http://cran.r-project.org/web/packages/coloc). From both the PGC-AUT GWAS data and our mQTL results we input the regression coefficients, their variances, and SNP minor allele frequencies; the prior probabilities were left at their default values. This methodology quantifies the support across the results of each GWAS for five hypotheses by calculating the posterior probabilities, denoted as PPi for hypothesis Hi.
H0: there exist no causal variants for either trait;
H1: there exists a causal variant for one trait only, ASD;
H2: there exists a causal variant for one trait only, DNA methylation;
H3: there exist two distinct causal variants, one for each trait;
H4: there exists a single causal variant common to both traits.
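To show how posterior probabilities for the five hypotheses above can be obtained from per-SNP regression coefficients and variances, here is a rough Python sketch of the approximate Bayes factor calculation that underlies co-localization methods of this kind. It is not the coloc package itself: the prior standard deviations (0.2 for the case–control trait, 0.15 for the quantitative trait, which presumes methylation scaled to unit variance) and the prior probabilities (1e-4, 1e-4, 1e-5) are assumed values chosen for illustration, and the toy effect sizes are invented.

```python
import math
from typing import List, Sequence, Tuple


def log_sum(xs: Sequence[float]) -> float:
    """Numerically stable log(sum(exp(x)))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def log_diff(a: float, b: float) -> float:
    """log(exp(a) - exp(b)); -inf when the difference is not positive."""
    m = max(a, b)
    d = math.exp(a - m) - math.exp(b - m)
    return m + math.log(d) if d > 0 else float("-inf")


def log_abf(beta: float, var: float, prior_sd: float) -> float:
    """Approximate log Bayes factor for one SNP (Wakefield-style)."""
    r = prior_sd ** 2 / (prior_sd ** 2 + var)
    return 0.5 * (math.log(1 - r) + r * beta * beta / var)


def coloc_posteriors(
    trait1: Sequence[Tuple[float, float]],  # (beta, variance) per SNP, e.g. ASD GWAS
    trait2: Sequence[Tuple[float, float]],  # (beta, variance) per SNP, e.g. mQTL
    sd1: float = 0.2, sd2: float = 0.15,    # assumed prior effect-size sds
    p1: float = 1e-4, p2: float = 1e-4, p12: float = 1e-5,  # assumed priors
) -> List[float]:
    """Return [PP0, PP1, PP2, PP3, PP4] for SNPs listed in the same order."""
    l1 = [log_abf(b, v, sd1) for b, v in trait1]
    l2 = [log_abf(b, v, sd2) for b, v in trait2]
    l12 = [a + b for a, b in zip(l1, l2)]
    log_h = [
        0.0,                                   # H0: no causal variant
        math.log(p1) + log_sum(l1),            # H1: trait 1 only
        math.log(p2) + log_sum(l2),            # H2: trait 2 only
        math.log(p1) + math.log(p2)            # H3: two distinct causal variants
        + log_diff(log_sum(l1) + log_sum(l2), log_sum(l12)),
        math.log(p12) + log_sum(l12),          # H4: one shared causal variant
    ]
    total = log_sum(log_h)
    return [math.exp(h - total) for h in log_h]


# Toy region of three SNPs: both traits peak at the first SNP.
gwas = [(0.5, 0.005), (0.01, 0.005), (0.0, 0.005)]
mqtl_shared = [(0.4, 0.004), (0.0, 0.004), (0.01, 0.004)]
pp_shared = coloc_posteriors(gwas, mqtl_shared)

# Same GWAS signal, but the mQTL signal sits at the third SNP instead.
mqtl_distinct = [(0.0, 0.004), (0.01, 0.004), (0.4, 0.004)]
pp_distinct = coloc_posteriors(gwas, mqtl_distinct)
```

In the first toy region most of the posterior mass falls on H4, and in the second on H3, which is the contrast the hypotheses are designed to separate.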
Robust epigenetic signatures of gestational age and prenatal tobacco exposure validate DNA methylation data generated from neonatal blood spots
Characteristics of samples included in the MINERvA cohort. Variables summarized: birth year (%); gestational age (mean (sd)); urbanicity level (2: suburb of the capital; 3: municipalities having a town with more than 100,000 inhabitants; 4: municipalities having a town with between 10,000 and 100,000 inhabitants; 5: other municipalities in Denmark, where the largest town has fewer than 10,000 inhabitants); time to sampling (mean (sd)); maternal age (mean (sd)); paternal age (mean (sd)); maternal smoking during pregnancy (%): smoked at any time; maternal smoking amount during pregnancy (%): 5 or fewer, 6–10, 11–20, or 21 or more cigarettes per day; birth weight (mean (sd)).
Methylomic variation in perinatal blood is not significantly associated with childhood autism
Our initial analysis focused on identifying neonatal blood DNA methylation differences among MINERvA neonates who went on to later develop a childhood diagnosis of ASD. No global differences in DNA methylation (estimated by averaging across all probes on the array included in our analysis) were identified between ASD patients (N = 629) and controls (N = 634) (ASD mean = 50.0%, ASD sd = 0.0811%; controls mean = 50.0%, controls sd = 0.0917%; t-test P = 0.695). Using a linear model to identify DNA methylation differences in ASD cases compared to controls we did not identify any differentially methylated positions (DMPs) passing an experiment-wide significance threshold adjusted for multiple testing (P < 1 × 10^-7). Twenty ASD-associated DMPs were identified at a “discovery” threshold of P < 5 × 10^-5 (Additional file 1: Figures S5 and S6; Additional file 2: Table S2); the most significant association was at cg12699865, which is located in the 5′ UTR of RALY, where the mean level of DNA methylation was 0.647% lower (P = 7.63 × 10^-7) in ASD cases compared to controls (Additional file 1: Figure S7). Regional analysis combining the EWAS P values for DNA methylation sites within a sliding window across the genome (see “Methods”) did not identify any significant ASD-associated DMRs after correcting for multiple testing. Given the higher prevalence of ASD diagnosis in males, we also tested for an interaction between autism status and sex but identified no significant associations (P < 1 × 10^-7) and only seven DMPs at our discovery threshold of P < 5 × 10^-5 (Additional file 2: Table S3).
Increased polygenic burden for autism is associated with methylomic variation in blood at birth
Alignment of DNA methylation quantitative trait loci and ASD genetic signals
In this study, we quantified neonatal methylomic variation in 1263 infants selected from the iPSYCH cohort  including samples from individuals who went on to develop ASD and carefully matched control samples. It represents the first attempt to integrate analyses of both genetic and epigenetic variation at birth in ASD, demonstrating the utility of using a polygenic risk score to identify molecular variation associated with disease, and of using DNA methylation quantitative trait loci to refine the functional and regulatory variation associated with ASD risk variants. While ASD itself was not associated with significant differences in neonatal DNA methylation, at an experiment-wide significance threshold, increased polygenic burden for autism was found to be associated with methylomic variation at specific loci in blood at birth. Our analysis of ASD PRS and DNA methylation supplements an increasing body of literature investigating the effects of high genetic burden for other complex traits on molecular variation [34, 57, 58]. We find that two CpGs located on chromosome 8 are associated with genetic risk for ASD, and are proximal to a robust GWAS signal for ASD. Furthermore, multiple associated SNPs on chromosome 8 have a polygenic effect on DNA methylation at these two CpG sites, demonstrating how a complex genetic architecture can converge on a common molecular consequence.
This study has several advantages over previous analyses of DNA methylation in ASD. We assessed a relatively large set of samples that is balanced with regard to both disease status and numbers of males and females. This contrasts with previous studies that have been undertaken on much smaller numbers of samples and focused primarily on ASD in males. Our control samples were stringently matched to cases on the basis of a number of criteria (see “Methods”) to minimize the effects of confounding variables that often lead to false positives in molecular epidemiology. Furthermore, our use of neonatal DNA samples—collected before diagnosis and the manifestation of any ASD symptoms—means that we are uniquely positioned to identify epigenetic variation associated with later disease or elevated polygenic burden for later ASD, avoiding the confounding exposures often associated with disease (for example, medication, stress, and reverse causation) . Finally, our study profiled whole blood from neonatal infants rather than cord blood; this minimizes confounding by maternal blood DNA and means our data can be more easily compared to blood datasets derived from later in life. A limitation of our sampling strategy, however, is that no blood cell reference DNA datasets specifically for use on neonatal blood are yet available, likely reflecting the difficulties of obtaining sufficient volumes of neonatal blood for cell sorting and methylomic profiling. Instead, we corrected for blood cell-type composition using algorithms developed using adult datasets which may not fully represent the cellular diversity observed in neonatal blood.
We find little evidence to support an association between DNA methylation at birth and ASD, confirming this finding in a meta-analysis of three studies with a total sample of 2917. Power calculations show that we have > 90% power in our meta-analysis to identify an ASD-associated difference of 0.3% and a difference of 0.7% in the MINERvA cohort alone. While this suggests the lack of association was not due to sample size, we cannot fully conclude that DNA methylation is not associated with the onset of ASD. First, our analyses were constrained by the technical limitations of the Illumina 450K array, which only assays ~ 3% of CpG sites in the genome. Second, this work necessitated the use of a peripheral tissue that may provide limited information about variation in the presumed tissue of interest, i.e., the brain . Although this is a salient point for understanding the role DNA methylation plays in the disease process, biomarkers—by definition—need to be measured in an accessible tissue and therefore justify the use of blood from neonates in this study. Third, given the chronology of sample collection prior to ASD diagnosis, it is plausible that we were looking too early on in the disease process. Another limitation of our study is the possibility of diagnostic misclassification; however, validation of select diagnoses (e.g., schizophrenia, single-episode depression, dementia, and childhood autism) has been previously performed with good results [39, 65].
In contrast, we find that polygenic burden for ASD is robustly associated with DNA methylation at two CpG sites on chromosome 8, with 49 DMPs associated with ASD polygenic burden at a more relaxed “discovery” P value threshold. Of note, both sites flank a significant genetic association signal identified in the latest ASD GWAS and our data suggest that the PRS-associated variation at these sites results from the combined effects of multiple genetic variants associated with ASD in this region. Finally, we have used mQTL analyses to annotate this extended genomic region nominated by GWAS analyses of ASD, using co-localization analyses to highlight potential regulatory variation causally involved in disease. Of interest, we found evidence that several SNPs on chromosome 20 were associated with both ASD and DNA methylation and the genes annotated to these sites (KIZ, XRN2, and NKX2–4) represent putative candidates for a potential functional role in ASD. The mechanisms linking DNA sequence variation to alterations in DNA methylation and other epigenetic modifications are not yet well understood; further exploration of these processes is warranted to provide insight into the functional consequences of disease-associated genetic variation.
Our data provide evidence for differences in DNA methylation at birth associated with an elevated polygenic burden for ASD. Our study represents the first analysis of epigenetic variation at birth associated with autism and highlights the utility of polygenic risk scores for identifying molecular pathways associated with etiological variation.
The iPSYCH-Broad ASD Group contains the following participants:
Thomas D. Als
(see Additional file 4 for full listing of e-mail addresses and affiliations).
This study was supported by grant HD073978 from the Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institute of Environmental Health Sciences, and National Institute of Neurological Disorders and Stroke; and by the Beatrice and Samuel A. Seaver Foundation. We acknowledge iPSYCH and The Lundbeck Foundation for providing samples and funding. The iPSYCH (The Lundbeck Foundation Initiative for Integrative Psychiatric Research) team acknowledges funding from The Lundbeck Foundation (grant numbers R102-A9118 and R155–2014-1724), the Stanley Medical Research Institute, the European Research Council (project number 294838), the Novo Nordisk Foundation for supporting the Danish National Biobank resource, and grants from Aarhus and Copenhagen Universities and University Hospitals, including support to the iSEQ Center, the GenomeDK HPC facility, and the CIRRAU Center. This research has been conducted using the Danish National Biobank resource, supported by the Novo Nordisk Foundation. JM is supported by funding from the UK Medical Research Council (MR/K013807/1) and a Distinguished Investigator Award from the Brain & Behavior Research Foundation. The SEED study was supported by Centers for Disease Control and Prevention (CDC) Cooperative Agreements announced under the RFAs 01086, 02199, DD11–002, DD06–003, DD04–001, and DD09–002 and the SEED DNA methylation measurements were supported by Autism Speaks Award #7659 to MDF. SA was supported by the Burroughs-Wellcome Trust training grant: Maryland, Genetics, Epidemiology and Medicine (MD-GEM). The SSC was supported by Simons Foundation (SFARI) award and NIH grant MH089606, both awarded to STW.
Availability of data and materials
Given the nature of the MINERvA cohort, access to data can only be provided through secured systems which comply with the current Danish and EU data standards. To comply with the study’s ethical approval, access to the raw data is only available to qualified researchers upon request. All summary statistics and analysis scripts are available directly from the authors (please contact Jonas Grauholm at JOGR@ssi.dk). R scripts used to perform the analyses reported in this manuscript are available on GitHub (https://github.com/ejh243/MinervaASDEWAS.git) and have been archived in Zenodo at https://zenodo.org/badge/latestdoi/116149862.
AR, DS, and JM designed and coordinated the study. JB-G, DMH, MVH, MB-H, and CSH led generation of DNA methylation data from dried neonatal bloodspots. AR, DS, JM, EH, JB-G, CL-A, and MDF oversaw implementation of the data analyses. EH led data analysis. CL-A and SVA analyzed data from replication datasets. JG provided autism polygenic risk scores. DMH, OM, PBM, ADB, TW, and MN are principal investigators of the iPSYCH study and obtained funding for genetic data. EH and JM drafted the manuscript, with input from AR, DS, CL-A, JG, SVA, MDF, MB, MH, JB, and JB-G. All coauthors read and approved the final manuscript.
Ethics approval and consent to participate
The MINERvA study has been approved by the Regional Scientific Ethics Committee in Denmark, the Danish Data Protection Agency and the NBS-Biobank Steering Committee. iPSYCH is a register-based cohort study solely using data from national health registries. The study was approved by the Scientific Ethics Committees of the Central Denmark Region (www.komite.rm.dk; J.nr. 1–10–72-287-12) and executed according to guidelines from the Danish Data Protection Agency (www.datatilsynet.dk; J.nr.: 2012–41-0110). Passive consent was obtained, in accordance with Danish Law nr. 593 of June 14, 2011, para 10, on the scientific ethics administration of projects within health research. Permission to use the dried blood spot samples stored in the Danish Neonatal Screening Biobank (DNSB) was granted by the steering committee of DNSB (SEP 2012/BNP). Research was conducted in accordance with the principles of the Declaration of Helsinki.
Consent for publication
Competing interests
TW has acted as advisor and lecturer to H. Lundbeck A/S. The remaining authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.