The details of study participants, plasma ApoE measurement, genotyping and imputation, statistical analyses, and functional annotations are given in Online Methods.
Association of plasma ApoE concentration with incident dementia and cognitive function
Whole plasma ApoE concentration at baseline was determined in 3031 participants, of whom 2893 were European Americans (EAs), including 2412 who remained non-demented (ND) and 481 with incident dementia (91.4% AD dementia) (Fig. 1). Plasma ApoE level followed a symmetric distribution, ranging from 0.50 to 15.70 mg/dL with a mean value of 4.1 ± 1.25 mg/dL in the total sample and 4.09 ± 1.25 mg/dL in EAs (Fig. S1a). ApoE level was significantly higher in females than in males (P = 1.18E-24; Fig. S1b), and exposure to Ginkgo biloba showed no impact on plasma ApoE level (P = 0.769; Fig. S2).
To determine the association of plasma ApoE level with risk of incident dementia, we obtained hazard ratios (HRs) per 1-standard deviation (SD) lower ApoE level in 3031 subjects using a Cox regression model adjusted for baseline age, sex, ethnicity, education, BMI, and research site. For cognitive function, differences in cognitive scores per 1-SD decrease in ApoE were obtained for the cognitive subscale of the Alzheimer Disease Assessment Scale (ADAS-cog) and the Modified Mini-Mental State Examination (3MSE) in all subjects from linear regression using the same covariates. While lower ApoE concentration in whole plasma was not associated with either incident dementia (HR = 1.00; 95% CI: 0.91 to 1.10) or AD (HR = 1.01; 95% CI: 0.92 to 1.12), it was associated with higher ADAS-cog scores, indicating worse cognitive function (β coefficient = 0.08; 95% CI: 0.01 to 0.18). A similar, but non-significant, association of lower ApoE concentration was observed with lower 3MSE scores, which also indicate worse cognitive function (β coefficient = −0.13; 95% CI: −0.29 to 0.02).
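The "per 1-SD lower" scaling used for the HRs can be illustrated with a minimal sketch. It assumes a Cox model fitted on raw ApoE in mg/dL; the function name `hr_per_sd_lower` is hypothetical, and the coefficient and standard error passed to it are made-up inputs, not estimates from this study.

```python
import math


def hr_per_sd_lower(beta_per_unit, se_per_unit, sd, z=1.96):
    """Convert a Cox log-hazard coefficient per 1 mg/dL higher ApoE
    into a hazard ratio (with 95% CI) per 1-SD lower ApoE."""
    # Flipping the sign turns "per unit higher" into "per unit lower";
    # multiplying by the SD rescales from 1 mg/dL to 1 SD.
    log_hr = -beta_per_unit * sd
    se = se_per_unit * sd
    return (
        math.exp(log_hr),
        math.exp(log_hr - z * se),
        math.exp(log_hr + z * se),
    )


# Hypothetical inputs for illustration only (not values from this study).
hr, ci_low, ci_high = hr_per_sd_lower(beta_per_unit=-0.04, se_per_unit=0.03, sd=1.25)
print(f"HR per 1-SD lower = {hr:.2f} (95% CI: {ci_low:.2f} to {ci_high:.2f})")
```

An interval that spans 1.00, as reported for whole plasma ApoE with incident dementia, corresponds to a non-significant association at the 5% level.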
Plasma ApoE is associated with multiple lipoprotein particles, which also contain other apolipoproteins, and about 50% of ApoE is present on high-density lipoprotein (HDL) [20]. Given its important role in lipid metabolism and AD dementia, it is possible that the association of ApoE level with dementia or cognitive function is modulated by its association with other apolipoproteins. To address this question, plasma lipoprotein-lipid along with HDL subfractions were determined in a subset of the GEM sample comprising 1351 subjects [21]. While no association of baseline ApoE present in non-HDL or HDL particles was detected with incident dementia, lower ApoE level was significantly associated with higher ADAS-cog scores only in HDL (β = 0.20; 95% CI: 0.10 to 0.30) [21], as we also observed in this study in whole plasma in the total GEM sample of 3031 subjects. When this association in HDL was further examined in the 1351 subjects based on the presence or absence of ApoC3 in HDL [21], the association was confined to HDL lacking ApoC3, not only with ADAS-cog (β = 0.17; 95% CI: 0.07 to 0.27), but also with significantly lower 3MSE scores (β = −0.25; 95% CI: −0.42 to −0.07) as well as with incident and AD dementia (HR = 1.16; 95% CI: 1.03 to 1.32). These data showed that the presence or absence of ApoC3 in HDL modulates the association of plasma ApoE levels with dementia and cognitive function.
Genome-wide association analysis
Of 3031 subjects with plasma ApoE, DNA was available on 2737 participants (96.1% EAs) for genetic studies. Since the number of non-EAs was small, we included only EA participants in genetic analyses. A linear regression using the PLINK software was performed on 2580 EAs, including 2199 ND and 381 incident AD dementia cases (Table 1). A quantile-quantile plot did not demonstrate population stratification (λ = 1.005) (Fig. S3a). Six regions on chromosomes 1, 4, 7, 11, 19, and 20 showed genome-wide significant (GWS) signals (P < 5E-08) along with a subthreshold GWS signal (P = 5.49E-08) on chromosome 5 (Fig. 2).
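As an illustration of the two quantities used above, the following sketch computes a genomic inflation factor λ from a list of P values and applies the P < 5E-08 threshold. It uses the standard median-based definition of λ for 1-degree-of-freedom tests; the function names are hypothetical, and this is not the PLINK implementation used in the study.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(p_values):
    """Genomic inflation factor: median observed 1-df chi-square
    statistic divided by its expected median under the null."""
    nd = NormalDist()
    # A two-sided P value p corresponds to z = Phi^-1(p / 2) (negative tail),
    # and the 1-df chi-square statistic is z squared.
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


def is_gws(p, threshold=5e-8):
    """True if a P value passes the genome-wide significance threshold."""
    return p < threshold


# Evenly spaced P values mimic a well-calibrated null distribution.
null_p = [(i - 0.5) / 9999 for i in range(1, 10000)]
print(f"lambda = {genomic_inflation(null_p):.3f}")
```

A value of λ close to 1, such as the 1.005 reported here, indicates that the bulk of the test statistics follow the null distribution, whereas values well above 1 would suggest stratification or other systematic inflation.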
Associations in the APOE region
As expected, most of the significantly associated SNPs were from chromosome 19, where 57 SNPs surpassed the GWS threshold (Table S1; Fig. S3b). The most significant association was observed for APOE*2/rs7412, which was associated with elevating ApoE levels (β = 1.11; P = 4.73E-79). As expected, APOE*4/rs429358 was associated with lowering ApoE levels (β = −0.352; P = 8.73E-12). Since rs7412 and rs429358 correspond to the common APOE 2/3/4 polymorphism having six genotypes, we examined plasma ApoE levels among these genotypes (Fig. 3). The highest ApoE levels were observed in E2/2 homozygotes with a gradual decrease of 29%, 34%, 43%, 46%, and 49% in the 2/3, 2/4, 3/3, 3/4, and 4/4 genotypes, respectively.
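The way the two SNPs define the six APOE genotypes can be sketched as follows. The sketch assumes forward-strand allele coding, in which the rs429358 C allele (Arg112) marks E4 and the rs7412 T allele (Cys158) marks E2; the function name `apoe_genotype` is hypothetical, and double heterozygotes are assumed to be 2/4, ignoring the very rare alternative haplotype arrangement.

```python
def apoe_genotype(rs429358_c_count, rs7412_t_count):
    """Map allele counts at rs429358 (C = E4) and rs7412 (T = E2)
    to an APOE genotype string such as '3/4'."""
    n_e4, n_e2 = rs429358_c_count, rs7412_t_count
    if n_e4 not in (0, 1, 2) or n_e2 not in (0, 1, 2):
        raise ValueError("allele counts must be 0, 1, or 2")
    if n_e4 + n_e2 > 2:
        # Would require both variant alleles on one chromosome,
        # which the three-allele model does not cover.
        raise ValueError("combination not representable as E2/E3/E4")
    # Chromosomes carrying neither variant allele are E3.
    n_e3 = 2 - n_e4 - n_e2
    alleles = ["2"] * n_e2 + ["3"] * n_e3 + ["4"] * n_e4
    return "/".join(alleles)


for counts in [(0, 2), (0, 1), (1, 1), (0, 0), (1, 0), (2, 0)]:
    print(counts, "->", apoe_genotype(*counts))
```

The loop lists the genotypes in the same order as the gradient described above, from 2/2 through 4/4.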
There were several additional signals in the APOE region associated with increasing (42 SNPs) and decreasing (15 SNPs) plasma ApoE levels (Table S1). While APOE*2/rs7412 was the most significant SNP associated with higher ApoE levels, APOC1P1–APOC4/rs35136575 was the most significant SNP associated with lower ApoE levels (β = −0.3799; P = 6.34E-24), showing an even stronger effect than APOE*4. To determine which signals were independent, we conducted conditional analyses to find SNPs that were still significant after controlling for these two SNPs followed by an examination of linkage disequilibrium (LD) plots to identify representative SNPs in each cluster. Ten SNPs with positive-β remained significant (P range = 1.51E-02 to 1.69E-07) after adjusting for the effect of APOE*2 (Table 2). All SNPs with negative-β remained significant after adjusting for the effect of APOC1P1-APOC4/rs35136575, of which eight were still GWS (Table 2). All 15 SNPs with negative-β had essentially no LD with rs7412 (R2 = 0 to 0.06) and five of them remained GWS after adjusting for rs7412.
Based on LD among these 26 SNPs (Fig. S4), we identified 10 independent signals, including five with lowering effect: APOC4/rs35136575 (intergenic), APOE/rs429358 (p.Cys112Arg), APOE/rs405509 (promoter), APOC1/rs157595 (intergenic), and APOC1P1/rs5112 (ncRNA-exonic); and five with elevating effect: APOE/rs7412 (p.Arg158Cys), APOE/rs769446 (promoter), CBLC/rs148933445 (intronic), APOC1/rs144311893 (intergenic), and APOC1P/rs114448690 (ncRNA-intronic). An additional partially independent signal associated with a lowering effect on ApoE level was driven by an LD block of 5 SNPs (rs12972156, rs34342646, rs6857, rs71352238, rs11556505) in PVRL2 (NECTIN2) and TOMM40. Three of these are potentially functional: PVRL2/rs6857 (3’UTR), TOMM40/rs71352238 (promoter), and TOMM40/rs11556505 (p.Phe131Leu). While PVRL2/rs6857 is in LD (R2 = 0.69) with APOE*4/rs429358, the other two are only in moderate LD (R2 = 0.46–0.48) with the latter (Fig. S4) and thus may represent a partially independent additional signal.
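The LD values quoted above can be understood through a simple estimator. The sketch below computes R2 as the squared Pearson correlation between genotype dosages (0, 1, or 2 copies of an allele) at two SNPs; this is one common estimator and an assumption here, since the software behind the LD plots may use a different one. The function name and the toy dosage vectors are hypothetical.

```python
def ld_r2(dosages_a, dosages_b):
    """Squared Pearson correlation between allele dosages at two SNPs."""
    n = len(dosages_a)
    if n != len(dosages_b) or n < 2:
        raise ValueError("need two equal-length dosage lists with >= 2 samples")
    mean_a = sum(dosages_a) / n
    mean_b = sum(dosages_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosages_a, dosages_b))
    var_a = sum((a - mean_a) ** 2 for a in dosages_a)
    var_b = sum((b - mean_b) ** 2 for b in dosages_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("monomorphic SNP: correlation undefined")
    return cov * cov / (var_a * var_b)


# Toy dosages for illustration only (not study data).
snp1 = [0, 1, 2, 0, 1, 2]
snp2 = [0, 0, 0, 2, 2, 2]
print(ld_r2(snp1, snp1), ld_r2(snp1, snp2))
```

Values near 0, such as those between the negative-β SNPs and rs7412, indicate signals that are inherited largely independently, whereas intermediate values, such as 0.46 to 0.48, leave room for only partial independence.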
Novel associations
In addition to the known APOE*2/E*4 SNPs and the additional independent signals we discovered in the APOE region, we identified seven novel signals on chromosomes 1, 4, 5, 7, 11, 12 and 20; all were associated with elevated plasma ApoE levels (Table 3; Fig. 4). While two of the novel SNPs were genotyped on the chip (ZPR1/rs964184 and BMP2/rs73894435), the remaining SNPs were imputed. For this reason, we genotyped all carriers of the minor allele for the imputed SNPs using TaqMan assays and confirmed the imputed calls. The strongest novel signal, rs114661586, was present in intron 1 of OPRD1 (P = 5.36E-10), followed by rs73894435 near BMP2 (P = 9.64E-09), rs149497036 in intron 16 of PHF14 (P = 9.67E-09), rs964184 in the 3’UTR of ZPR1/ZNF259 (P = 2.58E-08), and rs142344853 near GBA3-PPARGC1A (P = 4.31E-08). A subthreshold GWS signal observed on chromosome 5 near PLK2 and GAPT (rs72758175; P = 5.49E-08) became GWS (P = 4.53E-09) after adjusting for APOE*2/rs7412. On the other hand, the chromosome 20 signal lost genome-wide significance after adjusting for APOE*2/rs7412 (P = 2.21E-05), indicating a possible interaction between the two. However, we found no significant interaction between the two SNPs (P = 0.0661), indicating a biological rather than a statistical interaction that impacts plasma ApoE levels. After adjusting for APOC1P1-APOC4/rs35136575, an additional novel signal was observed on chromosome 12 near AVIL-TSFM (rs2470341; P = 4.44E-08).
Fig. 4. (a) Regional plot in the OPRD1 locus on chromosome 1; (b) Regional plot in the GBA3-PPARGC1A locus on chromosome 4; (c) Regional plot in the PLK2 locus on chromosome 5; (d) Regional plot in the PHF14 locus on chromosome 7; (e) Regional plot in the ZPR1 (ZNF259)/APOA5 locus on chromosome 11; (f) Regional plot in the LINC02403 locus on chromosome 12 after adjusting for APOC1P1-APOC4/rs35136575; (g) Regional plot in the BMP2 locus on chromosome 20.
On chromosome 11, in addition to the top SNP ZPR1/rs964184, we also observed multiple suggestive associations (P range = 9.56E-07 to 2.62E-07), which remained significant after conditioning on the top SNP (Table S2). After considering the LD pattern between these SNPs and the top signal (Fig. S5), we identified three additional independent signals: BUD13/rs180326 (P = 2.62E-07), SIK3/rs4936359 (P = 4.85E-07), and ZPR1/rs35120633 (p.A264V) (P = 3.48E-07).
Variance in plasma ApoE levels explained by APOE and non-APOE loci
The variance explained by a linear regression model including age, sex, and the 11 independent APOE SNPs described above was 21.97% (P = 4.3E-31). The model with age, sex, and the 10 non-APOE SNPs (OPRD1/rs114661586, GBA3-PPARGC1A/rs142344853, PLK2/rs72758175, ZPR1/rs964184, PHF14/rs149497036, BUD13/rs180326, SIK3/rs4936359, ZPR1/rs35120633, LINC02403/rs2470341, BMP2/rs73894435) explained 9.26% (P = 4.6E-27) of the variance. Age and sex alone explained 4% (P = 8.7E-26) of the variance. The cumulative variance explained by both the APOE and non-APOE SNPs was 25.36% (P = 2.2E-32).
Gene-based association analysis
We conducted a gene-based association test using MAGMA (Multi-marker Analysis of GenoMic Annotation), which employs multiple linear regression on the full GWAS input data. The gene-wide significance threshold was set at P = 2.68E-06 (0.05/18,656 tested genes). A total of seven genes passed the gene-wide threshold, including APOE, PVRL2, TOMM40, and APOC1 on chromosome 19 and ZPR1/ZNF259, APOA5, and BUD13 on chromosome 11 (Fig. 5). Two additional genes on chromosome 19 achieved subthreshold significance: CEACAM19 and BCAM. These results provide further credence to the single-variant analyses on chromosomes 19 and 11.
Fig. 5. The top genes are annotated. The red line indicates the gene-wide significance threshold of P = 2.68E-06. ZNF259 (ZPR1); PVRL2 (NECTIN2). APOE (P = 3.16E-15), PVRL2 (P = 4.44E-15), TOMM40 (P = 5.98E-12), APOC1 (P = 4.94E-10), ZPR1/ZNF259 (P = 1.21E-08), APOA5 (P = 6.13E-08), BUD13 (P = 2.12E-06), CEACAM19 (P = 3.05E-06), and BCAM (P = 6.59E-06).
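The gene-wide threshold and the resulting classification can be reproduced with a short worked example. It applies a Bonferroni correction for 18,656 tested genes to the gene-level P values reported above; the variable names are illustrative only.

```python
ALPHA = 0.05
N_GENES = 18656

# Bonferroni-corrected gene-wide significance threshold.
threshold = ALPHA / N_GENES

# Gene-level P values as reported for the top genes.
gene_p = {
    "APOE": 3.16e-15,
    "PVRL2": 4.44e-15,
    "TOMM40": 5.98e-12,
    "APOC1": 4.94e-10,
    "ZPR1/ZNF259": 1.21e-08,
    "APOA5": 6.13e-08,
    "BUD13": 2.12e-06,
    "CEACAM19": 3.05e-06,
    "BCAM": 6.59e-06,
}

significant = [gene for gene, p in gene_p.items() if p < threshold]
subthreshold = [gene for gene, p in gene_p.items() if p >= threshold]

print(f"threshold = {threshold:.2e}")
print("gene-wide significant:", significant)
print("subthreshold:", subthreshold)
```

Seven genes fall below the threshold, while CEACAM19 and BCAM lie just above it, matching the classification in the text.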
Functional bioinformatics analyses
To examine the biological significance of the identified variants and genes, we used the Functional Mapping and Annotation (FUMA) web-based platform (https://fuma.ctglab.nl/) to annotate, prioritize, visualize, and interpret GWAS results. FUMA has two core processes, SNP2GENE and GENE2FUNC [22]. SNP2GENE annotates SNPs for functional consequences on gene functions using ANNOVAR, deleteriousness score (CADD score), potential regulatory functions (RegulomeDB score), and effects on gene expression using expression quantitative trait loci (eQTL), and then maps them to genes based on their physical position on the genome, eQTL associations, and 3D chromatin interactions. GENE2FUNC annotates the identified genes in biological context (gene expression, enrichment of differentially expressed genes in certain tissues, overrepresentation of gene sets, and general biological functions of input genes in terms of their reported disease associations and drug targets).
A total of 116 pre-defined SNPs with GWS or suggestive associations with ApoE levels (Table S3) were used as input to SNP2GENE, which mapped them to 79 coding genes, including 49 in the novel regions and 17 in the APOE region (Table S4). Of the 79 genes, 77 had unique Entrez IDs; these were further annotated using GENE2FUNC to identify their gene expression and possible biological roles, including groups or pathways enriched for these 77 genes. APOE, along with other genes in this region, has relatively high expression in the brain (Fig. S6). Of the novel loci, no expression data were available for the PPARGC1A, GBA3, and BMP2 genes in GTEx. While the brain expression of OPRD1, PLK2, PHF14 and ZPR1/ZNF259 was modest, multiple candidate genes on chromosomes 11 and 12 showed high expression. As expected, based on the known roles of ApoE level and APOE genetic variation in lipid metabolism and AD, the analyses of gene-set overrepresentation and general biological function implicated lipid- and AD-associated pathways (Fig. S7, Table S5). Notably, genes at novel loci were implicated in delirium on chromosome 11 (P = 3.15E-08) and in hippocampal volume on chromosomes 11 and 12 (P = 3.50E-14; Table S5).
Association of plasma ApoE level with AD-associated SNPs
Within the APOE region, Jansen et al. [23] identified 8 independent SNPs, in addition to APOE*4 and APOE*2, to be associated with AD risk. All these SNPs showed the expected association with plasma ApoE level in our study, where the AD risk allele was associated with lower ApoE level and the AD protective allele with higher ApoE level (Table S6). Next, we examined the association of the top 21 non-APOE AD loci [5] with ApoE level. Since the reported top SNP in one GWAS may not be the same in another GWAS, we examined multiple SNPs around the top reported AD-associated SNP in a given region. The top SNP in each AD region in our plasma ApoE GWAS was associated at nominal significance with ApoE level (Table S7), with the strongest association observed in the ABCA7 region (P = 9.88E-05). We further checked whether additional common and low-frequency variants within 17 AD-associated genes implicated by rare variants are associated with plasma ApoE level; all of them were nominally significant (Table S8). We acknowledge that some of these associations may be false positives, but they nevertheless show a consistent pattern of association.
We also examined the association of the novel ApoE level-associated SNPs in the APOE and non-APOE regions with AD risk and amyloid deposition in the brain (Table S9). All but two SNPs (rs35136575, rs114448690) in the APOE region were associated with AD risk in the IGAP discovery data [24]. Only 5 of the 12 SNPs in the APOE region were present in the amyloid-PET GWAS data [25], and all 5 were also associated with amyloid deposition. The non-APOE lead SNPs were not significant in the IGAP 2019 data, probably because all but one of these variants have low frequencies (MAF = 1–2%). These complementary data suggest that at least a part of the AD genetic risk and amyloid deposition is mediated by genetically determined plasma ApoE variation, especially in the APOE region.