Sunday, December 3, 2023

Bioinformatics analysis of immune cell infiltration patterns and potential diagnostic markers in atherosclerosis – Scientific Reports

A database in the Gene Set Enrichment Analysis (GSEA; platform10,11 was used to identify 134 GLN metabolism-associated genes.

Weighted gene co-expression network analysis (WGCNA) and module screening

GLN-associated gene sets were investigated using WGCNA. The results demonstrated that when the weighted value was 24 (Fig. 1A), scale independence was greater than 0.85 and mean connectivity was approximately 0. Three co-expressed modules were screened and the unrelated genes were distributed to a gray module, which was excluded from subsequent analyses (Fig. 1B). The module eigengenes (MEs) were correlated, thus allowing studying the associations among modules. A dendrogram and heatmap were adopted to depict the eigengene network (Fig. 1C). To understand the physiological significance of these modules, three MEs were associated with GLN, and the most significant associations were searched based on the heatmap of module-trait correlations (Fig. 1D).

Figure 1

Establishment of Weighted Gene Co-expression Network Analysis (WGCNA). (A) Weight parameter β = 24 (soft threshold) and scale-free topology fitting index (R2). Various modules of co-expression data in atherosclerosis were identified using WGCNA. (B) Associations of modules. Top: Hierarchical clustering of module eigengenes (MEs) summarizing modules detected using clustering analysis. Branches of dendrogram groups (meta-modules) exhibited positive correlation with eigengenes. Bottom: Heatmap of adjacency relationships in the eigengene network. In the heatmap, there is correspondence between each row and column and one ME (colored), with red and blue indicating high and low adjacencies, respectively. Meta-modules refer to red squares along the diagonal. (C) Associations of consensus MEs with glutamine (GLN). In the table, there is correspondence between each row and consensus module, as well as between each column and a sample or trait. In the table, the numbers indicate the correlation coefficients for the corresponding ME and trait, and the corresponding P-values are printed in parentheses. The color legend indicates correlation coefficients. (D) Heatmap of overlapping gene network topologies. There is correspondence between each row and column and the gene in the heatmap, with light color and progressively darker red indicating low and higher overlapping topologies, respectively. Correspondence between modules and darker squares is observed along the diagonal. Dendrograms of genes and assignments of modules are exhibited on the left and top. (F) Correlation of gene significance (GS) with module membership (MM) for all GLN-associated genes in the blue module. ‘Cor’ refers to the absolute correlation coefficient of MM with GS.

DEG identification

A total of 1039 statistically significant DEGs were identified between healthy controls and atherosclerosis samples (p-adjusted < 0.05, |log2 fold change [FC]|> 0.5). Atherosclerosis samples included 619 genes with upregulated and 420 with downregulated expression. A volcano plot was generated for DEG visualization (Fig. 2A). The top 5 upregulated (WDFY4, TPP1, DAB2, CYTH4, and ADAP2) and downregulated DEGs (ATP1A2, CAB39L, BAG2, C14orf132, and PXYLP1) are shown using a heatmap (Fig. 2B). Based on the results of the Wilcoxon tests, these 10 genes showed marked differences in expression levels between atherosclerosis and control samples (p < 0.05, Fig. 2C).

Figure 2
figure 2

Identification of differentially expressed genes (DEGs). (A) Volcano plot of the distribution of DEGs in controls and atherosclerosis samples. Red dots: upregulated expression; purple dots: downregulated expression; gray dots: not significant expression. (B) Heatmap of the top 5 downregulated and upregulated DEGs. (C) The variations in the expression levels of 10 genes in both groups were revealed using Wilcoxon tests. Asterisks represent p values (*, **, ***, and **** represent p < 0.05, < 0.01, < 0.001, and < 0.0001, respectively).

We obtained 308 GLN-associated DEGs from the intersection between GLN-associated module genes and DEGs.


To explore potential mechanisms underlying DEG function, GSEA was performed. A Molecular Signatures Database (MSigDB) collection was used to select the signaling pathways with the most significant enrichment according to the normalized enrichment score (NES). GSEA revealed that LYSOSOME (NES = 2.45, P-adjusted = 0.012, FDR = 0.008), HEMATOPOIETIC CELL LINEAGE (NES = 2.41, P-adjusted = 0.012, FDR = 0.008), CYTOKINE–CYTOKINE RECEPTOR INTERACTION (NES = 2.351, P-adjusted = 0.012, FDR = 0.008), PROPANOATE METABOLISM (NES =  − 1.795, P-adjusted = 0.023, FDR = 0.014), DILATED CARDIOMYOPATHY (NES =  − 1.796, P-adjusted = 0.015, FDR = 0.009), and TYROSINE METABOLISM (NES =  − 1.798, P-adjusted = 0.014, FDR = 0.009) (Fig. 3A–F) were significantly enriched in atherosclerosis.

Figure 3
figure 3

Gene Set Enrichment Analysis (GSEA) of significantly enriched pathways. (A) Lysosome, (B) hematopoietic cell lineage, (C) cytokine–cytokine receptor interaction, (D) propanoate metabolism, (E) dilated cardiomyopathy, and (F) tyrosine metabolism. (G) Heatmap illustrating the result of the GSVA analysis. FDR false discovery rate, NES normalized enrichment score.

Gene set variation analysis (GSVA)

To further explore functional annotation, differences in the relative expression of pathways were evaluated between disease and control groups using GSVA. Numerous differentially expressed pathways were enriched and a heatmap was used for visualization. The disease group exhibited markedly lower expression of KEGG_LYSOSOME- and KEGG_HEMATOPOIETIC_CELL_LINEAGE-associated pathways, and markedly higher expression of KEGG_LIMONENE_AND_PINENE_DEGRADATION- and KEGG_BUTANOATE_METABOLISM-associated pathways than the control group (Fig. 3G).

Enrichment analyses (Gene Ontology [GO]/Kyoto Encyclopedia of Genes and Genomes [KEGG])

To investigate the biological functions of the most significant module genes associated with WGCNA and glutamine metabolism compared with differential genes in atherosclerosis and normal controls, functional enrichment analysis was performed. GO results show, The most significant WGCNA and glutamine metabolism related module gene muscle contraction, muscle system process, regulation of actin filament—based process (BP), contractile fiber, myofibril, sarcomere(CC), actin binding, actin filament binding, integrin binding(MF) enrichment (Fig. 4A); The enriched KEGG pathways included Focal adhesion, Vascular smooth muscle contraction, Regulation of actin cytoskeleton, etc. (Fig. 4B). The GO results of differential genes in atherosclerosis and normal control showed that the genes were leukocyte cell–cell adhesion, T cell activation, leukocyte migration(BP), tertiary granule, secretory granule membrane, specific granule(CC), actin binding, immune receptor activity, integrin binding(MF) enrichment (Fig. 4C); Enriched KEGG pathways include Rheumatoid arthritis, Chemokine signaling pathway, Cell adhesion molecules and so on (Fig. 4D).

Figure 4
figure 4

Enrichment analysis of the most significant module genes related to WGCNA and glutamine metabolism and the differential genes between atherosclerosis and normal controls. (A) The GO item enrichment analysis results of the most significant module genes related to WGCNA and glutamine metabolism are presented. (B) The KEGG enrichment analysis results of the most significant module genes related to WGCNA and glutamine metabolism are presented. (C) The enrichment analysis results of GO entries of differential genes between atherosclerosis and normal controls are displayed. (D) The KEGG enrichment analysis results of differential genes between atherosclerosis and normal controls were displayed.

To investigate the biological functions of GLN-associated DEGs, GO term along with KEGG pathway enrichment analyses were performed. According to results of the GO term analysis, the genes exhibited strong enrichment in muscle contraction, muscle system process, and muscle tissue development (biological process [BP]); contractile fiber, myofibril, and sarcomere (cellular component [CC]); and actin binding, structural constituent of muscle, and transmembrane transporter binding (molecular function [MF]) (Fig. 5A–D).

Figure 5
figure 5

Functional enrichment analysis of GLN-associated DEGs. (A) Gene Ontology (GO) analysis marked terms. (B) Bubble chart of biological processes (BP). (C) Chord plot of cellular components (CC). (D) Bar plot of molecular functions (MF). (E) Circle plot showing pathway enrichment using Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis of CC and pathway descriptions. FC fold change.

The enriched KEGG pathways included dilated cardiomyopathy (DCM), hypertrophic cardiomyopathy (HCM), and cAMP signaling pathway (Fig. 5E).

Protein–protein interaction (PPI) network analysis and hub gene screening

To comprehend the interactions among the GLN-associated DEGs, a PPI network was built. Twelve approaches were adopted to elucidate highly correlated genes in the PPI network. Based on the intersections of the top 100 genes, obtained using all 12 methods, we identified 27 hub genes: ACTN2, TPM2, FLNC, MYH11, ITGA7, DMD, PAK3, LMOD1, GNAI1, MYLK, PLN, CTPS1, CNN1, MYH10, NLGN1, ADAMTSL3, MYOCD, MICU3, PPP1R14A, MAP2, OGN, PDE8B, RGS5, MAP1B, ITGA9, AKAP6, and SMTN (Fig. 6).

Figure 6
figure 6

Hub genes of the protein–protein interaction (PPI) network. (A) Hub genes were screened based on gene intersections using 12 approaches.

Hub gene validation

To further verify the diagnostic value of hub gene, ROC curve was used to verify hub gene, and FLNC (AUC = 0.8287), AKAP6 (AUC = 0.8139), LMOD1 (AUC = 0.8236), DMD (AUC = 0.8394) were found. ACTN2 (AUC = 0.8588), GNAI1 (AUC = 0.8222), CTPS1 (AUC = 0.8368), MAP1B (AUC = 0.8241), ITGA7 (AUC = 0.7965), ITGA7 (AUC = 0.7965), the area under ROC curve (AUC) values of ADAMTSL3 (AUC = 0.8218) and ITGA9 (AUC = 0.8014) were both greater than 0.6 (Fig. 7A). We used tenfold cross-examination to verify the diagnostic efficacy of the AUC model, and found that the AUC value was 0.957 (Fig. 7B), indicating that the AUC model has good diagnostic efficacy, which indicates that the hub gene has the differential ability as a potential biomarker of atherosclerosis. Among them, CNN1 (AUC = 0.874 (0.801, 0.947)) has good diagnostic value. Next, we used delong test to detect whether there were statistical differences in ROC curves between CNN1 and other hub genes, and the results showed that most hub genes were significantly different from CNN1(p < 0.05), suggesting that CNN1 might be a biomarkers of atherosclerosis (Fig. 7C,D, Supplementary Fig. S1A–M).

Figure 7
figure 7

ROC curve analysis and testing of Hub gene. (A) ROC curve analysis of Hub gene. (B) 10 times cross validation ROC curve of AUC model. (C) ROC curves and delong test p-values of TGA7 and CNN1. (D) ROC curves and delong test p-values of MYH10 and CNN1.

Immune cell infiltration (ICI)

Considering that ICI may play a crucial role in the pathogenesis of atherosclerosis, we examined the correlations between atherosclerosis/control samples and infiltrated immune cells. The atherosclerosis samples exhibited a markedly higher degree of infiltration than healthy controls, for most immune cells (Fig. 7A). A positive correlation was observed between the majority of immune cells (Fig. 8B). In addition, each hub gene showed a marked correlation with the corresponding immune cells (Fig. 8C–E). Notably, significant correlations were observed between MYLK and CD56bright natural killer (NK) cells (R = 0.881, p < 0.001), SMTN and CD56bright NK cells (R =  − 0.889, p < 0.001), and MYLK and Th1 (R =  − 0.898, p < 0.001) (Fig. 8C–E).

Figure 8
figure 8

Immune cell infiltration (ICI) between atherosclerosis and control samples. (A) Heatmap of ICI changes in both groups. (B) Association among immune cells. Dot plots of the correlation between immune cells and genes MYLK (C), SMTN (D), and MYLK (E). *, **, ***, and **** represent p < 0.05, < 0.01, < 0.001, and < 0.0001, respectively.

Signaling pathways participated in signature genes

We used GSVA to examine the differences in 50 HALLMARK signaling pathways in atherosclerosis and control samples.


Figure 9
figure 9

Associations of hub genes with 50 HALLMARK signaling pathways. (A) Fifty HALLMARK signaling pathways were compared in atherosclerosis and control samples. (B) Associations between 50 HALLMARK signaling pathways and target genes. *, **, ***, and **** indicate p < 0.05, < 0.01, < 0.001, and < 0.0001, respectively.

We also analyzed the correlations between the top 5 most-significant differentially expressed hub genes and 50 HALLMARK signaling pathways. ACTN2 was associated with multiple pathways, including HALLMARK_UV_RESPONSE_UP and HALLMARK_UV_RESPONSE_UP (Fig. 9B).

Interaction analysis on hub genes

We created a PPI network for the signature genes using the GeneMANIA database and identified 27 genes (Fig. 10A). To further investigate the function of signature genes, GO and KEGG analyses were performed on 47 genes comprising 27 hub and 20 associated genes. According to the GO results, the genes were strongly enriched in regulation of axonogenesis, regulation of cell shape, and regulation of vascular smooth muscle (VSM) cell migration (BP); cell division site, cortical actin cytoskeleton, and caveola (CC); and cyclic-nucleotide phosphodiesterase activity, transcription coactivator activity, and transmembrane transporter binding (MF) (Fig. 10B). KEGG analysis revealed that DCM, VSM contraction, HCM, focal adhesion, regulation of actin cytoskeleton, arrhythmogenic right ventricular cardiomyopathy, adrenergic signaling in cardiomyocytes, cGMP-PKG signaling pathway, and oxytocin signaling pathway were the main enriched pathways (Fig. 10C).

Figure 10
figure 10

Interaction analyses of hub genes. (A) Co-expression network of signature genes. (B) GO and (C) KEGG analyses of co-expressed genes.

Validation of the gene set

The expression levels of the key genes validated enrichment for gene set muscle tissue development, muscle system process, and muscle contraction. The atherosclerosis samples displayed low expression levels for most of the key genes (Fig. 11).

Figure 11
figure 11

Heatmaps of key genes. Gene sets of muscle tissue development (A), muscle system process (B), and muscle contraction (C) in atherosclerosis and control samples.

Source link

Related Articles

Leave a Reply

Stay Connected

- Advertisement -spot_img

Latest Articles

%d bloggers like this: