Patients and samples included
We performed snRNAseq on 7 muscle samples of patients with DMD and 5 muscle samples of age and gender matched controls. Table 1 summarizes the main demographic, genetic, and clinical features of the patients included. Supplementary Fig. 1 shows representative areas of the haematoxylin-eosin (HE) staining of the muscle samples included in this study.
Characterization of cell populations identified in the skeletal muscle samples
A total of 30857 nuclei from controls and 25817 nuclei from DMD were included in the analysis. Unsupervised clustering using the Seurat package identified 19 different nuclei clusters (Supplementary Fig. 2) [24]. An analysis of the differentially expressed gene signatures allows us to attribute clusters to 10 putative identities (Fig. 1A and Supplementary Fig. 3). The largest number of nuclei in the samples were from myofibers expressing Ckm, a marker of mature myonuclei. As expected, we identified fast and slow type myofibers characterized by the expression of Myh1 and Myh2 and, Myh7b respectively, but also regenerative fibers characterized by the expression of Myh3 and Myh8 (Fig. 1 and Supplementary Fig. 4). Fast myofibers included type IIa fibers expressing Myh2 and type IIx myofibers expressing Myh1, but we just identified a minority of nuclei expressing Myh4 which is typical of type IIb myofibers. Fast myofibers cluster displayed high levels of genes encoding the fast isoforms of troponins (Tnnt3, Tpm1, and Tnni2), the sarcoplasmic reticulum-calcium ATPase 1 (SERCA1/ Atp2a1), and glycolytic enzymes (Eno3, Pfkm, Pkm and Pygm) [25]. The slow myofibers cluster displayed high levels of genes encoding the slow isoforms of troponins (Tnnt1, Tpm3, and Tnni1), slow myosin light chains (Myl2 and Myl3) and the sarcoplasmic reticulum-calcium ATPase 2 (SERCA2/ Atp2a2). Regenerative fibers were identified by the expression of Myh3 and Myh8 that encode the embryonic and perinatal MyHC isoforms respectively, also Tnnt2 that encodes for an isoform of Troponin-T expressed by cardiac muscle but also embryonic skeletal muscles [26]. As expected, regenerative fibers expressed high levels of Ncam1, involved in the reinnervation process of new fibers, and Myog, which encodes myogenin and is expressed by both fusing myoblast and newly regenerated fibers [27]. Closely located to the regenerating fibers in the UMAP, we identified a cluster of cells expressing Pax7 that was identified as satellite cells and shared many genes with regenerative fibers in our samples. This expression profile similitudes illustrated by UMAP is compatible with the origin of regenerative fibers from satellite cells and leads us to study potential gene expression profiles driving cellular transitions using pseudotime trajectories as shown in Supplementary Fig. 5. As observed cells in the earliest stages of myoblast differentiation showed expression of genes related with ribosomal and mitochondrial function such as Rpl30, Rpl37, Mt-co2 or Mt-co3, reorganization of the cytoskeleton such as Myl2, Myl1 or Acta1 and, genes promoting the differentiation of satellite cells such as Meg3 or Rassf4 [28, 29]. These were followed by myonuclei expressing genes expressed in fast fibers such as Myh1, Tpm1, Tnni2, and, finally, genes expressed in slow fibers such as Myh7, Tnnt1 orTpm3. Interestingly, we did not identify any population expressing genes related to the myotendinous junction, such as Col22a1 or Ankrd1, as reported in snRNAseq studies done with murine samples that use the whole muscle for the analysis [16, 30, 31]. Moreover, we did not observe a specific cluster expressing genes of the neuromuscular junction (NMJ) as has also been reported in mice, although the expression of Chrna1 or Chrng, was enriched in the regenerative fibers suggesting that there is an active process of remodeling of the NMJ in this cluster as has already been suggested (Supplementary Fig. 4) [11].
A UMAP visualization of all the nuclei from control and DMD samples colored by cluster identity. B Table comparing the proportion of cell population between control and DMD samples. C UMAP showing clusters identified in control (left) and DMD (samples). D Violin plots showing the expression of selected marker genes for each cluster of nuclei. FAPs fibroadipogenic progenitor cells.
We observed six other non-myofiber clusters of nuclei that included endothelial cells expressing Pecam1 and Ptprb, smooth muscle cells (SMC) expressing Acta2, Pdgfrb or Myh11 or adipocytes expressing Adipoq (Fig. 1). Inflammatory cells characterized by the expression of Ptprc/Cd45, were further divided in macrophages expressing Mrc1/Cd206 and B/T cells expressing Dok8 [32, 33]. Within the macrophages, we identified nuclei expressing M2 markers such as F13a1, Fcer2/Cd23, and Cd209 and nuclei expressing M1 markers such as Cd44, Cd86, Tlr2, and Fcgr3a [34]. Profibrotic genes, such as Tgfb1 and Spp1 were also expressed by M2 macrophages. A large cluster of nuclei was characterized by the expression of Pdgfra and Dcn which are well-known markers of fibroadipogenic progenitor cells being labeled as fibroadipogenic precursor cells (FAPs) and will be described later in detail.
Distinctive signature profile in skeletal muscle of patients with DMD
We identified significant differences in the proportion of each cell population when comparing the samples from control and DMD patients (Fig. 1B and C). Specifically, there was a significant decrease in the percentage of slow type I fibers (50.9% in controls vs 31.2% in DMD, p = 0.01, Mann–Whitney test) and SMC (3.3% vs 1.6%, p = 0.02, Mann–Whitney test) and an increase in the percentage of regenerative fibers (1.3 vs 5.7, p = 0.001, Mann–Whitney test) in DMD samples. There was also a trend for an increase in the number of satellite cells (0.9% vs 2.4%, p = 0.07) and of FAPs (5.6% vs 14.1%, p = 0.07). We validated these results using immunohistochemistry and immunofluorescence as it shown in supplementary Figs. 1 and 6. When assessing these differences between samples at the individual level, we observed that the proportion of cells in the muscle was similar among all controls, while there was greater variability in the DMD samples, which is compatible with an active process of muscle degeneration going through different stages (Fig. 2A and B). PCA analysis differentiated between DMD and controls by assessing the proportion of each cell population in the biopsy (Fig. 2C). The control patients were all closely located in the 2D dimension PCA map. Interestingly, the distribution of DMD patients on the PCA graph moved according to their clinical phenotype revealing two different subgroups, one that could be earlier in the process of muscle degeneration and included five samples from younger patients (2 to 3 years) with mildly affected muscle function and characterized by an increase in the number of nuclei from regenerative fibers (DMD-3 to DMD-7) and, another group of two samples from patients slightly older (4 years) who were already showing clear signs of muscles weakness and were more advanced in the process of muscle degeneration with a reduced number nuclei from slow and fast fibers and an increase in the nuclei from FAPs (DMD-1 and DMD-2).
Differential expression of genes in muscle fibers of patients with DMD
To gain insight in myofibers transcriptome in each cluster, we compared the gene expression profile of fast and slow muscle fibers nuclei between healthy controls and DMD patients (Fig. 3). Considering those genes with a higher log2FC > 0.5, we found 292 genes significantly upregulated and 85 downregulated in fast myofibers while in slow fibers we found 238 upregulated and 89 downregulated in DMD compared to controls. Fast and slow myofibers in DMD shared several genes in the top ten upregulated genes, such as Myh3, a characteristic marker of regenerative myofibers (log2FC = 3.1 and 2.1 respectively), Meg3, a LncRNA involved in myoblast plasticity and differentiation (log2FC = 2.3 and 2.9 respectively) and Ldb3 which acts as an adapter in skeletal muscle to couple protein kinase C-mediated signaling via its LIM domains to the cytoskeleton (log2FC = 1.6 and 1.3 respectively) (Fig. 3A–D) [35, 36]. Interestingly, we observed an upregulation of genes involved in the transport of calcium (Cacnas1, Ryr1) and also, proteases such as Capn3 and Capn2, two pathways already known to be involved in the process of muscle degeneration in DMD [37]. Gene set enrichment analysis (GSEA) revealed several dysregulated molecular pathways (Fig. 3H). We observed an enrichment in both slow and fast fibers of the expression of genes involved in myogenesis, and muscle growth compatible with an active process of muscle regeneration, but also genes involved in axon guidance suggesting that new and regenerated muscle fibers release signals for the growth of terminal axons needed for reinnervation or, genes involved in adherens junction probably due to the need of new fibers to link again to the ECM. Interestingly, DMD fibers had a reduction in several metabolic pathways, when compared with control individuals especially oxidative phosphorylation but also glycolysis and lipid transport confirming previous observations [11]. Validation of these results using staining is displayed in supplementary Fig. 7. We observed a large number of fibers expressing embryonic myosin heavy chain and TNNT2 which are markers of regenerative fibers and, NCAM, a marker expressed by denervated fibers. Moreover, ITGB1 expression was increased in DMD samples. Additionally, we observed abnormalities in SDH/COX staining in many muscle fibers of the DMD patients, supporting an abnormal mitochondrial function (Supplementary Fig. 8).
A Top molecular pathways upregulated in fast fibers. B List of the top ten genes upregulated in myonuclei of fast fibers of DMD samples. C Top molecular pathways upregulated in slow fibers. D List of the top ten genes upregulated in myonuclei of slow fibers of DMD samples. E Top molecular pathways upregulated in regenerative myofibers. F List of the top ten genes upregulated in myonuclei of regenerative myofibers. G Heatmap showing expression of genes involved in muscle growth in Control and DMD fast and slow myonuclei. H GSEA plots showing enrichment score (ES) of the significantly enriched hallmark gene sets in fast and slow myonuclei. A positive value of ES indicates enriched in DMD and a negative value indicates enriched in normal conditions and down-regulated in DMD. GSEA gene set enrichment analysis, NES normalized enrichment score, FDR false discovery rate. Oxid Oxidative. GNRH Gonadotropin hormone-releasing hormone.
In the case of regenerative myofibers, as they were just a minority in control patients (1.7%) and much more abundant in DMD (5.9%), we decided to analyze the genes increased in this population compared to slow and fast muscle fibers of patients with DMD. We found 69 upregulated and 63 downregulated genes in regenerative fibers. As expected, among the top ten upregulated genes we identified many involved in the process of muscle regeneration such as Myh3, Myh8, Tnnt2 or Cald1 involved in the regeneration of the myofibrillar system, but also genes involved in membrane fusion (Myof), neuromuscular junction development (Chrng or Macf1) or Hippo pathway (Rassf4) involved in myoblast activation and muscle growth [38, 39].
As myogenesis and cell growth were one of the main pathways upregulated in muscle fibers, we further investigated the mechanism controlling these processes [40]. We analyzed several pathways and as shown in Supplementary Fig. 9, we observed an upregulation of the MEF2 family of transcription factors that have been classically involved in the myogenic program, especially Mef2a and Mef2c, while Myf6 expression, an inhibitor of MEF2 was reduced in the muscle fibers [41]. Additionally, expression of Tead1 and Tead4, the downstream effectors of the Hippo pathway involved in muscle hypertrophy, were also upregulated in DMD muscle fibers, associated with an increase in Yap and Taz expression two cofactors of this pathway [42]. In concordance with these findings, pro-atrophic genes such as Murf1 (Trim63) and Atrogin-1 were downregulated in slow/fast muscle fibers. We did not observe an increase in the expression of genes belonging to the myostatin or insulin growth factor pathway in slow or fast myofibers.
FAPs from patients with DMD express genes related with cell proliferation and extracellular matrix remodeling
We compared the transcriptional profile of FAPs from healthy controls and DMD patients. DMD FAPs had a significant upregulation of 249 genes and a downregulation of 68 genes (log2FC > 0.5) compared to controls. Among the top upregulated genes, we found genes encoding different types of collagens (Col1a1, Col1a2, Col6a6, Col3a1, or Col21a1 among others), but also other components of the extracellular matrix such as elastin (Eln) and several fibulins. Genes encoding for proteins involved in extracellular matrix assembly such as Sned1, matrix remodeling (Adamtsl1), or interaction between cells and matrix such as laminins (Lamb1 and Lama4) were also upregulated [43]. As many of the genes produced by DMD FAPs were components of the extracellular matrix, we compared the expression levels of matrix components between control and DMD samples and observed significant differences, not only in the expression levels but also in the components identified as shown in Supplementary Fig. 7. Apart of the extracellular matrix genes, we observed an upregulation of genes involved in relevant signaling pathways such as PDGF and NCAM signaling, tyrosine kinase activation and, Rho-GTPase cycle suggesting that FAPs are not a mere producer of extracellular matrix but they could also play a role as a potential regulator of the activity of other muscle resident cells (Fig. 4B and D). Several markers of fibrosis were increased in the muscle samples of patients with DMD compared to controls as shown in Supplemental Fig. 10, including collagen I and VI, CTGF and PDGF-AA expression as well as TE7, a marker of fibroblast.
A List of the top ten genes upregulated in FAPs of DMD samples. B Top molecular pathways upregulated in DMD FAPs. C UMAP visualization of nuclei from FAPs of control and DMD individuals coloured by subpopulation identity. D Monocle analysis showing pseudotime trajectories of the re-clustered FAPs. E Heatmap showing selected gene expression across pseudotime trajectories. F Selected genes expressed in each FAP subcluster. G Population of subclusters of FAPs in Control and DMD samples.
Based on the myriad of biological processes upregulated in DMD samples, we decided to explore if there could be subpopulations of FAPs at different stages of differentiation that could be distinguished based on their gene expression profile. We re-clustered the FAP subgroup and identified seven different subclusters of cells as shown in Fig. 4C. These clusters shared the expression of many genes such as Dcn, Pdgfra, Col1a1 or Col3a1, however some genes that were preferentially expressed in some of the clusters (Fig. 4F). For example, cluster 0 that was predominant in control samples expressed higher levels of the antiproliferative genes Fos(p55) and Itih5, but also Col4a2 and Lama2. Cluster 4 was characterized by the expression of Fbn1 while cluster 5 was characterized by the expression of Lum and could confirm the existence of this population of FAPs in DMD patients recently described by Rubinstein et al. [44]. Lum+ FAPs were the ones expressing the highest levels of collagen related genes such as Col1a1 or Col3a1. Two of the clusters identified, cluster 3 and 6, were almost exclusively present in DMD samples and were characterized by the expression of genes involved in cell proliferation such as Ahnak, Ccdc102b, and Podn or Parp14, Bod1l1 or Smg1 respectively. Interestingly, cluster 6 was distinctively present in the patient that had the greatest decline in muscle function during follow up. Monocle analysis identified potential trajectories in the differentiation process of FAPs over time that started in Cluster 0, majority of controls, and end in Cluster 6 (Fig. 4D). Moreover we also identified genes differently expressed through the differentiation process (Fig. 4E).
As expected, the population of adipocytes, a type of cell known to derive from FAPs in the skeletal muscles, was higher in DMD samples than in controls. Further, we studied the molecular pathways activated in adipocytes based on their gene expression profile and observed that apart of pathways involved in lipid metabolism, adipocytes had an increased expression of genes of the Rho pathway and genes encoding for components of the basal lamina such as Col4a1, Col4a2, Lama4 and, Lamb1 (Supplementary Fig. 11).
Gene expression in different stages of disease progression
We were interested in investigating differences in the gene expression profile between DMD patients at different stages of disease progression. To do so, we reviewed the clinical and muscle function information present in the clinical notes and observed that there were consistent differences in clinical function at baseline and disease progression over the first four years after the muscle biopsy was obtained. As shown in Table 1, patients DMD 3, 5, 6, and 7 had mild muscle impairment at baseline and showed either stabilization or improvement in muscle function during follow-up period. On the other hand, patients DMD 1 and 2, showed a worse baseline performance and a decline in muscle function during the follow up period [45]. We explored if there were differences between control samples (Group A), stable patients (Group B) and declining patients (Group C) in the gene expression profile of muscle fibers and FAPs (Fig. 5). Muscle fibers from controls were enriched in the expression of genes such as Pdk4 and Txnip involved in the metabolism of glucose and lipids, Linc-Pint and Btg2 inhibiting cell division and, Trim63 (Murf1) involved in protein ubiquitination [46,47,48]. In the case of DMD patients, we did not observe many differences between Group B and C in the upregulated genes of, that were predominantly involved in muscle regeneration, either on satellite cell activation, membrane fusion, or sarcomere assembly (Meg3, Meg8, Myh3, Cald1, Igfn1, Myof or Myo18B). However, when we analyzed gene expression of FAPs among the three groups we did observe interesting results. As previously mentioned, control FAPs had a statically significant upregulation in the expression of antimitotic genes, such as Fos or Btg2 compared to DMD FAPS. FAPs from Group B (DMD stable patients) had a statically significant upregulated expression of the proapoptotic gene Xaf1, interferon induced genes such as Ifi44l or Mx2 or the profibrotic differentiation transcription factor Spry1 while, FAPs from Group C (DMD declining patients) had the highest expression of collagen genes (Col1a1, Col1a2, Col3a1) but also high expression of genes actively involved cell division (Ccdc80 or Ccdc102b), indicating that in the declining patients FAPs actively proliferate and express EXM components replacing the muscle fibers lost.
A UMAP of the subpopulations of myonuclei in control, stable, and declining DMD patients. B Heatmap showing the top upregulated genes in myonuclei from controls, stable and declining DMD patients samples. C Violin plot showing the expression of selected markers genes for myonuclei of controls, stable and declining DMD patients. C UMAP of the subpopulations of FAPs in control, stable, and declining DMD patients. B Heatmap showing the top upregulated genes in FAPs from controls, stable and declining DMD patients samples. C Violin plot showing the expression of selected markers genes for FAPs of controls, stable and declining DMD patients.
Communication between cells populations is dysregulated in DMD muscles
We studied the predicted intercellular communications of each cell population and compared the communication network between the control and DMD using CellChat package [49, 50]. The analysis revealed significant differences in the number of cell interactions. As shown in Fig. 6A, B, FAPs and satellite cells became the most important source of ligands in DMD, potentially interacting with all other cell populations. Adipocytes, which were mainly present in DMD samples, played also an important role in cell-to-cell communication in DMD samples. CellChat detected 55 significant ligand-receptor pairs in the Control dataset and 61 in the DMD samples among the 11 nuclei clusters (Fig. 6C, D). A number of molecular pathways were identified exclusively in DMD samples such as cadherins (CDH), NCAM, major histocompatibility class- I, and neuroregulin, while others were exclusively identified in Control samples including CD40, CD80 or IL-2 among others. Signaling pathways upregulated in DMD samples were involved in several processes such as cell migration and remodeling of extracellular matrix (FGF, Collagen, Laminin), nerve growth and reinnervation (NCAM, NGF, and NPR2), and inflammation (MHC-I, CXCL, THBS). A detailed analysis of the expression levels of those ligand-receptor pairs that showed more significant differences between DMD and control samples can be found in Supplemental Fig. 12. As FAPs were identified as the main producer of ligands outgoing to other cell populations both in control and DMD (Fig. 6E, F) we decided to investigate further the main molecular signals released by these cells (Fig. 6G). Collagens and laminins were the most upregulated molecules signals produced by FAPs in DMD, followed by others such as members of the PDGF and FGF family, but also tenascin, thrombospondin, and fibronectin. As collagen and laminin were components of the ECM and could potentially influence all muscle cell behavior, we decided to study more precisely their potential communication network. Network centrality analysis of the inferred collagen identified FAPs as the most prominent sources of collagen either in control and in DMD samples, acting onto endothelial and smooth muscle cells in control, but also onto adipocytes in DMD (Fig. 7). Notably, among all known ligand-receptor pairs, DMD collagen signaling was mainly dominated by collagen I, IV and VI and its receptor Itga1/Itga2 + Itgb1. FAPs were also the most prominent source of laminin either in control and DMD samples, although adipocytes became an important source as well in DMD (Fig. 8). In control samples, the laminin pathway was dominated by the Lama2 and Lamb1 ligands and its Itga1/Itgb1 and Itga7/Itgb1 receptors on endothelial and smooth muscles cells. In DMD, Lama4 and Lamb1 predominated acting through multiple Itga/Itgb receptors on adipocytes, regenerative fibers, and adipocytes in addition to smooth and endothelial cells.
A Heatmap showing differential number of interactions between clusters in DMD samples compared to controls. Red: increased interactions in DMD. Blue: Increased interactions in controls. B Chord plot displaying intercellular ligand-receptor interaction strength comparing DMD and control samples. Red: increased interactions in DMD. Blue: Increased interactions in controls. C Bar graph showing relative information flow per each signaling path in control(red) and DMD (green) samples. D Bar graph showing the weight of each signaling path in control (red) and DMD (green) samples. E Dot plot showing the weight of each cell cluster in outgoing-incoming signaling in control samples. F Dot plot showing the weight of each cell cluster in outgoing-incoming signaling in control samples. G Dot plot showing the main molecules released and received by FAPs in DMD and control samples.
A Hierarchical plot shows the inferred intercellular communication network for collagen signaling. This plot consists of two parts: Left and right portions highlight the autocrine and paracrine signaling to FAPs, regenerative fibers, satellite cells, and adipocytes and to the other clusters identified, respectively. Solid and open circles represent the source and target, respectively. Circle sizes are proportional to the number of cells in each cell group and edge width represents the communication probability. Edge colors are consistent with the signaling source. B Heatmap shows the relative importance of each cell group based on the computed network centrality measures of the collagen signaling network in control samples. C Hierarchical plot shows the inferred intercellular communication network for collagen signaling in DMD samples. D Heatmap shows the relative importance of each cell group based on the computed network centrality measures of the collagen signaling network in control samples. E Chord plot displaying intercellular communication network for collagen signaling in controls. F Chord plot displaying intercellular communication network for collagen signaling in DMD. G Relative contribution of each ligand-receptor pair to the overall communication network of a collagen signaling pathway in control samples, which is the ratio of the total communication probability of the inferred network of each ligand-receptor pair to that of the collagen signaling pathway. H Relative contribution of each ligand-receptor pair to the overall communication network of a collagen signaling pathway in DMD samples. I Violin plot showing the expression distribution of signaling genes involved in the inferred collagen signaling.
A Hierarchical plot shows the inferred intercellular communication network for laminin signaling. This plot consists of two parts: Left and right portions highlight the autocrine and paracrine signaling to FAPs, regenerative fibers, satellite cells, and adipocytes and to the other clusters identified, respectively. Solid and open circles represent the source and target, respectively. Circle sizes are proportional to the number of cells in each cell group and edge width represents the communication probability. Edge colors are consistent with the signaling source. B Heatmap shows the relative importance of each cell group based on the computed network centrality measures of the laminin signaling network in control samples. C The hierarchical plot shows the inferred intercellular communication network for laminin signaling in DMD samples. D Heatmap shows the relative importance of each cell group based on the computed network centrality measures of the laminin signaling network in control samples. E Chord plot displaying intercellular communication network for laminin signaling in controls. F Chord plot displaying intercellular communication network for laminin signaling in DMD. G Relative contribution of each ligand-receptor pair to the overall communication network of a laminin signalling pathway in control samples, which is the ratio of the total communication probability of the inferred network of each ligand-receptor pair to that of the laminin pathway. H Relative contribution of each ligand-receptor pair to the overall communication network of a laminin signaling pathway in DMD samples. I Violin plot showing the expression distribution of signaling genes involved in the inferred laminin signaling.