Base editors cause unintended indels at the target sites
Several evolved CRISPR-based base editors have been developed for more accurate and efficient genome engineering21,37. Among these, AncBE4max and ABEmax were developed by optimizing codon usage and NLSs and by applying ancestral deaminase reconstruction21. These modifications greatly improve the efficiencies of these base editors in cells, enabling effective SNP corrections21.
To determine whether previously reported base editors (BE3, AncBE4max, ABE, and ABEmax) induce unwanted mutations in the target sequences, we selected four human target genes and measured base editing and indel efficiencies at the cellular level (Fig. 1a–f). As reported in previous studies, the nucleotide conversion efficiencies of the enhanced editors AncBE4max and ABEmax were much higher than those of the other base editors (Fig. 1a, c). Among the CBE variants, AncBE4max exhibited the highest substitution efficiency, up to 82.2% at the HEK3 target site, while inducing fewer indels than BE3 (Fig. 1a, b). In particular, indels were observed at frequencies of up to 6.5% and 3.2% at the HEK3 target site for BE3 and AncBE4max, respectively, and most indels occurred at or near the SpCas9 cleavage site (Fig. 1b, e). In the ABE system, ABEmax generally produced more substitutions and indels than ABE (Fig. 1c, d). The ABE variants generally produced fewer indels relative to their substitution efficiency than the CBE variants but still produced indels at the Cas9/sgRNA-induced cleavage sites (Fig. 1d, f). These results demonstrate that the previously reported base editing systems induce unintended indels at the DNA target sites.
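The substitution and indel frequencies reported above are percentages of sequencing reads. As a minimal sketch of that arithmetic, assuming reads that span the whole amplicon and a deliberately simplified rule in which any length difference from the reference counts as an indel, the calculation could look like the following. The function names, the toy reference, and the toy reads are hypothetical and are not taken from the study; real analyses align reads before calling indels.

```python
from collections import Counter

def classify_read(read, reference, target_pos, edited_base):
    """Classify one amplicon read as 'indel', 'edited', or 'unedited'.

    Simplifying assumption: every read covers the full amplicon, so a
    length difference from the reference is treated as an indel.
    """
    if len(read) != len(reference):
        return "indel"
    if read[target_pos] == edited_base:
        return "edited"
    return "unedited"

def frequencies(reads, reference, target_pos, edited_base):
    """Return the percentage of reads falling into each class."""
    counts = Counter(
        classify_read(r, reference, target_pos, edited_base) for r in reads
    )
    total = len(reads)
    return {
        label: 100.0 * counts.get(label, 0) / total
        for label in ("edited", "indel", "unedited")
    }

if __name__ == "__main__":
    # Toy data only: a 5-nt "amplicon" with the target C at index 1.
    toy_reference = "ACCGT"
    toy_reads = ["ACCGT", "ATCGT", "ATCGT", "ACGT"]
    print(frequencies(toy_reads, toy_reference, target_pos=1, edited_base="T"))
```

With the four toy reads, two carry the C-to-T change, one is shorter than the reference, and one is unchanged, giving 50%, 25%, and 25%, respectively.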
nCas9 results in indels at the target sites
We found that nCas9-based base editors induce indels at the target sites. Previous studies have shown that DNA single-strand breaks (SSBs) can cause indel mutations via repair processes38,39,40. We therefore hypothesized that the unintended indels at the target sites were generated by the nCas9 (D10A) component of the base editors. To test this, nCas9 and sgRNAs targeting ARG1, LRP5, ADAMTS4, EIF3D, MYOCD, HEK3, or HBB-E2 were delivered to human HEK293T cells, and the frequency of indels in the target sequences was analyzed. Indels occurred at frequencies of up to 7.86% in the target sequences of the nCas9-treated group and had the same patterns as those induced by wild-type Cas9 (Fig. 2a–e). nCas9 also caused levels of indel mutations similar to those caused by the base editors at the desired target sites. These results suggest that most unintended indel mutations at the target sites were induced by the nCas9 component of the base editors.
Unintended indels caused by base editors can be removed with the use of dCas9
To remove unwanted indels at the target site, nCas9 in the base editor was replaced with catalytically inactivated dCas9 (D10A and H840A) (Fig. 3a–d). Consequently, the unintended indels were mostly removed from all targets in both the CBE and ABE variants using dCas9 (dBE3, AncdBE4max, dABE, and dABEmax) (Fig. 3b, d). However, as described in a previous study21, the nucleotide conversion efficiency of C-to-T or A-to-G was also reduced simultaneously in all the targets (Fig. 3a, c). These data indicate that substitution of nCas9 with dCas9 could prevent unintended indels at the target sites; however, the reduction in the base editing efficiency caused by the use of dCas9 must also be addressed.
Use of CMPs in dCas9-based base editing systems can improve substitution efficiency without causing indels
To restore the editing efficiency lost with dCas9, we applied CMPs to the base editing system. CMPs can improve editing efficiency by unwinding the closed chromatin structures of the target sites41. Indeed, our previous study revealed that CMPs can open the closed chromatin structures of the target sites and improve the editing efficiencies of prime editors42. We utilized human-derived high-mobility group nucleosome-binding domain 1 (HN1) and histone H1 central globular domain (H1G) as the CMP domains for dCas9-carrying base editors. To find the optimal locations of the CMP domains, HN1 and H1G were placed at various locations in the base editors, and their efficiencies were determined (Supplementary Fig. 1a). dBP2b was the most effective among the CMP-introduced CBE variants (dBP1a, dBP1b, dBP2a, and dBP2b), with a base editing efficiency comparable to or slightly lower than that of AncBE4max and complete elimination of unwanted indels (Fig. 4a–f).
Next, we generated CMP-introduced ABE variants (dAP1a, dAP1b, dAP2a, and dAP2b) using dCas9 and CMPs, following the same strategy as for the CBE variants (Supplementary Fig. 1b). Similarly, in contrast to ABE or ABEmax, none of the CMP-introduced ABE variants induced indels, and dAP1b showed the highest editing efficiency among them (Fig. 4g–l). To exclude targeting bias, the CMP-introduced CBE and ABE variants were applied to 23 and 21 human target sites, respectively; at most targets, the ABE variants induced indels less frequently than the CBE variants (Fig. 4m, n, p, q). The CMP-introduced CBE variants did not induce indels, except at one target with dBP1b (HFE; 0.9%). In contrast, BE3 and AncBE4max consistently generated indel mutations, at frequencies of up to 7.9%, at the intended locations. Among the CMP-introduced CBE variants, dBP2b showed an average 3.0-fold increase in base editing efficiency compared with AncdBE4max; in particular, the efficiency improved by up to 28.4-fold at the POU5F1 target (Fig. 4o). For dAP1b, the editing efficiency increased by an average of 8.8-fold compared with dABEmax, with an increase of up to 112.5-fold at the ADAMTS4 target site (Fig. 4r). Although the CMP-introduced CBE and ABE variants did not outperform the improved AncBE4max and ABEmax in base editing efficiency, most of the unwanted indel mutations in the target sequences were eliminated. In most targets, neither ABE nor ABEmax induced as many indels as BE3 or AncBE4max (Fig. 4n, q). Additionally, the base editing efficiencies of dAP1a and dAP1b were lower than those of ABE or ABEmax but higher than that of dABEmax (Fig. 4q, r).
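The average and maximum fold increases quoted above are ratios of editing efficiencies between a CMP-introduced variant and its dCas9 baseline. The sketch below shows one way such a summary could be computed, assuming the average is the mean of per-target ratios (the text does not state whether a mean of ratios or a ratio of means was used). The function names, target labels, and efficiency values are hypothetical placeholders, not data from the study.

```python
from statistics import mean

def fold_changes(variant_eff, baseline_eff):
    """Per-target fold change (variant / baseline).

    Targets whose baseline efficiency is zero or missing are skipped,
    because the ratio is undefined there.
    """
    return {
        target: variant_eff[target] / baseline_eff[target]
        for target in variant_eff
        if baseline_eff.get(target, 0) > 0
    }

def summarize(folds):
    """Return (mean fold change, maximum fold change) across targets."""
    values = list(folds.values())
    return mean(values), max(values)

if __name__ == "__main__":
    # Placeholder efficiencies in percent; not measured values.
    variant = {"siteA": 12.0, "siteB": 3.0, "siteC": 5.0}
    baseline = {"siteA": 4.0, "siteB": 2.0, "siteC": 0.0}
    folds = fold_changes(variant, baseline)
    print(folds, summarize(folds))
```

In this toy input, siteC is excluded because its baseline is zero, and the remaining ratios of 3.0 and 1.5 give a mean of 2.25 and a maximum of 3.0. Very large fold changes, such as those reported at individual targets, typically arise when the baseline efficiency is close to zero.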
As ABEmax exhibits high editing efficiency and low indel frequency, we tested whether replacing dCas9 with nCas9 in dAP1a and dAP1b (yielding nAP1a and nAP1b, respectively) would result in fewer indels and higher base editing efficiency than ABEmax (Supplementary Fig. 2a–c). Both nAP1a and nAP1b exhibited increased A-to-G substitution efficiencies owing to the use of nCas9 (Supplementary Fig. 2a). Specifically, the substitution efficiency of nAP1b was enhanced by up to 493.2-fold across 18 human gene targets compared with that of ABEmax (Supplementary Fig. 2c). However, indels also increased at three targets with nAP1a and four targets with nAP1b (Supplementary Fig. 2b). Collectively, our results suggest that using dCas9 and CMPs in base editing systems is necessary to achieve precise nucleotide conversions without undesirable indels.
ABE8e, one of the most recently evolved ABEs, maximizes editing efficiency but can induce many indels alongside its high A-to-G substitution efficiency31,32. To eliminate unintended indels while maintaining high A-to-G substitution efficiency, we applied dCas9 and CMPs to ABE8e to generate dAP1b8e (Supplementary Fig. 3). dAP1b8e showed an ~10–20% lower A-to-G conversion efficiency than ABE8e at five human targets but eliminated indel mutations (Supplementary Fig. 4a–j).
Since dBP2b and dAP1b8e have the highest nucleotide conversion efficiencies among the CMP base editor variants, we compared them with non-CMP-conjugated base editors at eight targets in mouse cells. dBP2b increased editing efficiency by an average of 2.0-fold (up to 3.1-fold) relative to AncdBE4max, and dAP1b8e increased it by an average of 1.4-fold (up to 1.7-fold) relative to dABE8e, without generating indels (Fig. 5). These data suggest that dCas9-based base editors with CMPs generally work similarly in mice and humans.
dBP2b and dAP1b8e efficiently induce base editing without unwanted indels at the target sequences in mouse primary myoblasts
To evaluate whether the base editing frequency could be improved without introducing unintended indel mutations, we tested the optimized dBP2b and dAP1b8e variants in mouse primary myoblast cells (Fig. 6a). We first designed a stop codon (CAG > TAG, Q871*) by C-to-T conversion at the end of mouse Dystrophin (Dmd) exon 20 using CBE variants (BE3, AncBE4max, AncdBE4max, or dBP2b) and a Dmd-targeting sgRNA30 (Fig. 6b). Comparing the C-to-T conversion efficiencies of the CBE variants in the mouse myoblast line C2C12, we found that dBP2b exhibited significantly higher editing efficiency than AncBE4max without causing indels (Fig. 6c, d). We also compared the A-to-G conversion and indel frequencies among ABE8e, dABE8e, and dAP1b8e by applying a strategy to correct the Dmd knockout allele in Dmd+/Q871* NIH3T3 cells. The Dmd+/Q871* NIH3T3 cell line has a C-to-T conversion at one allele in exon 20 of the mouse Dmd gene, resulting in a stop codon (Fig. 6b). We found that dAP1b8e induced A-to-G transitions more efficiently than ABE8e without any indel mutations in the Dmd+/Q871* cells (Fig. 6e, f). Interestingly, both dBP2b and dAP1b8e showed higher base substitution efficiencies than the other CBE or ABE variants in mouse cells.
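The Q871* design described above relies on a single C-to-T conversion turning the glutamine codon CAG into the stop codon TAG. As a minimal illustration of that codon-level logic, the sketch below applies a C-to-T change at a chosen position of a codon and checks whether the result is a stop codon. The function names are hypothetical, and the sketch ignores the sgRNA, the editing window, and strand orientation.

```python
# Standard stop codons in DNA notation.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def c_to_t(codon, position):
    """Return the codon after a C-to-T conversion at a 0-based position.

    Raises ValueError if the base at that position is not C, since a
    cytosine base editor acts on cytosines.
    """
    if codon[position] != "C":
        raise ValueError(f"position {position} of {codon} is not C")
    return codon[:position] + "T" + codon[position + 1:]

def creates_stop(codon, position):
    """True if C-to-T conversion at the position yields a stop codon."""
    return c_to_t(codon, position) in STOP_CODONS

if __name__ == "__main__":
    print(c_to_t("CAG", 0), creates_stop("CAG", 0))
```

For the codon in the text, `c_to_t("CAG", 0)` gives `TAG` and `creates_stop("CAG", 0)` is true; the same check applies to the other codons that become stop codons after one C-to-T change on the coding strand (CAA and CGA).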
Then, as an alternative to in vivo experiments, we assessed the nucleotide substitution and indel efficiencies of the dBP2b and dAP1b8e variants using mouse primary myoblast cells. These cells were isolated from neonatal skeletal muscle of wild-type (WT) or Dmd Q871* KO mice within 5 days of birth (Supplementary Fig. 5). To verify that the desired base editing can be achieved accurately without unintended insertions or deletions even after long-term expression, we expressed plasmids encoding dBP2b and dAP1b8e, the most optimized base editor variants, in primary myoblasts for up to 10 days (Fig. 6g; Supplementary Fig. 6). All CBE variants (AncBE4max, AncdBE4max, and dBP2b) exhibited gradually increasing C-to-T conversion efficiencies over time. Notably, dBP2b demonstrated the highest base editing efficiency, reaching up to 36.5% on day 10, without any unintended indels at the target sites (Fig. 6h, i). Similarly, dAP1b8e showed a higher A-to-G substitution efficiency (4.3%) than ABE8e or dABE8e, and none of the ABE variants induced any indels (Fig. 6j, k). These results demonstrate that dBP2b and dAP1b8e are highly effective for achieving accurate and precise base editing without unintended indel mutations in primary myoblasts, even after long-term expression.
dBP2b and dAP1b8e can induce low off-target effects
To evaluate the off-target effects of dBP2b and dAP1b8e at the DNA and mRNA levels, we used Cas-OFFinder (www.rgenome.net) to identify 29 potential off-target candidates (OT1-OT29) in the mouse genome with up to three nucleotide mismatches to the Dmd-targeting sgRNA (Fig. 7a–d; Supplementary Table 5). dBP2b did not induce any C-to-T conversions or indels at the off-target candidate sites at the DNA level. One of the potential off-target sites, OT29, was located in an exon of the Gpm6b gene. We therefore performed targeted deep sequencing after cDNA synthesis to assess off-target effects at the mRNA level, and dBP2b showed no off-target effects at the mRNA level (Fig. 7a, c; Supplementary Fig. 7).
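The candidate selection above amounts to finding genomic sites that differ from the sgRNA target sequence by at most three nucleotides. The sketch below illustrates that mismatch criterion only; it is not the algorithm or interface of the tool used in the study. It assumes an NGG PAM immediately 3' of the site, scans the forward strand only, and does not model bulges. All names and the toy sequences are hypothetical.

```python
def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def find_candidates(genome, protospacer, max_mismatches=3):
    """Scan the forward strand for candidate sites.

    A site qualifies if it is followed by an NGG PAM and differs from
    the protospacer at no more than max_mismatches positions.
    Returns a list of (start, site, pam, mismatches) tuples.
    """
    n = len(protospacer)
    hits = []
    for start in range(len(genome) - n - 3 + 1):
        site = genome[start:start + n]
        pam = genome[start + n:start + n + 3]
        if pam[1:] != "GG":  # assumed NGG PAM requirement
            continue
        mismatches = hamming(site, protospacer)
        if mismatches <= max_mismatches:
            hits.append((start, site, pam, mismatches))
    return hits

if __name__ == "__main__":
    # Toy 8-nt "protospacer" and toy sequence, for illustration only.
    toy_target = "ACGTACGT"
    toy_genome = "TT" + "ACGTACGT" + "AGG" + "CC" + "ACGAACGT" + "TGG" + "A"
    for hit in find_candidates(toy_genome, toy_target, max_mismatches=1):
        print(hit)
```

In the toy sequence, the scan reports the perfect match at position 2 and a one-mismatch site at position 15. A genome-scale search would also need the reverse strand and a much faster matching strategy.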
dAP1b8e likewise showed no off-target effects at the mRNA level at OT29, the candidate site located in an exon of the Gpm6b gene (Fig. 7d). However, at the DNA level, A-to-G conversions and indel mutations occurred at two off-target sites (OT5 and OT6) located in intergenic regions, at low frequencies of <1% (Fig. 7b; Supplementary Fig. 8). In summary, our results indicate that dBP2b and dAP1b8e can achieve precise base editing without indels at the desired target sites and can induce fewer off-target effects than other existing base editors even when expressed over a long time, increasing their feasibility for clinical applications.