Primary cohort characteristics
In total, we identified 17,860 patients with a CKD diagnosis at Cedars-Sinai Medical Center (7.8% of the total patient sample), of whom 7,816 had an ECG taken within a 1-year window of CKD diagnosis. Our primary cohort consisted of 247,655 ECGs, of which 221,974 were randomized to the training set (for both training and validation) and 25,681 to the testing set. Of the primary cohort ECGs, 74.3% had no serum creatinine or eGFR estimation within 30 days, and 50.7% had no serum creatinine or eGFR estimation at any point in the EHR; however, this does not capture outside-hospital or paper clinic records of laboratory testing that might have been used in the diagnosis of CKD. The mean age of the primary cohort was 61.3 ± 19.7 years and 48% were female. Demographic and clinical characteristics are presented in Table 1, and the same characteristics according to age group are presented in Supplementary Table 2.
Model performance in the primary cohort
Our 12-lead ECG-based model discriminated any stage CKD with an AUC of 0.767 (95% CI 0.76–0.773). Model performance was consistent across CKD stages, with an AUC of 0.753 (0.735–0.770) in discriminating mild CKD, 0.759 (0.750–0.767) in discriminating moderate-severe CKD, and 0.783 (0.773–0.793) in discriminating ESRD. In all cases, negative examples were defined as ECGs without CKD diagnoses.
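For readers who want to reproduce this style of reporting, the sketch below shows one common way to compute an AUC with a percentile-bootstrap 95% CI. It is illustrative only: this section does not state how the confidence intervals were derived, so the bootstrap procedure, the function names `auc` and `bootstrap_auc_ci`, and the parameters `n_boot`, `alpha`, and `seed` are all assumptions.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (assumed method, not the authors')."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacked one of the classes; draw again
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi

# Toy data: 1 = ECG with a CKD diagnosis, 0 = ECG without one.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
point = auc(labels, scores)
low, high = bootstrap_auc_ci(labels, scores, n_boot=200)
```

The pairwise AUC is quadratic in sample size, so a rank-based implementation would be preferable for a test set of this scale.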
Given the increasing prevalence of wearable technologies, particularly devices that record single-lead ECGs, we trained an additional deep learning model on single-lead ECG data alone to simulate the DLA’s performance with wearable data. With 1-lead ECG waveform data, the DLA achieved an AUC of 0.744 (0.737–0.751) in detecting any stage CKD, with sensitivity and specificity of 0.723 (0.723–0.723) and 0.643 (0.643–0.643), respectively.
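Sensitivity and specificity, unlike AUC, depend on the decision threshold applied to the model's output score. The following minimal sketch shows how the two quantities are computed at a given cutoff. The threshold value and the function name `sensitivity_specificity` are hypothetical; the operating point used for the figures above is not stated in this section.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Return (sensitivity, specificity) when scores >= threshold are called positive."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    return tp / (tp + fn), tn / (tn + fp)

# Toy data: 1 = CKD, 0 = no CKD; 0.5 is an arbitrary illustrative cutoff.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.6, 0.2, 0.7, 0.1]
sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
```

Lowering the threshold raises sensitivity at the cost of specificity, which is why both are reported together for a fixed operating point.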
Since early detection of CKD is crucial to prevent disease progression and complications in older age, we tested the performance of our model in younger patients (<60 years of age). Among these patients, the 12-lead and 1-lead ECG-based DLAs detected any stage CKD with AUCs of 0.843 (0.836–0.852) and 0.824 (0.815–0.832), respectively.
We also tested the performance of our model separately among diabetic, hypertensive, and older patients, who are generally considered high-risk subgroups. The 12-lead model detected CKD with an AUC of 0.747 (0.707–0.783) among diabetic patients, 0.711 (0.696–0.725) among patients with hypertension, and 0.706 (0.697–0.716) among patients older than 60 years. When the model was trained with the 12-lead ECG waveform together with age, sex, diabetes, and hypertension, it achieved similar discrimination of any stage CKD in the held-out test set, with an AUC of 0.79 (0.781–0.798). Detailed results for 1-lead and 12-lead ECG-based DLA performance in the held-out test set are presented in Tables 2 and 3, while ROC curves are illustrated in Supplementary Fig. 1.
The model performed similarly in detecting CKD in subset populations of patients with albuminuria, patients with corresponding laboratory testing and documented eGFR, and in both ambulatory and in-hospital patients (Supplementary Table 3). In patients with both a CKD diagnosis and eGFR estimated to be less than 60 mL/min, the AUC was 0.754 (0.737–0.771), and performance was similar in patients with hyperkalemia, with an AUC of 0.741 (0.698–0.787), and without hyperkalemia, with an AUC of 0.758 (0.747–0.768). The model also performed well in patients with known albuminuria, with an AUC of 0.734 (0.723–0.745), and had similar performance regardless of the positive-to-negative ratio in the training set (Supplementary Table 4).
Electrocardiographic features in CKD
To understand which features allow our deep learning model to detect CKD, we performed two sets of experiments evaluating the ECG parameters that are important for identifying CKD. First, we found statistically significant differences in all available ECG variables (heart rate, PR interval, P wave duration, QRS duration, QTc interval, P-wave axis, R-wave axis, T-wave axis) between CKD stages (Supplementary Table 5).
Second, we used LIME to identify which ECG segments contributed most to the identification of CKD. Supplementary Fig. 2 shows examples of LIME-highlighted ECG segments in 12-lead and 1-lead ECG waveforms taken from correctly classified CKD and healthy control patients in the held-out test set. In both examples, the LIME-highlighted ECG features focused mostly on QRS complexes and PR intervals. In addition, QRS complexes and PR intervals in limb leads were most frequently highlighted, potentially denoting CKD-associated electrophysiologic alterations.
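The idea behind segment-level attribution can be illustrated with a simplified, stdlib-only sketch. It is not the LIME library and not the authors' pipeline: it masks every on/off combination of a few waveform segments, scores each perturbed signal with a stand-in model, and reports each segment's main effect (mean score with the segment present minus mean score with it zeroed). On such a full set of masks this equals the coefficient of an unweighted linear surrogate; LIME's random sampling and proximity-kernel weighting are deliberately omitted. The function `segment_importance` and the toy model are hypothetical.

```python
from itertools import product

def segment_importance(signal, n_segments, model):
    """Main effect of each segment on model(signal) over all on/off masks."""
    seg_len = len(signal) // n_segments
    on_sum = [0.0] * n_segments
    off_sum = [0.0] * n_segments
    for mask in product([0, 1], repeat=n_segments):
        perturbed = list(signal)
        for j, keep in enumerate(mask):
            if not keep:
                start = j * seg_len
                # The last segment absorbs any leftover samples.
                end = len(signal) if j == n_segments - 1 else start + seg_len
                for i in range(start, end):
                    perturbed[i] = 0.0  # zero out the masked segment
        score = model(perturbed)
        for j, keep in enumerate(mask):
            if keep:
                on_sum[j] += score
            else:
                off_sum[j] += score
    half = 2 ** (n_segments - 1)  # each segment is on in exactly half the masks
    return [(on_sum[j] - off_sum[j]) / half for j in range(n_segments)]

# Toy single-lead "waveform" with a tall deflection in the middle segment,
# and a stand-in model that responds only to the peak amplitude.
signal = [0.1] * 4 + [1.0] * 4 + [0.1] * 4
toy_model = lambda x: max(x)
weights = segment_importance(signal, n_segments=3, model=toy_model)
top_segment = weights.index(max(weights))
```

Full enumeration is only feasible for a handful of segments; with many segments, masks would be sampled at random and a weighted linear model fitted, as LIME does.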
External validation cohort characteristics
The external validation cohort consisted of 896,620 ECGs from 312,145 patients. The prevalence of mild CKD was 1.2%, while 3.6% had moderate-severe CKD and 0.9% had ESRD. The mean age of the external validation cohort was 56.7 ± 18.7 years and 50.4% were female. The proportion of Caucasian patients was 47.5%, while 3.6% were Black, 12.3% were Asian, and 36.6% had other or unknown race. Demographic and clinical characteristics are presented in Table 1.
Model performance in the external validation dataset
In the external validation dataset, the performance of our 12-lead and 1-lead models was comparable to that in the primary cohort. The 12-lead ECG-based model achieved an AUC of 0.709 (0.708–0.710) in discriminating any stage CKD, and the 1-lead ECG-based model detected any stage CKD with an AUC of 0.701 (0.700–0.702).
Consistent with the primary cohort, in which our model achieved higher CKD detection accuracy among younger patients, the 12-lead and 1-lead ECG-based models achieved AUCs of 0.784 (0.782–0.786) and 0.777 (0.775–0.779), respectively, in detecting any stage CKD among subjects under 60 years of age. Detailed results for 1-lead and 12-lead ECG-based DLA performance in the external validation cohort are presented in Supplementary Tables 6 and 7.