Study selection
The process of study identification, screening, and inclusion is displayed in the PRISMA flow diagram in Fig. 1. Of the 3056 studies identified through database searches (2503 after duplicate removal), 14 evaluated the value for money of either website26,27,28,29,30, text-messaging27,31,32,33,34,35,36, or smartphone application interventions29,37,38,39. An overview of study characteristics, intervention details, and health economic outcomes can be found in Table 1. Supplementary note 1 lists the studies excluded at full-text screening, with reasons. Augustovski et al.31 and Zhang et al.36 reported on the same trial: the former was a trial-based analysis, whereas the latter was model-based and extrapolated costs and effects over the long term.
Study characteristics
Included studies reflected a broad geographic distribution with one study conducted in North America26, one in Central America32, two in South America31,36, four in East Asia34,37,38,39, one in South Asia33, two in the Middle East30,35, and three in Europe27,28,29. Eight studies included people with T2DM27,29,30,32,33,35,37,39, two included people with prediabetes26,34, and four studies focused on people with hypertension28,31,36,38.
Five studies were within-trial analyses with a time horizon between 6 and 18 months and a public healthcare system perspective27,28,31,33,39, while one was a retrospective matched cohort study applying similar analytics35. The within-trial analysis of Derakshandeh-Rishehri et al.30 applied a patient perspective, although this is disputable. One study used a decision tree-based model with a time horizon of 6 months and a patient perspective38. Five studies used a Markov model to estimate long-term (i.e., 10 years to lifetime) costs and effects based on clinical trial inputs; four of these applied a public healthcare system perspective29,32,34,36 and one a healthcare payer perspective26. Finally, one Markov-model study did not stem directly from a particular implementation study (i.e., all input parameters were literature driven) and applied a 20-year horizon37. All studies with a time horizon of more than 1 year applied discount rates of between 3 and 5% to both future costs and health outcomes26,31,32,34,36,37.
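As an illustration of what discounting at 3 to 5% does to long-horizon results, the following minimal sketch computes the present value of a yearly stream of costs or QALYs. The function name, the convention that the first year is left undiscounted, and all input values are assumptions made for this example; they are not taken from any of the included studies.

```python
def present_value(yearly_values, rate):
    """Discount a stream of yearly values to its present value.

    Assumption: the first entry (year 0) is not discounted; some
    guidelines instead discount from the first year onwards.
    """
    return sum(v / (1.0 + rate) ** t for t, v in enumerate(yearly_values))


# Hypothetical 10-year streams, for illustration only.
yearly_costs = [100.0] * 10   # cost per person per year
yearly_qalys = [0.8] * 10     # QALYs per person per year

for rate in (0.03, 0.05):
    pv_cost = present_value(yearly_costs, rate)
    pv_qaly = present_value(yearly_qalys, rate)
    print(f"rate={rate:.0%}: discounted cost={pv_cost:.1f}, discounted QALYs={pv_qaly:.2f}")
```

A higher discount rate gives less weight to costs and effects that occur late, which is one reason why results of long-horizon models can be sensitive to the chosen rate.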
Interventions
Four studies evaluated the use of smartphone applications, one in people with hypertension38 and three in people with T2DM29,37,39. Smartphone applications were used for monitoring, treatment adaptation, and communication between patients and healthcare professionals (in Tsuji et al.37, also for communication with family). The smartphone applications in Li et al.39 and Cunningham et al.29 were also used for patient education. The smartphone application in Zhang et al.38 included a health agenda (i.e., reminders for follow-up). None of the four studies on smartphone applications included non-digital intervention features.
Seven text-messaging27,31,32,33,34,35,36 and five website-based studies26,27,28,29,30 were included. One intervention combined text-messaging and websites27, while five other interventions also comprised (non-)digital health modalities such as the implementation of a case manager or teleconsultation26,28,31,32,35,36. Text-messaging was used to encourage the adoption of healthier lifestyle behaviours by participants. The length of the intervention ranged from 16 weeks to 2 years, and the frequency of text messages could be as high as daily but it was not always reported. The website-based intervention component consisted of educational web pages and social network support groups, often in addition to teleconsultation, face-to-face follow-up, and/or telemonitoring.
Interventions were compared to care as usual26,29,30,31,33,35,36,37,38 or an enhanced version of care as usual (comprising self-management training, education, and/or physician training)27,28,32,38,39.
Health outcomes
Ten studies reported the cost per quality-adjusted life year (QALY) as the primary health economic outcome27,28,29,31,32,33,34,36,37,38. Some studies reported clinical outcomes such as systolic blood pressure reduction28,31, HbA1c reduction30,33,35, the proportion of the population reaching hypertension control31 or glycemic control39, life years gained34, and points gained on the Problem Areas in Diabetes (PAID) scale27. The cost-minimisation study of Chen et al.26 reported the return on investment.
Quality appraisal
Table 2 shows the critical appraisal of the quality of the selected studies. More than half of the included studies did not provide sufficient detail on the comparators (i.e., what care as usual actually means). Nine studies did not describe important costing aspects such as how the costs were measured or the sources of cost valuation26,28,29,30,33,35,37,38,39. A rather short time horizon was applied in more than half of the studies27,28,30,31,33,35,38,39, even though a long time horizon is recommended when evaluating the cost-effectiveness of interventions for chronic diseases in order to capture all relevant costs and effects. Moreover, all but one study27 failed to provide sufficient argumentation for choosing a perspective other than the societal one. Finally, only six studies reported both probabilistic sensitivity results and another kind of sensitivity analysis, such as threshold analysis or one-way sensitivity analysis, on top of the point estimate results27,28,29,31,32,36.
Data synthesis
Among the studies expressing results in QALYs, the incremental cost-utility ratios (ICURs) varied between dominant (i.e., less costly and better health outcomes) and €75,233/QALY, with a median of €3840/QALY (interquartile range €16,179). One study did not find a QALY difference (Fig. 2). None of the three digital health intervention modes was associated with substantially better cost-effectiveness results than the others. Four out of fourteen studies (one on text messaging, two on mainly smartphone applications, and one on website-based education) reported cost-saving results26,29,34,39.
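For readers less familiar with these terms, the sketch below shows one way an ICUR can be derived from incremental costs and incremental QALYs, how a dominant result is distinguished from a ratio, and how a median and interquartile range could be computed over the numeric ratios. The function names and all input values are hypothetical, and restricting the summary to numeric ratios is an assumption of this example rather than a description of how the summary statistics above were obtained.

```python
from statistics import median, quantiles


def icur(delta_cost, delta_qaly):
    """Classify an intervention versus its comparator.

    Returns a (label, value) pair; value is the cost per QALY when a
    ratio is meaningful, otherwise None.
    """
    if delta_qaly == 0:
        return ("undefined", None)       # no QALY difference, no ratio
    if delta_cost < 0 and delta_qaly > 0:
        return ("dominant", None)        # less costly and more effective
    if delta_cost > 0 and delta_qaly < 0:
        return ("dominated", None)       # more costly and less effective
    # Note: a ratio from lower costs AND fewer QALYs needs separate interpretation.
    return ("ratio", delta_cost / delta_qaly)


def summarise(ratios):
    """Median and interquartile range of numeric ICURs (assumed approach)."""
    q1, _, q3 = quantiles(ratios, n=4)   # default 'exclusive' method
    return median(ratios), q3 - q1


# Hypothetical (delta_cost, delta_qaly) pairs, for illustration only.
results = [icur(dc, dq) for dc, dq in [(200.0, 0.05), (-50.0, 0.02), (900.0, 0.03), (120.0, 0.0)]]
numeric = [value for label, value in results if label == "ratio"]
print(results)
print(summarise(numeric))
```

Quartile definitions differ between software packages, so an interquartile range computed this way may deviate slightly from one obtained with another method.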
Fig. 2 notes: McManus et al.28 did not calculate an ICUR because the QALY difference was not significant. ICUR estimates in Cunningham et al.29 and Wong et al.34 were dominant. CG: control group. An asterisk (*) denotes studies targeting populations with hypertension; studies without an asterisk include people with (pre)diabetes.
Smartphone applications were appraised by the studies’ authors as cost-effective37,38 or dominant29,39 compared to usual care. However, the cost-effective results in Tsuji et al.37 were associated with considerable uncertainty and should be confirmed by future trial data, as effectiveness data were simulated and the prediction model had been built on major assumptions. Li et al.39 did not report uncertainty analyses. Furthermore, the smartphone application in Zhang et al.38 was reported as not cost-effective compared to a self-management intervention: the QALY gain was higher but came at a considerable cost, so the self-management strategy appeared to be the preferred strategy from a health economic perspective (Fig. 2).
Text-messaging alone, or in combination with other intervention aspects (such as teleconsultation, telemonitoring, case management), was found to be cost-effective27,31,32,33,35,36 or even cost-saving34. Although QALY gains were limited (ranging from a 0.01 increment per target person after 6 months in Islam et al.33 to a 0.22 increment per target person over a lifetime horizon in Gilmer et al.32), the ICUR appeared to be robust in probabilistic sensitivity analysis27,31,32,36. This may be related to the low intervention costs: Islam et al.33 demonstrated that programme costs could at least be doubled while the intervention remained cost-effective, and Wong et al.34 calculated that programme costs could be 50 times greater before the break-even point would be reached. Moreover, Li et al.27 argued that the health economic results of text-messaging can be improved further by upscaling, so that the cost per person decreases. Importantly, the intervention in Gilmer et al.32 became cost-effective only after 10–20 years, which was inconsistent with other studies that demonstrated cost-effectiveness in the short term27,31,33.
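The kind of threshold reasoning behind such statements can be sketched as follows. The example assumes a simplified cost structure in which the incremental cost equals the programme cost minus downstream savings; the function name, this cost structure, and all numbers are hypothetical and are not taken from Islam et al.33 or Wong et al.34.

```python
def max_cost_multiplier(programme_cost, downstream_savings, delta_qaly, threshold=0.0):
    """Largest factor by which the programme cost can be scaled while the
    intervention stays at or below a willingness-to-pay threshold.

    Assumed model: incremental cost = m * programme_cost - downstream_savings.
    Requiring (incremental cost / delta_qaly) <= threshold with delta_qaly > 0
    gives m <= (threshold * delta_qaly + downstream_savings) / programme_cost.
    With threshold=0 this is the cost-neutral break-even point.
    """
    if programme_cost <= 0:
        raise ValueError("programme_cost must be positive")
    if delta_qaly <= 0:
        raise ValueError("delta_qaly must be positive for this inequality to hold")
    return (threshold * delta_qaly + downstream_savings) / programme_cost


# Hypothetical inputs, for illustration only.
print(max_cost_multiplier(programme_cost=25.0, downstream_savings=100.0, delta_qaly=0.05))
print(max_cost_multiplier(programme_cost=100.0, downstream_savings=0.0,
                          delta_qaly=0.01, threshold=30000.0))
```

The lower the programme cost relative to downstream savings or to the monetary value of the QALY gain, the larger this multiplier becomes, which fits the observation that low-cost text-messaging programmes tend to give robust results.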
Website-based interventions appeared to be cost-effective27,28,30, dominant29, or cost-saving26, even though only an effect in natural units (i.e., a reduction in systolic blood pressure; the incremental number of QALYs was not significant) was found in the study by McManus et al.28. Nevertheless, scenario analyses in which the intervention effect partly faded away, as well as probabilistic sensitivity analyses, showed the results to be robust at the given thresholds26,27,28,29.
Sensitivity and subgroup analyses were limited in most studies, which restricts the identification of cost-effectiveness drivers. First, Augustovski et al.31 reported on patient baseline characteristics: the intervention appeared to have greater value for money in younger people, in people with higher cardiovascular risk or higher body mass index, and in women. The gender difference was reported by Cunningham et al.29 as well.
Second, intervention aspects influenced the ICER/ICUR as well. Augustovski et al.31 reported that lower treatment adherence resulted in a less intensive and therefore less costly intervention, which was indicative of better cost-effectiveness, although the observed differences were not statistically significant. Meanwhile, drop-out rates did not impact the ICER/ICUR in Wong et al.34. Costs were important drivers of cost-effectiveness in other studies as well33,38.
Third, modelling assumptions were the most investigated driver of cost-effectiveness results. The value for money improved with longer time horizons26,32, while the impact of transition probabilities, utility values, and discount rate on the ICUR was mixed34,37,38.
Whether digital health interventions targeting (pre)T2DM versus hypertension populations resulted in different cost-effectiveness outcomes is difficult to assess because only four studies (reporting on three trials) targeted populations with hypertension. However, it seems that digital health interventions targeting (pre)T2DM populations showed consistently positive cost-effectiveness results26,27,29,30,32,33,34,35,37,39, while cost-effectiveness results in hypertension populations were more mixed28,31,36,38.
Whereas six studies evaluated one particular digital health mode, two studies combined two of the digital health modes under investigation27,29, two studies evaluated a digital health mode as part of a broader digital intervention including telemonitoring26,35, and four studies (three interventions) evaluated a digital health mode as part of a broader health system intervention including digital and non-digital components28,31,32,36. Website-based interventions, text messaging, and smartphone applications were complemented by, or were seen as a complement to, other intervention components in four out of five, four out of six, and one out of four cases, respectively. Gilmer et al.32 and Zhang et al.36 evaluated two of the broader health system interventions and found relatively higher health effects (0.22 and 0.13 QALYs, respectively) compared to stand-alone interventions. Note that these two studies applied a long-term time horizon, in contrast to McManus et al.28, who evaluated a broad health system intervention and found only a systolic blood pressure reduction in the short term but no QALY improvement.