Wednesday, June 7, 2023

Effects of a virtual voice-based coach delivering problem-solving treatment on emotional distress and brain function: a pilot RCT in depression and anxiety – Translational Psychiatry

The Institutional Review Board for the University of Illinois Chicago (UIC) approved the study. All participants provided written consent. The study was registered on ClinicalTrials.gov (NCT04524104).


Enrollment followed a multi-step process (Fig. 1). Participants were recruited between April 5, 2021, and October 7, 2021, from the outpatient care clinics at the University of Illinois Hospital and Health Sciences System (UI Health) and employee email listservs at UIC, a minority-serving institution.

Fig. 1: CONSORT chart.

Flowchart regarding the enrollment and randomization of participants.

Adults were deemed eligible if they had a Patient Health Questionnaire-9 (PHQ-9) score of 10–19 and/or a Generalized Anxiety Disorder Scale (GAD-7) score of 10–14, without serious medical or psychiatric comorbidities or other exclusions (see Supplementary Material, Section A; also see full protocol in Supplementary Material, Section H). Participants were asked to self-identify their race and ethnicity based on fixed categories to comply with the National Institutes of Health’s reporting requirements.
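As a minimal sketch of the score-based part of this screen, the check below encodes the two stated ranges under a literal reading of "and/or". The function name `meets_score_criteria` and the `has_exclusion` flag are hypothetical; the flag stands in for the comorbidity and other exclusions listed in Supplementary Material, Section A, which are not reproduced here.

```python
def meets_score_criteria(phq9: int, gad7: int, has_exclusion: bool = False) -> bool:
    """Return True when at least one screening score is in its stated range.

    Assumption: any exclusion (for example a serious comorbidity) overrides
    the score-based criteria; the actual exclusion list is in the supplement.
    """
    depression_in_range = 10 <= phq9 <= 19   # PHQ-9 score of 10-19
    anxiety_in_range = 10 <= gad7 <= 14      # GAD-7 score of 10-14
    return (depression_in_range or anxiety_in_range) and not has_exclusion
```

For example, `meets_score_criteria(12, 3)` and `meets_score_criteria(5, 11)` both pass the score screen, while `meets_score_criteria(5, 5)` does not.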

Participants were compensated for taking part. Each participant made two neuroimaging visits: baseline (visit 1) and 16 weeks (visit 2). Upon completion of visit 1, all participants received $50 in compensation. At visit 2, participants in the Lumen intervention arm could choose to receive $100 in compensation or keep the iPad with their access to Lumen deactivated. At visit 2, participants in the waitlist control arm could choose to receive $100 in compensation or attend a Lumen orientation session and receive a Lumen PST-enabled iPad (which they could keep in lieu of the $100 compensation).

Randomization and masking

Participants were randomly assigned in a 2:1 ratio to receive the Lumen intervention or to be in a waitlist control group using a validated online system [21] based on Pocock’s covariate-adaptive minimization [22]. The 2:1 allocation allowed more participants to receive the Lumen intervention without substantially reducing statistical power [23]. Pocock’s minimization method was used to achieve better-than-chance marginal balance across multiple baseline characteristics: age, sex, race/ethnicity, education, PHQ-9 score, GAD-7 score, and Digital Health Literacy [24]. Investigators, the safety monitor, outcome assessors, and data analysts were blinded to participants’ treatment assignment.
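To illustrate how covariate-adaptive minimization with a 2:1 target can work, here is a simplified sketch. It is not the validated online system used in the study. The range-based imbalance measure, the `p_best` probability of taking the better-balancing arm, and the factor labels in the usage lines are all assumptions, and continuous covariates such as age would first need to be grouped into categories.

```python
import random

# Target allocation ratio from the text: 2 Lumen to 1 waitlist.
ARMS = {"lumen": 2, "waitlist": 1}


def imbalance_if_assigned(counts, levels, arm):
    """Total marginal imbalance if the new participant joined `arm`.

    counts[arm][(factor, level)] is the number of participants with that
    factor level already in each arm. Counts are divided by the allocation
    weight so that an exact 2:1 split scores zero.
    """
    total = 0.0
    for item in levels.items():
        scaled = []
        for name, weight in ARMS.items():
            n = counts[name].get(item, 0) + (1 if name == arm else 0)
            scaled.append(n / weight)
        total += max(scaled) - min(scaled)  # range used as the imbalance measure
    return total


def assign(counts, levels, rng, p_best=0.8):
    """Assign one participant and update `counts` in place."""
    scores = {arm: imbalance_if_assigned(counts, levels, arm) for arm in ARMS}
    best = min(scores, key=scores.get)
    if len(set(scores.values())) == 1:
        # Tie: fall back to simple 2:1 randomization.
        arm = rng.choices(list(ARMS), weights=list(ARMS.values()))[0]
    elif rng.random() < p_best:
        arm = best  # usually take the arm that minimizes imbalance
    else:
        arm = rng.choice([a for a in ARMS if a != best])
    for item in levels.items():
        counts[arm][item] = counts[arm].get(item, 0) + 1
    return arm


# Usage with hypothetical factor labels.
rng = random.Random(0)
counts = {arm: {} for arm in ARMS}
first = assign(counts, {"sex": "female", "phq9_band": "10-14"}, rng)
```

Keeping a random element (`p_best` below 1) means the next assignment cannot be predicted with certainty from the current counts.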


Lumen is a virtual voice-based coach developed on Amazon’s Alexa platform. Lumen delivers an evidence-based PST program [5, 6] consisting of eight sessions (four weekly, followed by four biweekly sessions) for patients with mild-to-moderate depression and/or anxiety. PST is patient-driven, with the coach acting as a guide to identify a problem, set a goal, brainstorm solutions, choose a solution, develop an action plan, and implement and evaluate the plan [25]. This stepwise approach makes PST appropriate for therapy delivery using a virtual voice-based coach.

Lumen was designed through an iterative user-centered process that involved software developers, interaction designers, psychiatrists, PST experts, and behavioral scientists. Several iterations of the prototype were internally tested; a fully functional prototype underwent feasibility and usability testing with 26 users [19]. The design was driven by two key principles: (a) aligning participants’ voice-based interaction with Lumen with the cognitive processes of human communicative interactions [26], and (b) configuring the content of the interactions according to the principles and process of evidence-based PST. Towards this end, the Lumen architecture included multiple interacting components that managed voice-based therapy delivery (a conversation manager) and ensured persistence and consistency across the eight therapy sessions (a context manager; see additional information in Supplementary Material sections B, C, and D; Figure S1, Table S1, and Table S2).

For this study, Lumen was integrated within the Alexa app on an iPad provided to all participants. Lumen participants attended an in-person orientation session with a trained health coach where they received their iPad and intervention workbook and completed a tutorial on how to interact with Lumen. Participants were instructed to begin their PST right away (within 1 week of their orientation session), and the health coach helped schedule their 4 weekly and subsequent 4 biweekly PST sessions. Within 3 days of their first scheduled PST session, the health coach called participants to ask about any technical issues and helped troubleshoot them. Participants received reminder text messages about their upcoming and overdue (if any) PST sessions. Participants whose sessions remained overdue after the reminders were called by the health coach and encouraged to complete their outstanding session(s). Participants could also reach out to the health coach if they faced any issues during the study.
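The session cadence described above (four weekly sessions, then four biweekly ones) can be sketched as a simple date calculation. The function name, the example start date, and the assumption that the first biweekly session falls two weeks after the fourth weekly session are illustrative and not taken from the study protocol.

```python
from datetime import date, timedelta

# Days from the first session: four weekly sessions, then four biweekly ones.
# Assumption: the first biweekly session falls two weeks after session 4.
SESSION_OFFSETS_DAYS = [0, 7, 14, 21, 35, 49, 63, 77]


def session_dates(first_session: date) -> list[date]:
    """Return the eight planned session dates for a given first session."""
    return [first_session + timedelta(days=d) for d in SESSION_OFFSETS_DAYS]


# Hypothetical first session date.
dates = session_dates(date(2021, 4, 12))
```

Under these assumptions the eighth session falls 11 weeks after the first, which leaves room before the 16-week assessment.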

For each session, participants launched Lumen PST through the Alexa app with a “Launch Lumen Coach” voice instruction and completed their assigned PST sessions. A typical Lumen session lasted ~12 min. Between sessions, participants completed surveys and ecological momentary assessments (EMAs, see Supplementary Materials, Section D, Table S2).

Waitlist control

Participants in the waitlist control arm received automated text messages to complete surveys and EMAs at intervals similar to the intervention arm. These participants could choose to receive a Lumen-enabled iPad after their end-of-study assessments at 16 weeks.

Neural target measures

Blinded outcome assessors conducted standardized assessments at baseline and 16 weeks. Task-based functional magnetic resonance imaging (fMRI) data were collected using previously established standardized fMRI sequences and parameters [27, 28] that inform transdiagnostic phenotypes of neural circuit dysfunction for depression and anxiety. These fMRI methods, including the facial expressions and Go-NoGo tasks, have been standardized in previous work designed for application to precision psychiatry and target engagement studies [29, 30]. Brief descriptions of these tasks are provided below, and additional details can be found in the Supplementary Materials (Section E).

Facial expressions task

A standardized set of 3D evoked facial expression stimuli was presented in pseudorandom order, with 5 repeated blocks of 8 stimuli per block for sad, fear, anger, and happy relative to neutral blocks [29]. Participants were instructed to continuously view the faces and were informed beforehand that they would be asked post-scan questions about the faces they were viewing. To assess amygdala activation for the negative affect circuit, our analysis focused on threatening faces only, given our prior research showing threat-related amygdala activation mediating the effect of in-person PST on depression and problem-solving outcomes [31]. Threat stimuli included a combination of fear and anger stimuli relative to neutral blocks. During the conscious viewing condition, each face was presented for 500 ms, with an interstimulus interval of 750 ms. To elicit the negative affect circuit in response to non-conscious threat stimuli, the same fear and anger stimuli were presented in a backward-masking design to prevent awareness. In this non-conscious condition, face stimuli were presented for 10 ms followed immediately by a neutral face mask stimulus for 150 ms, and with a stimulus onset asynchrony of 1250 ms to match that of the conscious condition [32].
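The timing figures above can be checked with a few lines of arithmetic. This sketch assumes that the stimulus onset asynchrony is the face duration plus the interstimulus interval, that nothing else is shown between the mask offset and the next face, and that the 5 blocks of 8 stimuli apply to each emotion separately; the variable names are illustrative.

```python
# Values in milliseconds, taken from the task description.
FACE_MS = 500      # conscious condition: face duration
ISI_MS = 750       # conscious condition: interstimulus interval
TARGET_MS = 10     # non-conscious condition: face duration
MASK_MS = 150      # non-conscious condition: neutral mask duration

BLOCKS_PER_EMOTION = 5
STIMULI_PER_BLOCK = 8

# Onset-to-onset time in the conscious condition.
soa_ms = FACE_MS + ISI_MS

# Time left after the mask if the non-conscious condition keeps the same SOA.
gap_after_mask_ms = soa_ms - (TARGET_MS + MASK_MS)

# Stimuli shown per emotion (assumption: 5 blocks x 8 stimuli per emotion).
stimuli_per_emotion = BLOCKS_PER_EMOTION * STIMULI_PER_BLOCK

print(soa_ms, gap_after_mask_ms, stimuli_per_emotion)
```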

Go-NoGo task

For the Go-NoGo paradigm, the ‘Go’ and ‘NoGo’ stimuli were presented for 500 ms each with an interstimulus interval of 750 ms. The Go-NoGo paradigm allowed for event-related analysis and is used to assess impulsivity (automatically generated ‘Go’ responses) versus inhibition (‘NoGo’ responses). In the ‘Go’ trials, participants were asked to press a button on GREEN stimuli as quickly as possible (with the word “press” displayed in green); in the ‘NoGo’ trials, participants were asked to withhold button presses on RED stimuli (with the word “press” displayed in red). The probability of ‘NoGo’ stimuli was 0.33. A total of 180 ‘Go’ and 60 ‘NoGo’ stimuli were presented in a pseudorandom order with a constraint to ensure that ‘NoGo’ stimuli were not repeated more than 3 times in a row. Reaction times and number of errors on task were used to evaluate task performance [29].
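One way to produce a trial order that satisfies the stated constraint is to shuffle and reject any order with more than three consecutive ‘NoGo’ trials. This is a sketch only: the study's actual sequence generator is not described, and the function names and seed are hypothetical. The trial counts come from the text; note that 60 of 240 trials is a proportion of 0.25, whereas 60/180 is 0.33, so the stated 0.33 may refer to the NoGo-to-Go ratio (an assumption).

```python
import random


def max_run(seq, label):
    """Length of the longest run of consecutive `label` entries in `seq`."""
    longest = current = 0
    for s in seq:
        current = current + 1 if s == label else 0
        longest = max(longest, current)
    return longest


def make_sequence(n_go=180, n_nogo=60, max_nogo_run=3, seed=None):
    """Shuffle until no more than `max_nogo_run` NoGo trials occur in a row."""
    rng = random.Random(seed)
    trials = ["Go"] * n_go + ["NoGo"] * n_nogo
    while True:
        rng.shuffle(trials)
        if max_run(trials, "NoGo") <= max_nogo_run:
            return list(trials)


sequence = make_sequence(seed=1)
```

Rejection sampling is practical here because a sizable share of random shuffles already satisfies the run-length limit.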

Informed by previous findings [6, 31] identifying neural targets engaged by in-person PST, the primary target regions of interest (ROIs) were the amygdala (bilaterally), a key node in the negative affect circuit, and the dorsolateral prefrontal cortex (dlPFC) (bilaterally), a key node in the cognitive control circuit. The negative affect circuit was engaged by the viewing of threat faces in the non-conscious viewing condition. The cognitive control circuit was engaged using the Go-NoGo task.

Person-level activation of the ROIs for each contrast of interest for each task (e.g., threat versus neutral faces, NoGo versus Go) was derived in a manner consistent with the methods used in prior studies [27].

Clinical outcome measures

On the Hospital Anxiety and Depression Scale (HADS) [33, 34], depression and anxiety symptom scores ranged from 0 to 21, with 0–7 indicating normal; 8–10 indicating borderline abnormal (borderline); and 11–21 indicating abnormal (case). HADS total scores were computed as the sum of depression and anxiety scores, indicating overall psychological distress.
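The cut-offs above map directly onto a small helper. The function names are illustrative; the bands and the total-score rule are the ones stated in the text.

```python
def hads_category(subscale_score: int) -> str:
    """Classify a HADS depression or anxiety subscale score (0-21)."""
    if not 0 <= subscale_score <= 21:
        raise ValueError("HADS subscale scores range from 0 to 21")
    if subscale_score <= 7:
        return "normal"
    if subscale_score <= 10:
        return "borderline"
    return "case"


def hads_total(depression: int, anxiety: int) -> int:
    """Overall psychological distress: sum of the two subscale scores."""
    return depression + anxiety
```

For instance, a depression score of 9 is classified as borderline, and subscale scores of 9 and 12 give a total of 21.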

Self-reported measures

Validated self-report surveys of PST theory-based constructs of emotion (affect, worry) and cognition (problem-solving, dysfunctional attitudes) were also completed at baseline and 16 weeks. The Positive and Negative Affect Schedule (PANAS) assessed positive and negative affect [35], with scores ranging from 10 to 50 and higher scores representing higher levels of positive or negative affect. Worry was measured using the Penn State Worry Questionnaire (PSWQ), with a higher total score indicating more worry (range 16–80) [36]. The Social Problem-Solving Inventory-Revised Short Form (SPSI-R:S) assessed total problem-solving ability, with a higher score indicating more productive problem-solving skills, and 5 subscales including problem orientation (positive, negative) and problem-solving styles (rational, impulsive/careless, and avoidant) [37]. Each subscale was scored by summing the respective 5 items (each from 0 to 4), and the total problem-solving ability score, computed by averaging the subscale scores, ranged from 0 to 20. The Dysfunctional Attitudes Scale (DAS) measured the presence and intensity of dysfunctional attitudes, with higher scores indicating more dysfunctional attitudes (range 40–280) [38].
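The SPSI-R:S scoring rule above (five 5-item subscales, each item 0 to 4, total as the mean of subscale scores) can be sketched as follows. The text does not say how the negative orientation, impulsive/careless, and avoidant subscales enter the total; this sketch assumes they are reversed (20 minus the subscale score) so that a higher total reflects more productive problem-solving. The subscale keys are illustrative names.

```python
# Subscale name -> whether it is reversed in the total (an assumption).
SUBSCALES = {
    "positive_orientation": False,
    "negative_orientation": True,
    "rational_style": False,
    "impulsive_careless_style": True,
    "avoidant_style": True,
}


def subscale_score(items):
    """Sum of the 5 items in a subscale, each scored 0-4 (range 0-20)."""
    if len(items) != 5 or any(not 0 <= i <= 4 for i in items):
        raise ValueError("each subscale needs 5 items scored 0 to 4")
    return sum(items)


def total_problem_solving(responses):
    """Mean of the five subscale scores, reversing the negative subscales."""
    parts = []
    for name, reverse in SUBSCALES.items():
        score = subscale_score(responses[name])
        parts.append(20 - score if reverse else score)
    return sum(parts) / len(parts)
```

With every item answered 2, each subscale scores 10 and the total is 10.0 regardless of the reversal assumption.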

Statistical analysis

The intervention vs. control effects on changes in neural targets and self-reported measures of emotional reactivity and cognitive control from baseline to 16 weeks were assessed using t tests. Correlations of changes in neural targets with changes in self-reported measures were estimated using Pearson’s correlation tests.
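As a rough illustration of these two analyses, the sketch below computes change scores, a two-sample t statistic, and Pearson's r using only the Python standard library. The study itself used SAS; the pooled-variance form of the t test is an assumption because the text does not specify the variant, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def change_scores(baseline, followup):
    """Per-participant change from baseline to follow-up."""
    return [f - b for b, f in zip(baseline, followup)]


def pooled_t(a, b):
    """Two-sample t statistic with pooled variance (assumed variant)."""
    n1, n2 = len(a), len(b)
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / sqrt(sp2 * (1 / n1 + 1 / n2))


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)
```

In use, `pooled_t` would be applied to the change scores of the two arms, and `pearson_r` to paired changes in a neural target and a self-reported measure.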

The intervention vs. control effects on changes in HADS scores from baseline to 16 weeks were tested using ordinary least squares regression with adjustment for baseline values of the outcome measure. Each model included all participants with follow-up data on the outcome at 16 weeks, and participants were analyzed based on the group to which they were assigned. Moderation analysis used the same models as above plus the main effect of each potential effect modifier (e.g., sex) and its interaction with the group; the latter, if significant, rejected the null hypothesis of no moderation. Model-adjusted between-treatment mean differences with 95% confidence intervals (CIs) for the overall sample and the subgroups defined by the effect modifiers were reported. Cohen’s d was calculated as the mean difference between the two groups divided by the pooled standard deviation.
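The two quantities described above can be sketched with the standard library. `cohens_d` follows the stated definition. `adjusted_difference` uses the closed form for the group coefficient in an ordinary least squares model with a group indicator and a baseline covariate (raw mean difference minus the pooled within-group slope times the baseline mean difference). This is an illustration with hypothetical function names, not the SAS code used in the study, and it omits confidence intervals.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(treated, control):
    """Mean difference divided by the pooled standard deviation."""
    n1, n2 = len(treated), len(control)
    sp2 = ((n1 - 1) * variance(treated) + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    return (mean(treated) - mean(control)) / sqrt(sp2)


def adjusted_difference(base_t, change_t, base_c, change_c):
    """Baseline-adjusted between-group difference in change scores."""

    def within_sums(x, y):
        mx, my = mean(x), mean(y)
        sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
        sxx = sum((a - mx) ** 2 for a in x)
        return sxy, sxx

    sxy_t, sxx_t = within_sums(base_t, change_t)
    sxy_c, sxx_c = within_sums(base_c, change_c)
    slope = (sxy_t + sxy_c) / (sxx_t + sxx_c)  # pooled within-group slope
    raw = mean(change_t) - mean(change_c)
    return raw - slope * (mean(base_t) - mean(base_c))
```

When the two arms have the same mean baseline score, the adjusted difference equals the raw difference in mean change.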

Given that this study was a pilot RCT, the primary purpose was to establish a reliable signal regarding the impact of Lumen on neural targets and clinical outcomes that would be promising enough to warrant further research. Towards this end, we used Cohen’s d ≥ 0.3 to define the meaningful mean difference between the intervention and control groups in neural target and symptom changes from baseline to 16 weeks. Moreover, our approach to data reporting and interpretation regarding the intervention effects on neural targets and symptom outcomes was focused on the magnitude and precision (95% CI) of the effect estimates, and not on p-values [5]. Similarly, we did not focus on smaller correlations (Pearson’s r < 0.4) between the neural targets and self-reported measures, as they would have limited clinical relevance.

All analyses were conducted using SAS, version 9.4 (SAS Institute Inc., Cary, North Carolina).

Sample size calculation

The sample size of this pilot RCT was calculated using a confidence interval approach. To obtain a precision interval with a standardized half-width of 0.50 (akin to a medium effect) with 90% assurance, we planned a sample size of 60 (nTreatment = 40, nControl = 20), assuming ≥85% retention at 16 weeks. Under this precision interval approach, we defined a meaningful improvement in outcomes (in both neural targets and symptoms) for the intervention group, compared with the waitlist control group, as a standardized between-group mean difference of at least Cohen’s d = 0.3 in favor of the intervention. At this effect size, the upper limit of the precision interval overlaps with d = 0.8 (large effect) given a standardized half-width of 0.5 with 90% assurance that the interval contains the true mean difference based on power analysis. For the correlation of change in neural targets with change in self-reported measures, a sample size of 51 (i.e., 60 × 85%) would be sufficient to detect a coefficient of r = 0.4 with 80% power and 2-sided α = 0.05.
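The correlation power statement can be checked approximately. The sketch below uses the Fisher z approximation, which is an assumption since the text does not state how power was computed, and it ignores the negligible contribution of the opposite tail; variable names are illustrative.

```python
from math import atanh, sqrt
from statistics import NormalDist

planned_n = 60
retention = 0.85
n_treatment = planned_n * 2 // 3          # 2:1 allocation
n_control = planned_n - n_treatment
completers = round(planned_n * retention)

r = 0.4
alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Fisher z: atanh(r) is approximately normal with SE 1 / sqrt(n - 3).
noncentrality = atanh(r) * sqrt(completers - 3)
power = 1 - NormalDist().cdf(z_crit - noncentrality)

print(n_treatment, n_control, completers, round(power, 2))
```

Under this approximation the power with 51 completers comes out slightly above the 80% stated in the text.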
