Demographic, radiographic, surgical, and endocrinological information is summarized in Supplementary Table 1. CAs (36.5%) were the most prevalent subtype, followed by SAs (24.7%) and LAs (24.5%). The mean age at diagnosis was 44.89 years (Standard Deviation (SD): ± 15.15), with 269 females (68.6%) and 123 males (31.4%). The sex distribution within the histological subgroups was as follows: for LAs 48 females (45.8%) and 52 males (54.2%), for CAs 119 females (83.2%) and 24 males (16.8%), for SAs 63 females (64.9%) and 32 males (35.1%), for MSAs 21 females (70%) and 9 males (30%), and for mixed cell GH/PRL adenomas 6 females (37.5%) and 10 males (62.5%). Macroadenomas comprised 248 (63.3%) of the cohort. A low Knosp grade (0–2) was present in 286 (73.0%) patients, and a high Knosp grade (≥ 3), suggestive of cavernous sinus invasion (CSI), in 106 (27.0%) patients. Extension into the suprasellar region, anterior fossa, and posterior fossa was seen in 145 (37.0%), 30 (7.7%), and 7 (1.8%) patients, respectively. All patients underwent TSS, 259 (66.1%) of them with an endoscopic technique. GTR was achieved in 284 (72.4%) patients. In terms of postoperative complications, 24 (6.1%) patients required repair of a CSF leak, 44 (11.2%) developed hypopituitarism, and 66 (16.9%) experienced DI.
Biochemical values are highlighted in Supplementary Table 1 and Fig. 2. For CAs, the median preoperative ACTH value was 74.00 pg/mL (Interquartile Range (IQR): 45.00–105.00 pg/mL) and the mean serum cortisol following 1 mg DST was 14.27 μg/dL (SD: ± 8.41 μg/dL). Postoperatively, the median POD1 ACTH value was 21.50 pg/mL (IQR: 13.00–28.00 pg/mL) and the median POD1 serum morning cortisol level was 5.00 μg/dL (IQR: 2.10–13.00 μg/dL). Of the patients with LAs, 69.8% (N = 68) received preoperative dopamine agonist therapy. For LAs, the median preoperative PRL index was 5.05 (IQR: 2.00–19.43) and the median POD1 PRL index was 0.91 (IQR: 0.303–3.700). For all GH-secreting adenomas, the perioperative somatotrophic hormone values were as follows: a mean preoperative IGF-1 index of 2.78 (SD: ± 1.04), a median POD1 GH of 1.43 ng/mL (IQR: 0.7250–3.215 ng/mL), and a median POM3 IGF-1 index of 1.09 (IQR: 0.780–1.700).
Defining predictive preoperative and postoperative biochemical lab values
Using ROC curves, we established biochemical cut-off values predictive of disease remission (Fig. 3). Preoperative IGF-1 index < 3.26 (Sensitivity (SENS) 0.833, Positive Likelihood Ratio (LR +) 1.46), POD1 GH < 1.37 ng/mL (SENS 0.813, LR + 7.46), and POM3 IGF-1 index < 1.26 (SENS 0.900, LR + 9.89) were established for all GH-secreting adenomas. Preoperative PRL index < 9.57 (SENS 0.845, LR + 2.14) and POD1 PRL index < 1.48 (SENS 0.897, LR + 4.875) were established for LAs. Finally, a POD1 ACTH < 25.5 pg/mL (SENS 0.838, LR + 2.52) and POD1 AM cortisol < 14 μg/dL (SENS 0.971, LR + 11.70) were established as biochemical cut-off values for CAs.
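As an illustration of how the sensitivity and positive likelihood ratio attached to a single cut-off can be computed, the following is a minimal sketch and not the analysis code used in this study. The function name, the toy POD1 GH values, and the outcome labels are all hypothetical; only the 1.37 ng/mL threshold is taken from the text. It assumes the rule "value below cut-off predicts remission" and that both outcome classes are present.

```python
def cutoff_metrics(values, remission, cutoff):
    """Sensitivity, specificity and LR+ for the rule 'value < cutoff predicts remission'."""
    tp = fn = tn = fp = 0
    for value, in_remission in zip(values, remission):
        predicted_remission = value < cutoff
        if in_remission and predicted_remission:
            tp += 1
        elif in_remission:
            fn += 1
        elif predicted_remission:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # LR+ = sensitivity / (1 - specificity); undefined (infinite) when specificity is 1.
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    return sensitivity, specificity, lr_pos


# Hypothetical POD1 GH values (ng/mL) and outcomes, for illustration only.
gh_values = [0.5, 0.9, 1.2, 2.0, 1.0, 1.8, 2.5, 3.1, 4.0, 0.8]
in_remission = [True] * 5 + [False] * 5

sens, spec, lr_pos = cutoff_metrics(gh_values, in_remission, cutoff=1.37)
print(f"SENS {sens:.3f}, SPEC {spec:.3f}, LR+ {lr_pos:.2f}")
```

Sweeping the cut-off over all observed values and plotting sensitivity against 1 − specificity would trace out the ROC curve from which a threshold of this kind is chosen.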
Factors predictive of postoperative disease remission
A total of 261 (66.6%) patients achieved postoperative disease remission. TAs had the highest rate of remission (N = 9, 90%), followed by CAs (N = 111, 77.6%), SAs (N = 60, 61.9%), LAs (N = 58, 60.4%), MSAs (N = 17, 56.7%), and mixed GH/PRL adenomas (N = 6, 37.5%). Univariate analysis revealed age, sex, adenoma diameter, macroadenomas, GTR, low Knosp grade (0–2), extracapsular resection, MSAs, mixed GH/PRL, and CAs to have an association with postoperative disease remission (Fig. 4A). These variables were incorporated into multivariate binary logistic regression (Fig. 4B). Multivariate analysis found male sex (Odds ratio [OR] 0.507, 95% Confidence Interval (CI) 0.310–0.829), increasing tumor size (OR 0.963, 95% CI 0.933–0.995), macroadenomas (OR 0.381, 95% CI 0.212–0.685), MSAs (OR 0.113, 95% CI 0.013–0.977), and mixed GH/PRL adenomas (OR 0.067, 95% CI 0.007–0.665) to be negative predictors of postoperative disease remission, whereas low Knosp grade (Knosp 0–2) (OR 1.74, 95% CI 1.015–2.996) and GTR (OR 4.245, 95% CI 2.525–7.137) were positive predictors. No other variables were associated with postoperative disease remission in this group of patients.
In adenoma multivariate subgroup analysis, we did not identify demographic, radiographic, or operative variables predictive of postoperative disease remission (Fig. 4A and B). However, we did identify strong pre- and postoperative biochemical predictors: for CAs POD1 serum ACTH < 25.5 pg/mL (OR 7.647, 95% CI 2.461–23.763) and POD1 morning serum cortisol < 14 μg/dL (OR 14.577, 95% CI 4.025–52.790), for LAs preoperative PRL index < 9.57 (OR 5.079, 95% CI 1.212–21.279) and POD1 PRL index < 1.48 (OR 41.824, 95% CI 11.719–149.267), and for all GH-secreting adenomas POM3 IGF-1 < 1.34 (OR 95.615, 95% CI 18.928–483.010) and POD1 GH < 1.37 ng/mL (OR 31.9, 95% CI 6.298–161.585). Additionally, we performed a separate multivariate subgroup analysis replacing the dichotomous biochemical cut-off predictors with their linearly scaled continuous variables (Supplementary Table 2). Similarly to the dichotomous biochemical cut-off predictors, we found the following to be negative predictors of postoperative disease remission: for CAs increasing POD1 serum ACTH (OR 0.916, 95% CI 0.868–0.967) and POD1 morning cortisol (OR 0.923, 95% CI 0.866–0.984), for LAs increasing preoperative PRL index (OR 0.944, 95% CI 0.902–0.987) and POD1 PRL index (OR 0.452, 95% CI 0.106–0.508), and for all GH-secreting adenomas increasing POM3 IGF-1 index (OR 0.01, 95% CI 0.001–0.099) and POD1 GH (OR 0.232, 95% CI 0.106–0.508).
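To aid interpretation of the odds ratios reported above, the sketch below shows two standard relationships for logistic regression output. It rests on two assumptions that are not stated in the text: that the confidence intervals are Wald-type intervals (symmetric on the log-odds scale), and that each continuous OR is expressed per one unit of the laboratory value. The function names are hypothetical.

```python
import math


def or_implied_by_ci(ci_low, ci_high):
    """Point estimate implied by a Wald CI: the geometric mean of its bounds."""
    return math.sqrt(ci_low * ci_high)


def or_for_increase(or_per_unit, delta):
    """Odds ratio for a change of `delta` units, given the per-unit OR.

    On the log-odds scale the effect is linear, so the ORs multiply.
    """
    return or_per_unit ** delta


# Male sex: the reported CI 0.310-0.829 is centred (on the log scale) near the reported OR of 0.507.
male_sex_or = or_implied_by_ci(0.310, 0.829)

# POD1 ACTH: reported OR 0.916; assuming this is per 1 pg/mL, a 10 pg/mL
# higher value corresponds to an OR of roughly 0.42 for remission.
acth_or_10 = or_for_increase(0.916, 10)

print(f"{male_sex_or:.3f} {acth_or_10:.3f}")
```

Under these assumptions, a per-unit OR that looks close to 1 can still correspond to a large effect across the clinically observed range of the laboratory value.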
Variables found to be statistically significant on multivariate analyses were used to create our novel Pit-SCHEME (Sex, Cavernous Sinus Invasion, Histology, Extent of Resection, Macro/Microadenoma, Endocrinological values) score to predict postoperative disease remission (Table 1). The score includes categories common to all histological subtypes of functional pituitary adenomas (sex, adenoma size, extent of resection, Knosp grade, histology). In other words, all patients are assigned a score in each of these categories. To account for the unique endocrinopathies associated with each histological subtype (except for TAs), patients can be assigned additional points based on biochemical lab values related to their histological diagnosis. When devising the numerical portion of the score (i.e., point assignment), we aimed to create a scoring system that is not only accurate but also intuitive and user-friendly in the clinical setting. Thus, we designed the score to have limited variation in point assignments for each subcomponent (0 or 2, or 0 or 1) with a total score range of 0–10 (Table 1). For each category, we approached point assignment holistically, considering the magnitude of the odds ratios, the percentage of our cohort with the predictive factor that achieved disease remission, and the preexisting literature demonstrating the utility of the factor in predicting postoperative disease remission. The extent of resection, radiographic evidence of cavernous sinus invasion, and macro/microadenoma categories were shown to have the highest and lowest OR magnitudes (4.245, 1.740, 0.381, respectively). Therefore, these categories are awarded either zero points (Knosp grade ≥ 3, macroadenoma, STR) or two points (Knosp grade ≤ 2, microadenoma, GTR) (Table 1). In terms of patient sex, the score awards one point for females and zero points for males (Table 1).
The subcomponent score for the sex category was based on the odds ratio for male sex (OR = 0.507) being closer to 1 than that of the other negative predictor (macroadenoma, OR = 0.381), and on sex having relatively less support as a predictor of postoperative disease remission in the preexisting literature when compared to variables like Knosp grade, adenoma size, and extent of resection [12,13,14,15,16]. In the histology category the following subtypes are represented: CA, LA, TA, SA, MSA, and mixed cell GH/PRL adenomas. Only the GH and PRL co-secreting subtypes (MSA and mixed cell GH/PRL adenomas) were statistically significant in multivariate analysis (Fig. 4), and both displayed ORs < 1 (i.e., they are negative predictors of postoperative disease remission). Given this, all GH and PRL co-secreting adenomas are given a score of 0 (Table 1). The other histological subtypes (CA, LA, TA, and SA) achieved greater than 60% remission; however, none reached statistical significance. Thus, we decided to score these histological subtypes equally; that is, they are all awarded one point in the Pit-SCHEME score (Table 1). The first five categories featuring common variables (sex, cavernous sinus invasion, histology, extent of resection, and microadenoma/macroadenoma) can produce a score ranging from 0 to 8. The last section of the Pit-SCHEME score involves endocrinological laboratory values and is segmented into three subcategories: lactotrophs, corticotrophs, and all GH-secreting adenomas (Table 1). Each subcategory contains two individual biochemical levels identified as predictive factors for postoperative disease remission (Fig. 4). A patient is awarded one point for each biochemical level that falls below its cut-off (Table 1). Thus, according to individualized biochemical levels, a patient can receive an additional 0–2 points. When all the categories are combined, the Pit-SCHEME score has a range of 0–10 points (Table 1).
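The point assignment described above can be sketched as a short function. This is an illustrative reading of the prose description, not a validated implementation of Table 1; the function and argument names are hypothetical. The biochemical cut-offs are passed in by the caller as (value, cut-off) pairs rather than hard-coded, and TA patients, who receive no endocrinological points, are represented by an empty list.

```python
def count_below_cutoff(lab_pairs):
    """Number of (value, cutoff) pairs in which the value lies below its cutoff."""
    return sum(1 for value, cutoff in lab_pairs if value < cutoff)


def pit_scheme_score(female, knosp_grade, gh_prl_cosecreting, gtr,
                     microadenoma, lab_pairs=()):
    """Pit-SCHEME score (0-10) as described in the text.

    lab_pairs: up to two (value, cutoff) pairs for the subtype-specific
    biochemical levels; leave empty for TAs.
    """
    if len(lab_pairs) > 2:
        raise ValueError("at most two biochemical levels are scored")
    score = 0
    score += 1 if female else 0               # Sex
    score += 2 if knosp_grade <= 2 else 0     # Cavernous sinus invasion (Knosp grade)
    score += 0 if gh_prl_cosecreting else 1   # Histology (MSA or mixed GH/PRL score 0)
    score += 2 if gtr else 0                  # Extent of resection
    score += 2 if microadenoma else 0         # Macro/microadenoma
    score += count_below_cutoff(lab_pairs)    # Endocrinological values
    return score


# Hypothetical patient: female, Knosp grade 1, corticotroph macroadenoma, GTR,
# POD1 ACTH 20 pg/mL (cut-off 25.5) and POD1 morning cortisol 16 ug/dL (cut-off 14).
example = pit_scheme_score(
    female=True, knosp_grade=1, gh_prl_cosecreting=False, gtr=True,
    microadenoma=False, lab_pairs=[(20.0, 25.5), (16.0, 14.0)],
)
print(example)  # 1 + 2 + 1 + 2 + 0 + 1 = 7
```

Passing the cut-offs as arguments keeps the sketch independent of any particular threshold value and makes each awarded point traceable to its category.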
When evaluating for multicollinearity between the individual subcomponents and the Pit-SCHEME score, all variables displayed an independent relationship (meeting the cut-off thresholds of tolerance > 0.1 and VIF < 10). The Pit-SCHEME score achieved an AUROC of 0.858 (95% CI 0.820–0.895) for all patients in our cohort. In terms of a predictive Pit-SCHEME cut-off score, we suggest a score ≥ 6 to be predictive of disease remission (SENS: 0.858, LR + 2.88) (Fig. 5). In the current cohort, patients with a Pit-SCHEME score ≥ 6 had a remission rate of 85.2%.
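For readers less familiar with these metrics, the sketch below illustrates two of them. The AUROC is computed as the probability that a randomly chosen patient in remission has a higher score than a randomly chosen patient not in remission (ties counted as one half), and the specificity implied by a reported sensitivity and LR+ is recovered by rearranging LR+ = SENS / (1 − SPEC). The toy scores are hypothetical, the function names are illustrative, and this is not the software used for the analysis.

```python
def auroc(scores_remission, scores_no_remission):
    """AUROC as the fraction of (remission, no-remission) pairs ranked correctly."""
    wins = 0.0
    for r in scores_remission:
        for n in scores_no_remission:
            if r > n:
                wins += 1.0
            elif r == n:
                wins += 0.5  # ties contribute one half
    return wins / (len(scores_remission) * len(scores_no_remission))


def implied_specificity(sensitivity, lr_pos):
    """Specificity implied by a sensitivity and positive likelihood ratio."""
    return 1 - sensitivity / lr_pos


# Hypothetical Pit-SCHEME scores, for illustration only.
toy_auroc = auroc([8, 7, 6, 9], [4, 6, 5, 3])

# Values reported in the text for a Pit-SCHEME score >= 6.
spec_at_6 = implied_specificity(0.858, 2.88)

print(f"{toy_auroc:.3f} {spec_at_6:.3f}")
```

With the reported sensitivity of 0.858 and LR+ of 2.88, the implied specificity at a cut-off of ≥ 6 is approximately 0.70.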
Supervised machine learning
After training and cross-validation, the six SML models were tested on an independent testing set (Fig. 6 and Table 2). Without the Pit-SCHEME score, the CART and RF models achieved the highest accuracy at 72% (95% CI 60–81). The naïve Bayes model displayed the greatest sensitivity at 65% (95% CI 44–83); CART, kNN, and RF had the highest specificity at 79% (95% CI 65–89). Model performance increased with the inclusion of the Pit-SCHEME score, with the RF model achieving the highest AUROC, accuracy, and sensitivity at 0.970, 85% (95% CI 75–92), and 78% (95% CI 58–91), respectively. For all models, the inclusion of the Pit-SCHEME score improved model specificity. Without the Pit-SCHEME score, the variable of most importance in the models was adenoma size. Once the Pit-SCHEME score was added, it became the variable of most importance in our supervised machine learning models. Finally, a statistically significant difference in model prediction accuracy (without vs. with the Pit-SCHEME score) was observed for the LDA (p = 0.0002), CART (p = 0.006), SVM (p < 0.0001), and RF (p < 0.0001) models.