Patient cohort and dataset
A total of 114 images of paired pre-SVR and post-SVR biopsies were collected from 57 patients. Of the original 64 participants, six were disqualified due to incomplete information, and one more was excluded due to virus genotype mutation and absence of SVR. Clinical blood analyses confirmed that all remaining patients had achieved an SVR and had completely eradicated the virus. The 29 male and 28 female patients were between 18 and 61 years of age (mean age, 44.27 ± 10.99 years). Histological fibrosis stages for both the pre-SVR and post-SVR biopsies of each patient were reviewed independently, blindly and in parallel by a panel of pathologists using Masson trichrome stained slides according to the Ishak scoring system [40]. The paired Ishak scores (pre-SVR and post-SVR) serve as the reference standard for the fibrosis assessment of biopsy specimens. Subsequently, each patient’s fibrosis status was classified into one of two groups: (i) “Reversible”: fibrosis is reversed, i.e. the Ishak score decreased; and (ii) “Irreversible”: the degree of fibrosis and the Ishak score remained the same or increased.
According to the independent and blinded pathological assessment of fibrosis stage, 21 patients (13 males and 8 females) improved post-SVR (“Reversible” group), and 36 patients (16 males and 20 females) showed unchanged or worsened fibrosis (“Irreversible” group).
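The grouping rule above can be sketched in a few lines. This is a minimal illustration, assuming integer Ishak stages from 0 to 6; the function name and the example score pairs are hypothetical and are not study data.

```python
from collections import Counter

def classify_reversibility(pre_ishak: int, post_ishak: int) -> str:
    """Label one patient from paired Ishak stages (assumed range 0-6)."""
    for stage in (pre_ishak, post_ishak):
        if not 0 <= stage <= 6:
            raise ValueError(f"Ishak stage out of range: {stage}")
    # Reversible only when the score decreased; same or higher is Irreversible.
    return "Reversible" if post_ishak < pre_ishak else "Irreversible"

# Hypothetical (pre-SVR, post-SVR) pairs, for illustration only.
pairs = [(4, 2), (3, 3), (2, 4), (5, 4)]
counts = Counter(classify_reversibility(pre, post) for pre, post in pairs)
print(counts)
```

Note that an unchanged score falls into the "Irreversible" group, which matches the definition used for the cohort.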
Collagen classification into two modes: Aggregated Thick Collagen and Dispersed Thin Collagen
It is a traditional oversimplification to characterize collagen remodeling as a simple increase or decrease in total collagen content. Novel insight into the differences in collagen structures between various regions is crucial to understanding the mechanism of liver fibrosis reversibility. Due to technical limitations, the degree/extent of collagen aggregation has historically been absent from fibrosis studies. We provide a novel computational solution based on fiber SHG signal intensity, texture and morphology to classify the collagen compartments into two distinct modes: Aggregated Thick Collagen (ATC) and Dispersed Thin Collagen (DTC). The original TPE (red channel) and SHG (green channel) signals of a representative image are shown in Fig. 2A. The ATC area is the region containing highly concentrated and aggregated collagen fibers, indicated by the yellow arrow in Fig. 2A. Our classification solution distinguishes the ATC area successfully, as shown by the yellow arrow in Fig. 2B. On the other side of the image, the collagen is evenly and sparsely distributed in the DTC area, which is highlighted by the white arrow in Fig. 2A,B.
Image analysis and feature extraction
The extracted features of a smaller piece of tissue within the blue square in Fig. 2A are magnified and shown in Fig. 2C–E. Based on a two-mode Gaussian Mixture Model (GMM), we detect the area occupied by collagen and preserve its connectivity, as shown in Fig. 2C. Collagen Area Ratio (CAR) was measured as the ratio of the collagen binary image area within the red and blue regions (ATC and DTC areas, respectively, Fig. 2B) to the total collagen area (ATC CAR and DTC CAR, respectively). Compared with DTC areas, collagen often occupies proportionately more area in ATC areas, where it is more aggregated and denser. According to our data, ATC CAR is often > 80% while DTC CAR is generally < 20%.
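To illustrate the two-mode GMM step, the sketch below fits a one-dimensional, two-component Gaussian mixture to SHG pixel intensities by expectation-maximization and labels the brighter mode as collagen. It is an assumption-laden sketch, not the pipeline used in the study: the function names, the intensity-only model and the synthetic image are hypothetical, and the connectivity-preserving step is not reproduced.

```python
import numpy as np

def component_densities(x, weights, means, variances):
    """Weighted Gaussian density of every sample under each of the two modes."""
    diff = x[:, None] - means
    return weights * np.exp(-0.5 * diff**2 / variances) / np.sqrt(2 * np.pi * variances)

def fit_two_mode_gmm(x, n_iter=50):
    """Fit a 1D two-component mixture by EM; returns (weights, means, variances)."""
    x = np.asarray(x, dtype=float)
    weights = np.array([0.5, 0.5])
    means = np.array([x.min(), x.max()])       # start the modes at the extremes
    variances = np.full(2, x.var() + 1e-6)
    for _ in range(n_iter):
        # E-step: responsibility of each mode for each pixel
        dens = component_densities(x, weights, means, variances)
        resp = dens / (dens.sum(axis=1, keepdims=True) + 1e-300)
        # M-step: update mixture parameters from the responsibilities
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / x.size
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-6
    return weights, means, variances

def segment_collagen(shg):
    """Binary collagen mask: pixels assigned to the brighter of the two modes."""
    flat = np.asarray(shg, dtype=float).ravel()
    weights, means, variances = fit_two_mode_gmm(flat)
    dens = component_densities(flat, weights, means, variances)
    bright = int(np.argmax(means))
    return (np.argmax(dens, axis=1) == bright).reshape(np.shape(shg))

# Synthetic SHG image: dim background plus one bright rectangular "fibre" patch.
rng = np.random.default_rng(0)
shg = rng.normal(10.0, 2.0, size=(40, 40))
shg[10:20, 5:25] += 90.0
mask = segment_collagen(shg)
print(mask.sum(), "collagen pixels detected")
```

In practice a library implementation such as a Gaussian mixture estimator from scikit-learn would likely replace the hand-written EM loop; the explicit version is shown only to make the two modes visible.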
In chronic HCV, the fibrosis process is portal based. It progresses from portal fibrosis through periportal fibrosis and bridging septa between portal-portal, portal-central and central-central regions to, ultimately, cirrhosis. Portal tracts with their original native collagen are incorporated into most or at least many ATC areas. Skeletonization was applied to the collagen binary image in order to identify the central line of the fiber network, as shown by the red solid line in Fig. 2D. A similar approach was also applied to the TPE channel, and the results are shown in Fig. 2E. The structural features, such as Collagen Fiber Length (CFL), illustrated in Fig. 2F, and Collagen Fiber Thickness (CFT), illustrated in Fig. 2G, were calculated accordingly. The junction points of the collagen network were then identified, shown as the yellow dots in Fig. 2H. The density of the junction points quantifies the complexity of the collagen or tissue networks. Supp. Table 1 provides a non-comprehensive list of the key features we extracted together with their definitions. As with ATC CAR and DTC CAR, all other extracted features are calculated within the ATC area and DTC area separately. For example, CFL yields ATC CFL and DTC CFL.
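One common way to locate junction points on an already skeletonized, one-pixel-wide binary image is the crossing number: the number of background-to-fibre transitions met while walking once around a pixel's eight neighbours. The sketch below uses that heuristic as an assumption; the source does not state which junction detector was used, and the function names and the density normalization (junctions per pixel of image area) are hypothetical.

```python
import numpy as np

# The 8 neighbours of a pixel, listed clockwise starting from north.
RING = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]

def crossing_number(skel, r, c):
    """Count background-to-fibre transitions while walking once around (r, c)."""
    h, w = skel.shape
    vals = [
        0 <= r + dr < h and 0 <= c + dc < w and bool(skel[r + dr, c + dc])
        for dr, dc in RING
    ]
    return sum((not vals[i]) and vals[(i + 1) % 8] for i in range(8))

def skeleton_nodes(skel):
    """Return (junctions, end_points) of a one-pixel-wide skeleton."""
    junctions, end_points = [], []
    for r, c in zip(*np.nonzero(skel)):
        n = crossing_number(skel, int(r), int(c))
        if n >= 3:                 # three or more branches meet here
            junctions.append((int(r), int(c)))
        elif n == 1:               # a fibre terminates here
            end_points.append((int(r), int(c)))
    return junctions, end_points

# Toy skeleton: a plus sign with a single crossing in the middle.
skel = np.zeros((5, 5), dtype=bool)
skel[2, :] = True
skel[:, 2] = True
junctions, end_points = skeleton_nodes(skel)
junction_density = len(junctions) / skel.size   # junctions per pixel of image area
skeleton_length_px = int(skel.sum())            # crude proxy for total fibre length
print(junctions, end_points, junction_density, skeleton_length_px)
```

Counting raw neighbours instead of transitions would flag the pixels adjacent to the crossing as junctions too, which is why the crossing number is used here.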
Evaluation of quantitative collagen structural features
We performed univariate analyses for feature selection, using t-tests to assess the prognostic value of each individual collagen feature for HCV fibrosis reversibility. The ATC Area Ratio quantifies the fraction of a given tissue that is occupied by aggregated collagen regions, defined as the ratio between the area of the red region in Fig. 2B and the total tissue area (the combination of red and blue regions). ATC Area Ratio is distinct from ATC CAR, since the ATC area is generally not fully occupied by collagen compartments, whereas ATC CAR is measured only within collagen areas.
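The distinction between ATC Area Ratio and ATC CAR is easiest to see on masks. The sketch below assumes three boolean masks of equal shape (ATC region, DTC region and the collagen binary image); the function names and the toy masks are hypothetical.

```python
import numpy as np

def atc_area_ratio(atc_region, dtc_region):
    """ATC region area divided by total tissue area (ATC plus DTC regions)."""
    atc = np.count_nonzero(atc_region)
    dtc = np.count_nonzero(dtc_region)
    return atc / (atc + dtc)

def collagen_area_ratio(collagen, region):
    """Collagen pixels inside a region divided by all collagen pixels (CAR)."""
    collagen = np.asarray(collagen, dtype=bool)
    region = np.asarray(region, dtype=bool)
    return np.count_nonzero(collagen & region) / np.count_nonzero(collagen)

# Toy 10 x 10 tissue: the left three columns are ATC, the rest is DTC.
atc_region = np.zeros((10, 10), dtype=bool)
atc_region[:, :3] = True
dtc_region = ~atc_region

# Collagen fills most of the ATC region and only a few DTC pixels.
collagen = np.zeros((10, 10), dtype=bool)
collagen[:8, :3] = True      # 24 collagen pixels in ATC
collagen[0, 3:9] = True      # 6 collagen pixels in DTC

print("ATC Area Ratio:", atc_area_ratio(atc_region, dtc_region))
print("ATC CAR:", collagen_area_ratio(collagen, atc_region))
print("DTC CAR:", collagen_area_ratio(collagen, dtc_region))
```

Here the ATC region covers 30% of the tissue yet holds 80% of the collagen, so the two measures can differ widely on the same image.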
Our data show that ATC Area Ratio has almost no correlation with Ishak scores for the pre-SVR and post-SVR biopsies in the reversible group (R = 0.152 and R = 0.133, respectively; Supp. Fig. 1A,C). However, for the irreversible group, there is a stronger correlation between ATC Area Ratio and Ishak scores of the pre-SVR and post-SVR biopsies (R = 0.688 and R = 0.477, respectively; Supp. Fig. 1B,D). This indicates that collagen aggregation plays a key role in fibrosis reversibility. Although the ATC Area Ratio does not differ significantly between the reversible and irreversible cases pre-SVR, it differs significantly between the two groups of patients post-SVR (Fig. 3A). This result indicates that the ATC Area Ratio is reduced after SVR in both reversible and irreversible cases, but more markedly in the reversible group, supporting the view that collagen aggregation is one of the key factors that affect fibrosis reversibility.
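The R values reported above are correlation coefficients between a collagen feature and the Ishak score. A minimal sketch of the computation follows, assuming R denotes the Pearson coefficient (the source does not name the estimator); the function name and the example values are hypothetical and are not study data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical Ishak stages and ATC Area Ratios for six biopsies.
ishak = [1, 2, 3, 4, 5, 6]
atc_area = [0.10, 0.18, 0.22, 0.35, 0.41, 0.55]
print(round(pearson_r(ishak, atc_area), 3))
```

Because Ishak stages are ordinal, a rank-based coefficient such as Spearman's could also be argued for; that choice is left open here.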
Figure 3. Evaluation of the extracted quantitative collagen structural features between the reversible and irreversible patient groups. (A) No statistically significant difference in ATC Area Ratio is observed between the reversible and irreversible groups pre-SVR. Post-SVR, the ATC Area Ratio of both groups decreased and a significant difference is observed. (B) Four selected features with statistical significance, namely ATC CAR, ATC CFT, ATC TRI and DTC CFS, between the reversible (green line) and irreversible (red line) patient groups pre-SVR. (C) The profiles defined by these four features for the reversible (green line) and irreversible (red line) groups post-SVR.
Next, we analyzed the quantitative structural features extracted from ATC and DTC regions. Four features (ATC CAR, ATC CFT, ATC TRI and DTC CFS) demonstrated significant differences associated with the reversibility of HCV fibrosis (Fig. 3B). Collagen Area Ratio (CAR) is one of the fundamental features measuring the abundance of collagen in the liver tissue. From a practical standpoint, portal-based, aggregated fibrosis with septa formation is the characteristic pattern of fibrosis in chronic HCV livers; hence, a change in ATC CAR was to be expected. However, perisinusoidal and dispersed fibrosis is usually not an issue in HCV cases, unlike in NAFLD/NASH cases; hence, no significant change in DTC CAR was anticipated. Indeed, ATC CAR showed a significant difference between reversible and irreversible cases (p = 0.006) (Fig. 3B), while DTC CAR did not (data not shown). The ATC CAR values are correlated with Ishak scores for both reversible (R = 0.509) and irreversible cases (R = 0.654) pre-SVR (Supp. Fig. 2A,B), while the correlation coefficient decreases for both groups post-SVR (Supp. Fig. 2C,D). This indicates that ATC CAR has predictive value for the prognosis of liver fibrosis reversibility. The risk of irreversibility is higher when a patient has more collagen in the aggregated mode, i.e. when ATC CAR is higher.
In the ATC regions, Collagen Fiber Thickness (ATC CFT), a measure of the girth of each collagen fiber, is also higher in the irreversible than the reversible cases (Fig. 3B) (p = 0.036). This result is consistent with the higher ATC CAR found in irreversible cases; it is intuitive that if the collagen fiber is thicker, it tends to occupy more area. We did not observe any significant difference between the two groups in DTC CAR; however, we noticed that the collagen fibers in the DTC mode are straighter, i.e. DTC collagen fiber straightness (DTC CFS) is higher, in the irreversible HCV group compared with the reversible group (Fig. 3B) (p = 0.050). In general, thicker collagen fiber in the ATC region and straighter collagen in the DTC suggest less likelihood of fibrosis reversibility. This structural difference in these different collagen modes implies that fiber packing is different in reversible and irreversible cases.
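For intuition about a straightness measure, the sketch below uses one common definition: the end-to-end distance of a traced fibre divided by its path length, so that a perfectly straight fibre scores 1. This definition is an assumption; the study's own definition of CFS is given in Supp. Table 1 and may differ. The function names and the example coordinates are hypothetical.

```python
import math

def path_length(points):
    """Sum of segment lengths along an ordered list of (x, y) points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def straightness(points):
    """End-to-end distance divided by path length (assumed definition)."""
    length = path_length(points)
    if length == 0:
        raise ValueError("fibre has zero length")
    return math.dist(points[0], points[-1]) / length

straight_fibre = [(0, 0), (2, 0), (5, 0)]
bent_fibre = [(0, 0), (3, 0), (3, 4)]
print(straightness(straight_fibre), straightness(bent_fibre))
```

Under this definition a higher value means a straighter fibre, which is the direction of the DTC CFS difference described above.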
One feature extracted from the TPE channel that differs significantly between the two groups is shown in Fig. 3B. In the ATC area, we found that the Tissue Reticular Index (ATC TRI), a measurement of tissue structural complexity, is significantly higher in the irreversible group’s samples, suggesting that tissue structures form a more complex network in this group. The increase in ATC CAR is associated with higher ATC TRI, although the mechanism is not clear at this juncture. We next explored the change of these four features post-SVR (Fig. 3C). After SVR, ATC TRI and DTC CFS both lose their significance; however, the between-group differences in ATC CAR and ATC CFT become more pronounced, i.e. their p-values decrease further. For example, the p-value of ATC CAR decreased from p = 0.006 to p = 0.001. We conclude that the reversible and irreversible groups differ significantly pre-SVR (Fig. 3B), and that their responses in collagen structure post-SVR are also distinct (Fig. 3C).
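The univariate comparison behind these p-values can be sketched as a two-sample t-test. The example below computes Welch's t statistic and its approximate degrees of freedom using only the standard library; the unequal-variance (Welch) form is an assumption, since the source does not state which t-test variant was used, and the function name and sample values are hypothetical.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)     # squared standard error of group a
    vb = variance(b) / len(b)     # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))
    return t, df

# Hypothetical feature values for two small groups (not study data).
reversible = [1.0, 2.0, 3.0, 4.0, 5.0]
irreversible = [2.0, 3.0, 4.0, 5.0, 6.0]
t_stat, dof = welch_t(reversible, irreversible)
print(t_stat, dof)
```

Turning the statistic into a p-value requires the t distribution; SciPy's `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and returns the p-value directly.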
Relative changes of quantitative collagen and tissue structural features pre-SVR and post-SVR
To further understand the relative changes between the two patient groups pre-SVR and post-SVR, we present a comparison of our data in Fig. 4. Although there are some structural differences for the reversible group, the differences are not statistically significant except for ATC CAR (Fig. 4A). The small differences in the four features pre-SVR and post-SVR indicate that the fibrosis of the reversible group is slightly improved and does not worsen. The irreversible group’s results pre-SVR and post-SVR are presented in Fig. 4B. The fiber girth, as measured by ATC CFT, increases clearly and significantly post-SVR. Another feature that changes significantly in the irreversible group is the structural complexity, as measured by ATC TRI. The decrease of ATC TRI might be associated with the increase of ATC CFT, since more area is occupied by the collagen, which may cause a simpler network of the cellular tissue, or vice versa. However, we did not observe the same result in the reversible group (Fig. 4A). Further investigation is required to establish the physiological mechanisms behind ATC TRI decreasing and ATC CFT increasing in the irreversible group.
Figure 4. The relative changes of collagen features in the reversible and irreversible patient groups pre-SVR and post-SVR. (A) The features of the reversible patients improved slightly, i.e. the profile shrank from the blue line (pre-SVR) to the yellow line (post-SVR), especially for ATC CAR, which shows statistical significance. (B) The irreversible group’s profile is noticeably different from that of the reversible group. The ATC CFT increased and the ATC TRI decreased post-SVR with statistical significance. The decrease of ATC TRI is potentially associated with the increase of ATC CFT, since the larger area occupied by the collagen might lead to a simpler network of the cellular tissue. (C) The absolute differences of the reversible (green line) and irreversible (red line) groups’ profiles pre-SVR and post-SVR. (D) The relative differences in percentage of the reversible (green line) and irreversible (red line) groups’ profiles pre-SVR and post-SVR.
The absolute differences and relative changes in percentage of those four features pre-SVR and post-SVR are presented in Fig. 4C,D. In general, for the reversible group, all four features remain the same or are slightly improved, as shown by the green line in Fig. 4C,D, compared with their baseline values (black line), while for the irreversible group, the ATC CFT has clearly increased, which may be associated with the unknown mechanism of fibrosis reversibility.
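The absolute and percentage changes plotted for each feature reduce to simple arithmetic on the pre-SVR and post-SVR values. A minimal sketch follows, assuming the relative change is expressed against the pre-SVR baseline; the function name and the feature values are hypothetical and are not study data.

```python
def profile_change(pre, post):
    """Absolute and relative (percent of baseline) change for each feature."""
    changes = {}
    for name, before in pre.items():
        after = post[name]
        changes[name] = {
            "absolute": after - before,
            "relative_pct": 100.0 * (after - before) / before,
        }
    return changes

# Hypothetical group-mean feature values before and after SVR.
pre_svr = {"ATC CAR": 0.80, "ATC CFT": 4.0}
post_svr = {"ATC CAR": 0.72, "ATC CFT": 4.5}
changes = profile_change(pre_svr, post_svr)
for name, change in changes.items():
    print(name, round(change["absolute"], 3), round(change["relative_pct"], 1))
```

A negative value indicates that the feature shrank relative to baseline, and a positive value that it grew.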
Prognostic features selection and the optimization of predictive models
Based on our patient cohort and collagen structural analysis, the above four quantitative image-based collagen structural features show promising prognostic value for HCV-induced liver fibrosis reversibility. It is critical to understand the dependence among the features in order to build an optimized predictive model. In the ATC area, there are two significant features: collagen area ratio (ATC CAR) and collagen fiber thickness (ATC CFT). In general, higher ATC CAR and ATC CFT indicate a higher risk of irreversibility even after SVR. However, ATC CAR and ATC CFT are not completely independent. As shown in Supp. Fig. 3A, these two features are strongly and linearly correlated, as indicated by the black dotted line. The increase in ATC CFT post-SVR reveals that collagen fibers became thicker in the irreversible group, whereas ATC CFT remains almost the same for the reversible group. This implies that the collagen packing pattern is different in the two groups. In the DTC area, the collagen fiber straightness (DTC CFS) has significant predictive value for fibrosis reversibility. The structural complexity measurement (ATC TRI), extracted from the two-photon excitation (TPE) channel, is also important. Because its mechanism needs further exploration, we did not include it in our predictive model, although it is worth further investigation.
The three collagen features, i.e. ATC CAR, ATC CFT and DTC CFS, have the potential to support a practical predictive model for clinical use. Based on our data thus far, the aggregated collagen area ratio and dispersed fiber straightness (ATC CAR and DTC CFS, respectively) were selected as the two prognostic features with predictive value for liver fibrosis reversibility in HCV patients, since both metrics are significantly higher in the irreversible group.
The significant statistical differences between the selected features of pre-SVR biopsies enable us to prototype a predictive model of fibrosis reversibility. Such predictive models are essential for clinical practice to stratify patients in the healthcare system and select suitable patient candidates for clinical trials. To optimize our predictive model building, we used three different methods to build a clinical predictive model for HCV-induced liver fibrosis reversibility: (1) Bayesian model, (2) Support Vector Machine (SVM) and (3) Relevance Vector Machine (RVM).
To optimize our predictive model, we evaluated the different methods according to their performance and visualized their decision boundaries in the 2D feature space. The models based on the SVM and RVM methods are presented in Supp. Figs. 4 and 5. Although the SVM model in Supp. Fig. 5A,C achieved the best performance, its complicated decision boundary in the 2D feature space may indicate a risk of overfitting due to the limited sample size. The RVM model is also not ideal due to its poor specificity.
The optimal Bayesian model is presented in Fig. 5, which indicates that the 57 patients can be divided into two groups based on ATC CAR and DTC CFS. In Fig. 5A, the green dots represent the reversible cases, and the red squares represent the irreversible cases. The number of patients with reversible and irreversible fibrosis is also moderately well balanced, i.e. 21 reversible cases vs. 36 irreversible cases. In general, a higher ATC CAR value and greater DTC CFS are indicative of poor prognosis, whereas a lower ATC CAR value and lesser DTC CFS are indicative of good prognosis (i.e. reversible fibrosis). The two-feature decision boundary is represented by the black solid line between the irreversible (gray) and reversible (white) domains in Fig. 5A. Figure 5B represents the continuous Bayesian predictive model based on ATC CAR and DTC CFS. The red surface indicates the irreversible group, while the green surface indicates the reversible group.
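As a rough illustration of how a two-feature Bayesian classifier yields both a posterior probability and a decision boundary, the sketch below fits a Gaussian naive Bayes model with class priors. This model form is an assumption: the source does not specify the likelihood or priors behind its Bayesian model. The function names and the training values are hypothetical and are not study data.

```python
import math
from statistics import mean, pvariance

def fit_gaussian_nb(X, y):
    """Per-class prior, feature means and feature variances."""
    model = {}
    for label in sorted(set(y)):
        rows = [x for x, lab in zip(X, y) if lab == label]
        cols = list(zip(*rows))
        model[label] = {
            "prior": len(rows) / len(X),
            "mean": [mean(c) for c in cols],
            "var": [pvariance(c) + 1e-9 for c in cols],  # small floor avoids division by zero
        }
    return model

def posterior(model, x):
    """Posterior class probabilities for one feature vector via Bayes' rule."""
    log_joint = {}
    for label, p in model.items():
        ll = math.log(p["prior"])
        for xi, m, v in zip(x, p["mean"], p["var"]):
            ll += -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
        log_joint[label] = ll
    top = max(log_joint.values())                 # subtract the max for numerical stability
    norm = sum(math.exp(v - top) for v in log_joint.values())
    return {label: math.exp(v - top) / norm for label, v in log_joint.items()}

# Hypothetical (ATC CAR, DTC CFS) pairs with known outcomes.
X = [(0.70, 0.80), (0.75, 0.82), (0.72, 0.78),
     (0.90, 0.90), (0.92, 0.88), (0.88, 0.91)]
y = ["Reversible"] * 3 + ["Irreversible"] * 3
model = fit_gaussian_nb(X, y)
probs = posterior(model, (0.91, 0.90))
print(max(probs, key=probs.get), probs)
```

Thresholding the posterior at 0.5 gives a discrete decision of the kind drawn as a boundary in the feature space, while the posterior itself corresponds to a continuous model.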
Figure 5. The discrete and continuous Bayesian predictive models for HCV-induced liver fibrosis reversibility. (A) The optimized discrete model using two selected features, ATC CAR and DTC CFS, and their distribution in the 2D feature space. (B) The optimized continuous model using ATC CAR and DTC CFS and their probability distribution function in the 2D feature space.
The performance metrics of the discrete and continuous models are presented in Table 1. With both the discrete and continuous models, we can successfully identify ~85% of irreversible cases and ~62% of reversible cases. The selected quantitative features and the Bayesian predictive model have potentially high clinical prognostic value for predicting HCV-induced liver fibrosis reversibility.
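The two rates quoted above correspond to sensitivity and specificity once a positive class is fixed. The sketch below assumes "Irreversible" is the positive class, so that sensitivity is the fraction of irreversible cases identified and specificity the fraction of reversible cases identified; the function name and the labels are hypothetical and do not reproduce Table 1.

```python
def sensitivity_specificity(y_true, y_pred, positive="Irreversible"):
    """Fraction of positives and of negatives that were predicted correctly."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical true and predicted labels for eight patients.
y_true = ["Irreversible"] * 4 + ["Reversible"] * 4
y_pred = ["Irreversible", "Irreversible", "Irreversible", "Reversible",
          "Reversible", "Reversible", "Irreversible", "Irreversible"]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)
```

With a cohort of this size, such rates are best estimated with cross-validation rather than on the training data, although the source does not state which procedure was used.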