Predicting Liver Fibrosis in the Hepatitis C Population: Concordance Analysis Between Noninvasive Scoring Systems and Percutaneous Liver Biopsy

Background Due to the slow progression of many chronic liver diseases, including hepatitis C, it is not practical or safe to monitor disease progression by serial liver biopsies. Noninvasive laboratory scoring systems based on routine laboratory tests are appealing surrogate markers of liver fibrosis for the staging and monitoring of chronic liver diseases such as hepatitis C. Methods We explored the accuracy of three scoring systems: the fibrosis-4 score (FIB-4), the aspartate aminotransferase to platelet ratio index (APRI score), and the aspartate aminotransferase to alanine aminotransferase ratio (AAR) in 496 patients with chronic hepatitis C virus (HCV) infection who had undergone percutaneous liver biopsy at a viral hepatitis clinic in Shreveport, Louisiana. Results For FIB-4, the area under the receiver operating characteristic curve (AUROC) for hepatic fibrosis stages ≥ 1, ≥ 2, ≥ 3, and 4 (cirrhosis) ranged from 0.74 (95% CI, 0.678 - 0.802) to 0.802 (95% CI, 0.751 - 0.854). At a cutoff value of 1.45, FIB-4 was 82% sensitive for advanced fibrosis or cirrhosis (stage 3 or 4) but was only 58% specific for these findings. Increasing the FIB-4 cutoff value to 3.25 reduced the sensitivity for detecting advanced fibrosis or cirrhosis to 39%, but this higher cutoff was 92% specific for these findings. Corresponding AUROCs for the APRI and AAR scores were inferior to FIB-4. Conclusion The FIB-4 index outperformed APRI and AAR in our HCV infected population in predicting severe fibrosis or cirrhosis.


Introduction
Hepatitis C is a treatable chronic liver disease that, if left untreated or unsuccessfully treated, may lead to hepatic fibrosis and eventually to irreversible cirrhosis, with its complications and the need for a liver transplant [1]. Liver histology is the gold standard for the diagnosis and staging of hepatic fibrosis and cirrhosis in chronic hepatitis C [2]. However, liver biopsy carries a risk of procedure-related complications and is also limited by the need for expertise in performing the biopsy, cost, observer interpretation, sampling error, and patient unwillingness. Conventional blood tests (serum aspartate aminotransferase (AST), serum alanine aminotransferase (ALT), and platelet count) have been used to estimate the degree of hepatic fibrosis and, if highly predictive of fibrosis, could serve as a surrogate for liver biopsy [3][4][5]. We therefore performed a retrospective study of nearly 500 patients with chronic hepatitis C who underwent percutaneous liver biopsy at one institution, comparing the ability of three laboratory-based indices to predict the degree of hepatic fibrosis. The indices examined were the fibrosis-4 score (FIB-4), the aspartate aminotransferase to platelet ratio index (APRI score), and the aspartate aminotransferase to alanine aminotransferase ratio (AAR).
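For readers unfamiliar with the three indices, the sketch below shows how each is commonly calculated from age, AST, ALT, and platelet count. The formulas are the standard published definitions and are not restated in this article; the upper limit of normal (ULN) for AST of 40 U/L, the function names, and the example patient are assumptions made purely for illustration, and the ULN in particular varies by laboratory.

```python
import math


def fib4(age_years, ast_u_l, alt_u_l, platelets_10e9_l):
    """FIB-4 = (age x AST) / (platelets x sqrt(ALT)).

    AST and ALT in U/L, platelet count in 10^9/L.
    """
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))


def apri(ast_u_l, platelets_10e9_l, ast_uln_u_l=40.0):
    """APRI = (AST / ULN of AST) x 100 / platelets.

    ast_uln_u_l is an ASSUMED default; use the local laboratory's value.
    """
    return (ast_u_l / ast_uln_u_l) * 100.0 / platelets_10e9_l


def aar(ast_u_l, alt_u_l):
    """AAR = AST / ALT."""
    return ast_u_l / alt_u_l


def fib4_category(score, low=1.45, high=3.25):
    """Apply the two FIB-4 cutoffs evaluated in this study."""
    if score <= low:
        return "advanced fibrosis unlikely"
    if score >= high:
        return "advanced fibrosis likely"
    return "indeterminate"


# Hypothetical patient (not study data): 50 years old, AST 80 U/L,
# ALT 64 U/L, platelets 150 x 10^9/L.
score = fib4(50, 80.0, 64.0, 150.0)
print(f"FIB-4 = {score:.2f} ({fib4_category(score)})")  # FIB-4 = 3.33 (advanced fibrosis likely)
print(f"APRI  = {apri(80.0, 150.0):.2f}")               # APRI  = 1.33
print(f"AAR   = {aar(80.0, 64.0):.2f}")                 # AAR   = 1.25
```

All three indices use only laboratory values that are obtained routinely, which is what makes them attractive as surrogates for biopsy.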

Study population
We performed a retrospective chart review in a viral hepatitis clinic in Shreveport, Louisiana. Hospital electronic health records were screened for the diagnosis of chronic hepatitis C using the International Classification of Diseases, 10th Revision, Clinical Modification (ICD-10-CM) code B18.2 [6]. Patients were included if they were older than 18 years, had chronic hepatitis C, were seen in the clinic between November 1, 2014, and December 31, 2017, were treatment naïve, had undergone laboratory testing for serum aspartate aminotransferase (AST), serum alanine aminotransferase (ALT), and platelet count, and had undergone a percutaneous liver biopsy during this period. A total of 496 patients were included. We recorded the patients' age at the time of liver biopsy, their gender, hepatitis C virus (HCV) viral load and genotype, and liver biopsy results, as well as serum AST and ALT levels and platelet count near the time of liver biopsy. All patients tested negative for human immunodeficiency virus (HIV) and hepatitis B virus infections. Laboratory tests were typically done within the one-week period prior to liver biopsy.

Liver Biopsy
All percutaneous liver biopsies were performed by two senior gastroenterologists. The biopsy samples were at least 25 mm long and a minimum of 10 portal tracts was included in each specimen to improve diagnostic accuracy. All liver tissue samples were analyzed twice by one senior hepatopathologist. The METAVIR scoring system was used to assess the extent of hepatic fibrosis. This fibrosis staging score represents the amount of fibrosis, scored from F0 to F4 (Stage F0 = no fibrosis, Stage F1 = mild fibrosis, Stage F2 = significant fibrosis, Stage F3 = severe fibrosis, and Stage F4 = cirrhosis) [7][8].

Statistical analysis
Data were entered in a Microsoft Excel (Microsoft Corporation, Redmond, Washington) sheet, coded to de-identify patients, and analyzed in the Statistical Package for the Social Sciences (SPSS) v22.0 (IBM Corp., Armonk, New York). Percentages were calculated for categorical variables. Means and standard error of means (SEM) were determined for continuous variables. Analysis of variance (ANOVA) was performed to compare mean laboratory values at different grades of fibrosis. Receiver operating characteristic (ROC) curves were drawn and the area under the receiver operating characteristic curve (AUROC) was estimated to compare the diagnostic efficiency of the three noninvasive scores: FIB-4, APRI, and AAR.
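The AUROC has a simple interpretation that can be illustrated without statistical software: it is the probability that a randomly chosen patient with the target fibrosis stage has a higher score than a randomly chosen patient without it, with ties counted as one half. The sketch below is a minimal illustration of that rank-based definition, not the SPSS procedure used in the study; the function names and the six example patients are hypothetical.

```python
def dichotomize(metavir_stages, threshold):
    """Label each patient 1 if METAVIR stage >= threshold, else 0."""
    return [1 if stage >= threshold else 0 for stage in metavir_stages]


def auroc(scores, labels):
    """Rank-based AUROC: fraction of (diseased, non-diseased) pairs in
    which the diseased patient has the higher score; ties count 0.5."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one patient in each group")
    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(positives) * len(negatives))


# Hypothetical FIB-4 values and METAVIR stages for six patients.
fib4_values = [0.8, 1.2, 1.6, 2.9, 3.4, 4.1]
stages = [0, 1, 3, 2, 4, 3]

# Endpoint: severe fibrosis or cirrhosis (F3 or F4).
labels = dichotomize(stages, threshold=3)
print(f"AUROC for >= F3: {auroc(fib4_values, labels):.3f}")  # AUROC for >= F3: 0.889
```

Changing `threshold` to 1, 2, or 4 gives the other three endpoints analyzed in this study (any fibrosis, F2 or higher, and cirrhosis).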

Results
The majority of the patients (71%) were middle-aged (36-59 years), and just over half of them were women (Table 1). The hepatic fibrosis grades for the 496 study patients are shown in Table 2, with only 42 patients having no fibrosis (F0) and 74 having cirrhosis (F4). The F1 and F2 stages were the largest categories (n = 142 and n = 144, respectively). Mean serum AST and ALT increased, and platelet counts decreased, as the extent of hepatic fibrosis increased. Likewise, FIB-4 and AAR (but not APRI) increased stepwise as the fibrosis grade increased, whereas mean APRI and AAR did not increase until F3 and F4 fibrosis was reached (Table 2). There were no significant differences in calculated noninvasive scores between women and men.

The ROC curves for the three scores are shown in Figure 1A-D. FIB-4 outperformed AAR and APRI in predicting severe fibrosis or cirrhosis (F3 or F4; Figure 1B), moderate to severe fibrosis or cirrhosis (F2 to F4; Figure 1C), or any grade of fibrosis or cirrhosis (F1 to F4; Figure 1D), with AUC values ranging from 0.732 to 0.788. Mean FIB-4 scores correlated well with fibrosis grade (R = 0.97; p < 0.001; Figure 2).

FIGURE 1: FIB-4, AAR, and APRI scores in predicting various degrees of hepatic fibrosis
The FIB-4 score consistently had the largest area under the curve for predicting any grade of fibrosis or cirrhosis.

FIB-4, fibrosis-4 score
A FIB-4 index of ≥ 3.25 had a 72% positive predictive value (PPV), 92% specificity, and a diagnostic accuracy of 0.74 in predicting severe fibrosis or cirrhosis (F3 or F4). A FIB-4 index of ≤ 1.45 had an 86% negative predictive value (NPV) and 82% sensitivity in excluding severe fibrosis or cirrhosis (F3-F4). The Youden index, a measure of the maximum potential effectiveness of a biomarker, is used to evaluate the overall discriminative power of a diagnostic test and to compare it with other available tests. A FIB-4 cutoff of ≤ 1.45 had the highest Youden index (0.4), followed by a FIB-4 cutoff of ≥ 3.25 (0.31). The diagnostic odds ratio (DOR) of a test is the ratio of the odds of a positive result in subjects with the disease to the odds of a positive result in subjects without the disease. A FIB-4 cutoff of ≥ 3.25 had the highest DOR (7.75), followed by the FIB-4 cutoff of ≤ 1.45 (6.24) (Table 3).

[...] [4]. Our study results, obtained using a different liver biopsy scoring system (METAVIR), were in fairly close agreement with the study of Sterling et al. [3], with an NPV of 86% and a sensitivity of 82% for a ≤ 1.45 cutoff and a PPV of 72% and a specificity of 92% for a ≥ 3.25 cutoff. The positive likelihood ratio was highest (5.12) with a FIB-4 cutoff of ≥ 3.25, and the negative likelihood ratio was lowest (0.31) for a FIB-4 cutoff of ≤ 1.45.
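The Youden index, likelihood ratios, and DOR are all simple functions of sensitivity and specificity. The sketch below applies the standard definitions to the rounded percentages reported for the two FIB-4 cutoffs. Because the inputs are rounded to whole percentages, the Youden indices and the negative likelihood ratio reproduce the reported values, whereas the positive likelihood ratio and the DORs only approximate them. The function names are illustrative.

```python
def youden_index(sensitivity, specificity):
    """J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


def positive_lr(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1.0 - specificity)


def negative_lr(sensitivity, specificity):
    """LR- = (1 - sensitivity) / specificity."""
    return (1.0 - sensitivity) / specificity


def diagnostic_odds_ratio(sensitivity, specificity):
    """DOR = LR+ / LR-."""
    return positive_lr(sensitivity, specificity) / negative_lr(sensitivity, specificity)


# Rounded sensitivity and specificity for F3-F4 as reported in this article.
cutoffs = {
    "FIB-4 cutoff 1.45": (0.82, 0.58),
    "FIB-4 cutoff 3.25": (0.39, 0.92),
}

for name, (sens, spec) in cutoffs.items():
    print(
        f"{name}: J = {youden_index(sens, spec):.2f}, "
        f"LR+ = {positive_lr(sens, spec):.2f}, "
        f"LR- = {negative_lr(sens, spec):.2f}, "
        f"DOR = {diagnostic_odds_ratio(sens, spec):.2f}"
    )
```

The pattern is the expected one: the lower cutoff trades specificity for sensitivity and yields the smaller negative likelihood ratio, while the higher cutoff does the reverse and yields the larger positive likelihood ratio.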

TABLE 3 (column headings): Score, Cutoff, Sensitivity, Specificity, PPV, NPV
Our study is unique in that it is a single-center US study of nearly 500 patients, most of whom had HCV genotype 1 and none of whom had HIV or hepatitis B coinfection. The single-center design helped minimize inter-observer variation in biopsy interpretation and avoided differences in processing between laboratories. Other studies performed in other countries and a multicenter study in the USA have yielded similar AUROC results for the FIB-4 index [9][10][11][12][13][14].
In 2003, Wai et al. developed the APRI score to predict significant fibrosis and cirrhosis in patients with chronic HCV. The authors reported an AUC of 0.88 for predicting significant fibrosis and 0.94 for predicting cirrhosis. The cutoff values in the Wai study were ≤ 0.50 and ≥ 1.50 for predicting the absence or presence of significant fibrosis/cirrhosis (Ishak score ≥ 3), respectively [15]. In a 2011 meta-analysis by Lin et al., an APRI cutoff of 0.7 had 77% sensitivity and 72% specificity for significant fibrosis, and a cutoff of 1.0 had 61% sensitivity and 64% specificity for severe fibrosis [16]. In our study, an APRI cutoff of 1.0 had a PPV of only 41% and a specificity of 73%, inferior to FIB-4.
Williams et al. observed that both AST and ALT levels rose with the progression of liver damage, specifically in patients with chronic hepatitis, and that an AAR > 1.0 would typically suggest cirrhosis, distinguishing cirrhotic from non-cirrhotic patients with 100% specificity and PPV, 53% sensitivity, and 81% NPV [17][18]. In our study, an AAR of < 1.0 had an NPV of 71% for excluding severe fibrosis (F3-F4); at this cutoff, sensitivity was only 31%, PPV only 53%, and specificity 86%. By AUC, AAR was also inferior to FIB-4.

Determining fibrosis severity is critical in chronic liver disease, as it predicts long-term clinical outcomes and death in HCV [19]. In contrast with the above studies, including ours, some authors have concluded that noninvasive markers are not a reliable tool for predicting liver fibrosis. Parkes et al. reviewed 10 different serum markers of hepatic fibrosis in chronic hepatitis C. Only 35% of patients had fibrosis adequately ruled in or ruled out by these panels, and the stage of fibrosis could not be adequately determined [20]. FIB-4 can be calculated simply from routine laboratory tests that can readily be reassessed [21][22].

Conclusions
The FIB-4 score was the best predictor across all fibrosis grades. The AAR was the next best predictor, and the APRI score was inferior to the other two. When the scores were compared within each grade, the diagnostic efficiency of all three scores increased as the fibrosis grade increased. In summary, the FIB-4 score had better diagnostic accuracy than AAR and APRI. These noninvasive scores, particularly FIB-4, perform better at ruling out than at ruling in advanced disease, having higher negative predictive values than positive predictive values.
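Predictive values depend on how common advanced fibrosis is in the population tested, so they should be read alongside the study prevalence. As a rough consistency check, the sketch below recomputes the PPV and NPV of FIB-4 from the rounded sensitivity and specificity reported here and from a prevalence of severe fibrosis or cirrhosis derived from the stage counts given in the Results (496 patients, of whom 42 were F0, 142 F1, and 144 F2, leaving 168 with F3 or F4). The function names are illustrative, and the derived prevalence is an inference from those counts rather than a figure stated by the study.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value via Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def npv(sensitivity, specificity, prevalence):
    """Negative predictive value via Bayes' theorem."""
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


# Prevalence of F3-F4 derived from the reported stage counts.
total, f0, f1, f2 = 496, 42, 142, 144
prevalence_f3_f4 = (total - f0 - f1 - f2) / total  # 168 / 496

# Rule-in cutoff (>= 3.25): sensitivity 39%, specificity 92%.
print(f"PPV at 3.25: {ppv(0.39, 0.92, prevalence_f3_f4):.2f}")
# Rule-out cutoff (<= 1.45): sensitivity 82%, specificity 58%.
print(f"NPV at 1.45: {npv(0.82, 0.58, prevalence_f3_f4):.2f}")
```

With these inputs the recomputed values come out close to the reported 72% PPV and 86% NPV. In a population with a lower prevalence of advanced fibrosis, the same cutoffs would give a higher NPV and a lower PPV.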

Additional Information Disclosures
Human subjects: Consent was obtained by all participants in this study. Animal subjects: All authors have confirmed that this study did not involve animal subjects or tissue. Conflicts of interest: In compliance with the ICMJE uniform disclosure form, all authors declare the following: Payment/services info: All authors have declared that no financial support was received from any organization for the submitted work. Financial relationships: All authors have declared that they have no financial relationships at present or within the previous three years with any organizations that might have an interest in the submitted work. Other relationships: All authors have declared that there are no other relationships or activities that could appear to have influenced the submitted work.