Predictive and Prognostic Potential of Liver Function Assessment in Patients with Advanced Hepatocellular Carcinoma: A Systematic Literature Review

Introduction We conducted a systematic literature review to assess the utility of liver function assessments for predicting disease prognosis and response to systemic anticancer therapy in patients with advanced hepatocellular carcinoma (aHCC). Methods This was a PRISMA-standard review and was registered with PROSPERO (CRD42021244588). MEDLINE and Embase were systematically searched (March 24, 2021) to identify publications reporting the efficacy and/or safety of systemic anticancer therapy (vs. any/no comparator) in liver-function-defined subgroups in phase 2 or 3 aHCC trials. Screening was completed by a single reviewer, with uncertainties resolved by a second reviewer and/or the authors. English-language full-text articles and congress abstracts were eligible for inclusion. Included publications were described and assessed for risk of bias using the GRADE methodology. Results Twenty (of 2,579) screened publications were eligible; seven categorized liver function using the albumin-bilirubin system, nine using the Child-Pugh system, four using both. GRADE assessment classified ten, nine, and one publication(s) as reporting moderate-quality, low-quality, and very-low-quality evidence, respectively. Analyses of cross-trial trends of within-exposure arm analyses (active and control) reported a positive relationship between baseline liver function and overall survival and progression-free survival, supporting liver function as a prognostic marker in aHCC. There were also signals for a modest relationship between more preserved baseline liver function and extent of systemic treatment benefit, and with more preserved liver function and lower incidence of safety events. Conclusion This review supports liver function as a prognostic variable in aHCC and highlights the value of a priori stratification of patients by baseline liver function in aHCC trials. The predictive value of liver function warrants further study. Findings were limited by the quality of available data.


Lay Summary
Hepatocellular carcinoma (HCC) is a common type of liver cancer. Patients with advanced HCC often have some degree of liver damage that prevents their liver from functioning properly. When choosing the best treatment for patients with advanced HCC, it is important to understand the likely course of their cancer and how well they might respond to available treatment options. We carried out a review of the data from published clinical trials of advanced HCC treatments to investigate whether patients' liver function when they start treatment predicts how their cancer will progress and how they might respond to treatment. We found that patients responded to treatment regardless of how well their liver was working when they started therapy. However, better liver function at the start of treatment was associated with longer survival and possibly also with better treatment response. Well-designed studies are needed to explore these findings further, particularly whether liver function can help predict response to advanced HCC treatment.

Introduction
Hepatocellular carcinoma (HCC) is a leading cause of cancer-related mortality and accounts for the majority of primary liver cancers worldwide [1,2]. In recent years, there has been a rapid expansion of systemic, targeted therapies approved for use in patients with advanced HCC (aHCC) [3]. This increase in the number of therapeutic options available for aHCC has highlighted the importance of treatment selection and sequencing to optimize outcomes for patients. Central to informed treatment selection is understanding how key clinical characteristics may influence disease course, prognosis, and potential treatment response.
Tumour stage at diagnosis is associated with overall survival (OS) [4], and presence (and extent) of any underlying liver disease is an important clinical consideration when treating patients with HCC [5-7]. Most patients with HCC have some degree of underlying liver fibrosis (usually at the stage of cirrhosis), and the extent of liver dysfunction can have an impact on disease prognosis and treatment selection [5,6]. Although the association between liver function and HCC prognosis is well documented, it has not been established which liver function classification system offers the greatest prognostic discrimination or association with response to systemic therapy in the context of aHCC.
In patients with HCC, liver function has traditionally been assessed by the Child-Pugh system [8], and a higher Child-Pugh score consistently correlates with shorter survival [9,10]. Child-Pugh grading is routinely used in clinical practice to indicate the likely prognosis of patients with HCC and underlying liver disease, and a favourable grade is commonly pre-specified among HCC clinical trial inclusion criteria. Yet, the system was originally developed to assess liver function and disease prognosis in patients with liver cirrhosis and in those with portal hypertension undergoing surgery for variceal bleeding, rather than for use in patients with HCC [11,12]. The system was developed arbitrarily and is based on (partially subjective) clinical and laboratory assessment without any formal statistical validation [7,11,13], with scope for variability in clinical assessment and resultant scoring [14]. In contrast, the albumin-bilirubin (ALBI) grading system is based only on serum albumin and bilirubin and consequently offers a more objective measurement of liver function compared to the Child-Pugh system. The ALBI grading system was developed empirically using data from large international databases to identify objective measures of liver function that independently influence survival in patients with HCC, thereby eliminating the dependence on subjective variables implicit within Child-Pugh assessments [7].
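To make the contrast between the two systems concrete, the following is a minimal Python sketch of how an ALBI grade is derived from two laboratory values, and how Child-Pugh points map to classes. The coefficients, cut-offs, and units are those commonly reported for the ALBI score in the literature and are not restated in this review, so they should be treated as assumptions and checked against the original ALBI publication [7] before any use. The function names are hypothetical, and the Child-Pugh helper only converts an already-assigned point total (5 to 15) into a class; it does not reproduce the clinical assessment of ascites or encephalopathy.

```python
import math

# Assumed ALBI cut-offs (commonly reported values; verify against the source publication).
ALBI_GRADE_1_MAX = -2.60  # score <= -2.60 -> grade 1 (most preserved function)
ALBI_GRADE_2_MAX = -1.39  # -2.60 < score <= -1.39 -> grade 2; above that -> grade 3


def albi_score(bilirubin_umol_l: float, albumin_g_l: float) -> float:
    """ALBI linear predictor; assumes bilirubin in umol/L and albumin in g/L."""
    if bilirubin_umol_l <= 0 or albumin_g_l <= 0:
        raise ValueError("bilirubin and albumin must be positive")
    return 0.66 * math.log10(bilirubin_umol_l) - 0.085 * albumin_g_l


def albi_grade(score: float) -> int:
    """Map an ALBI score to grade 1, 2, or 3 using the assumed cut-offs."""
    if score <= ALBI_GRADE_1_MAX:
        return 1
    if score <= ALBI_GRADE_2_MAX:
        return 2
    return 3


def child_pugh_class(points: int) -> str:
    """Map a Child-Pugh point total (5-15) to class A, B, or C."""
    if not 5 <= points <= 15:
        raise ValueError("Child-Pugh points range from 5 to 15")
    if points <= 6:
        return "A"
    if points <= 9:
        return "B"
    return "C"


if __name__ == "__main__":
    # Illustrative laboratory values only; not taken from any trial in this review.
    score = albi_score(bilirubin_umol_l=10.0, albumin_g_l=40.0)
    print(f"ALBI score {score:.2f}, grade {albi_grade(score)}")
    print(f"Child-Pugh 7 points -> class {child_pugh_class(7)}")
```

Under this notation, a label such as "Child-Pugh B7" denotes class B with a total of 7 points, whereas ALBI grades depend only on the two measured values.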
There are consistent reports of ALBI grading offering better discrimination than Child-Pugh grading for predicting disease prognosis in patients with HCC [15][16][17],

Registration and Methodology
The protocol for this systematic literature review was registered with the International Prospective Register of Systematic Reviews (PROSPERO) (registration number CRD42021244588) [18]. The review was conducted in accordance with the 2020 Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [19,20].

Search Strategy and Selection Criteria
Publications of interest were those reporting empirical data relating to the prognostic and/or predictive potential of liver function assessment from phase 2 or 3 clinical trials of systemic therapies for aHCC. Interventions of interest included tyrosine kinase inhibitors (TKIs), anti-vascular endothelial growth factor (VEGF) or anti-VEGF receptor (VEGFR) monoclonal antibodies (mAbs), and checkpoint inhibitors (CPIs) (see online suppl. Table S1 at www.karger.com/doi/10.1159/000529173 for publication eligibility criteria). Eligible publications were identified by systematic searches using the Ovid bibliographic research platform, supplemented by manual handsearching of the websites of congresses pre-identified as being of interest. Ovid searches were run in the MEDLINE (In-Process & Other Non-Indexed Citations and Ovid MEDLINE [1946 to present]) and Embase (1974 to present) databases on March 24, 2021 (see online suppl. Table S2a, b for full search-string and screening details).
Congresses pre-identified as being of interest were those not indexed within Embase but considered by the authors to be events in which relevant data may have been reported, specifically: the European Association for the Study of the Liver (EASL) and the International Liver Cancer Association (ILCA) annual congresses 2019-2021. The congress websites and online abstract books were manually searched to identify eligible abstracts. Relevant abstracts presented prior to January 1, 2019, were assumed to have since been published as full-text articles and identified by the Ovid searches. Abstract handsearching was carried out between March 18, 2021, and May 6, 2021. In addition to the systematic searches, relevant unpublished trial analyses known to the authors through their research networks at the time of the systematic searches were included if they were published during the data analysis phase of the review (up to October 7, 2021) and met the eligibility criteria for article inclusion.

Publication Eligibility
Publications identified by the searches were screened against pre-specified criteria in accordance with the PRISMA 2020 guidelines [19]. The title and abstract of all identified publications were screened for eligibility against the criteria stipulated in the study protocol (summarized in online suppl. Table S1); short-listed publications underwent full-text review to confirm eligibility. Screening was completed by a single reviewer, with uncertainties resolved by a second reviewer and/or by the author group, as required.

Data Abstraction
Standardized data were systematically extracted from eligible publications and recorded in a data extraction table (Microsoft Excel) with data fields pre-agreed by the authors. Data abstraction was conducted by two independent researchers. All relevant outcome data identified a priori in the study protocol were extracted.

Quality and Risk of Bias Assessment
The quality and risk of bias of included publications were appraised using the Grading of Recommendations, Assessment, Development and Evaluations (GRADE) quality assessment tool. Each publication was assigned a quality rating (high, moderate, low, or very low) according to its level of potential bias, which was assessed in terms of data quality, consistency, directness, and modifying factors (precision and sparsity of data and probability of reporting bias) [21,22]. The GRADE rating of each eligible publication is included in online Supplementary Table S3; details of the full GRADE quality appraisal are included in online Supplementary Table S4.

Definition of Prognostic Versus Predictive Results
A methodical approach was applied to categorize extracted data according to their prognostic or predictive potential.

Prognostic Data
Prognostic data were defined as those reporting outcomes for within-exposure groups (either active treatment or placebo), with the only differentiating factor between the groups being baseline liver function status: for example, OS or progression-free survival (PFS) hazard ratios (HRs) for liver-function-defined subgroups ALBI 1 versus ALBI 2 and ALBI 1 versus ALBI 3 within a single exposure group. A difference in OS or PFS (shown by either HRs or absolute values) between groups stratified according to liver function was considered potentially indicative of a prognostic signal.

Predictive Data
Predictive data were defined as those reporting comparative treatment outcomes (investigational drug vs. placebo or vs. active comparator) in patients stratified by baseline liver function, thereby allowing a comparison of the magnitude of treatment effect in each liver-function-defined subgroup. For example, OS HR for active treatment versus placebo in the ALBI 1 subgroup was compared with OS HR for active treatment versus placebo in the ALBI 2 subgroup. A difference in the magnitude of treatment benefit between groups stratified according to liver function was considered potentially indicative of a predictive signal.
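The predictive comparison defined above can be sketched in a few lines of Python. This is an illustration only: the review itself was purely descriptive and applied no statistical tests, the hazard ratios below are hypothetical and not taken from any included trial, and the class and function names (including the ratio of HRs as a summary) are assumptions introduced here to show the logic of comparing treatment effect between liver-function-defined subgroups.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HazardRatio:
    """Treatment-versus-control hazard ratio with its confidence interval."""
    estimate: float
    ci_low: float
    ci_high: float

    def favours_treatment(self) -> bool:
        # A point estimate below 1 indicates lower hazard on active treatment.
        return self.estimate < 1.0


def intervals_overlap(a: HazardRatio, b: HazardRatio) -> bool:
    """True when the two confidence intervals share any values."""
    return a.ci_low <= b.ci_high and b.ci_low <= a.ci_high


def describe_predictive_signal(preserved: HazardRatio, impaired: HazardRatio) -> dict:
    """Descriptive summary only; this is not a formal test of interaction."""
    return {
        # Same direction of effect in both subgroups.
        "same_direction": preserved.favours_treatment() == impaired.favours_treatment(),
        # Values above 1 mean less relative benefit in the impaired subgroup.
        "ratio_of_hrs": impaired.estimate / preserved.estimate,
        # Overlapping intervals indicate the difference is uncertain.
        "cis_overlap": intervals_overlap(preserved, impaired),
    }


# Hypothetical OS hazard ratios (active treatment vs. placebo) by baseline ALBI grade.
albi_1 = HazardRatio(estimate=0.60, ci_low=0.45, ci_high=0.80)
albi_2 = HazardRatio(estimate=0.80, ci_low=0.60, ci_high=1.05)
summary = describe_predictive_signal(albi_1, albi_2)

if __name__ == "__main__":
    print(summary)
```

In this hypothetical case both subgroups favour treatment, the benefit looks somewhat larger with more preserved liver function, and the intervals overlap, which is the pattern of a possible but uncertain predictive signal. A prognostic comparison, by contrast, would compare outcomes between liver function subgroups within a single treatment arm.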

Analysis
The prognostic and predictive potential of liver function was analysed descriptively by examination of visual and numerical trends within the data; no statistical tests were undertaken. No meta-analysis of the data was planned (or conducted) owing to expected (and confirmed) small sample sizes and data heterogeneity, including differences in population eligibility, ecology of care, and liver function grading system. All relevant outcome data were analysed graphically (Fig. 2-6; online suppl. Fig. S1-4), grouped by outcome and data format; key observations are described in the Results.

Role of the Funding Source
This systematic literature review was sponsored by Ipsen. Ipsen had no input into the design of the review, or into the analysis or interpretation of results. Ipsen sponsored the development of the manuscript in accordance with Good Publication Practice guidelines.

Search Results
The systematic searches identified 2,579 unique publications, of which 70 were short-listed for eligibility based on title/abstract screening and 12 were confirmed as eligible based on full-text review (eight full texts, two congress abstracts, two congress posters). Handsearching congress websites identified two additional eligible abstracts. During the data-analysis phase of the review, the authors identified a further six eligible publications (four full texts, one congress poster, one congress oral abstract). In total, 20 publications were included in the review (12 full texts, eight congress publications [three text-only abstracts, four posters, one oral presentation]). Publication identification and screening are summarized in the PRISMA flow chart (Fig. 1).
No eligible publications reported high-quality data according to the GRADE appraisal because of methodological limitations such as the post hoc nature of the analyses and/or the small population/subgroup sizes considered and/or (in the case of some phase 2 trials) the lack of randomization. Ten publications (50%) reported moderate-quality evidence [23, 25, 32-38, 40], indicating that further research is likely to have an important impact on confidence in the reported results, and nine (45%) low-quality evidence [7, 23, 24, 26, 28-31, 39, 41], indicating that further research may change the reported effect estimate. One publication, reporting an analysis of a nonrandomized phase 2 trial, offered very-low-quality evidence [27], indicating a high degree of potential uncertainty in the reported estimates (online suppl. Table S4).

Efficacy: Prognostic Data
Descriptive analyses of the data pertaining to the prognostic potential of liver function assessments (i.e., those comparing within-trial arm outcomes, stratified by baseline liver function) were based on examination of trends within the data; no statistical tests were undertaken. In all analyses of survival estimates, the associated confidence intervals (CIs) were wide and overlapping.

Control (Placebo) Group Analyses
Overall, five publications reported median OS estimates for subgroups defined by baseline liver function within the placebo arms of the included trials. Examination of trends within the data consistently showed numerically shorter median OS in patients with poorer liver function at baseline (i.e., in ALBI 2 vs. ALBI 1 [29,33,40], Child-Pugh 6 vs. 5 [29,38], and Child-Pugh 7 + 8 vs. 6 and vs. 5 [38]) (online suppl. Fig. S1). Across the three trials that reported median OS by ALBI grade [29,34,40], median OS ranged from 6.6 months to 11.4 months for ALBI 1 subgroups [29,40] and from 4.2 months to 11.1 months for ALBI 2 subgroups [29,34]. For the two trial analyses that reported median OS for Child-Pugh subgroups [29,38], median OS values were 6.4 months and 9.7 months for the Child-Pugh 5 subgroups and 4.1 months and 4.8 months for the Child-Pugh 6 subgroups. In the single study that reported OS for patients with Child-Pugh 7/8 at baseline, median OS was 3.8 months [38]. No relationship was observed between baseline liver function and OS for patients enrolled in the control arms of the phase 3 KEYNOTE-240 trial [34] (online suppl. Fig. S1) or for the small phase 2 Scoop-2 study [27].
Only three eligible publications reported PFS for subgroups defined by baseline liver function within the relevant trials' placebo arms [34,38,40] (online suppl. Fig. S2). Of these, two found no effect on PFS, while the other showed a potential relationship between poorer baseline liver function and reduced median PFS [38]: Child-Pugh 5, 2.5 months; Child-Pugh 6, 2.1 months; and Child-Pugh 7, 1.4 months.

Active Treatment Analyses
In total, 15 trial publications reported median OS estimates for subgroups defined by baseline liver function within the active treatment arms. Similar to the findings reported for the placebo arms, examination of the active treatment arm data showed a consistent trend towards shorter median OS in patients with poorer baseline liver function (Fig. 2; online suppl. Fig. S3). This trend was observed for the majority of trials and was broadly present whether baseline liver function was assessed using the ALBI system (e.g., ALBI 2 vs. 1 [7, 23, 29, 33-35, 39, 40], ALBI 3 vs. 2 and vs. 1 [7,29], and ALBI 2b vs. 2a and vs. 1 [39]) or the Child-Pugh system (e.g., Child-Pugh B vs.
Not included in Figure 2, owing to a lack of published median OS estimates, was a phase 2 trial of sorafenib versus bevacizumab plus erlotinib [32]. Examination of the Kaplan-Meier graphs in the trial publication, however, suggested a consistent relationship between poorer baseline liver function and shorter OS because median OS was approximately 12 months for patients with a Child-Pugh score of A compared with approximately 6 months for those with Child-Pugh B7.
In a post hoc analysis of the REFLECT trial of lenvatinib versus sorafenib, median OS was shorter in patients whose liver function deteriorated from Child-Pugh grade A to B (vs. remained as grade A) over the first 8 weeks after randomization [26]. In the lenvatinib arm, median (95% CI) OS was 6.8 (2.6-10.3) months in patients whose liver function deteriorated from Child-Pugh A to B over the 8-week period (n = 60) compared with 13.3 (11.6-16.1) months for patients who still had Child-Pugh grade A at week 8 (n = 413). Similarly, median (95% CI) OS was 4.5 (2.9-6.1) months in the subgroup of patients in the sorafenib arm whose liver function had deteriorated from Child-Pugh A to B (n = 47) by week 8 compared with 12.0 (10.2-14.0) months for those who retained Child-Pugh grade A status throughout the 8-week period [26]. A similar analysis was conducted for the CELESTIAL trial population, but no efficacy (only safety) outcomes were reported for liver-function-defined subgroups [41]. These post hoc analyses were not included in Figure 2 owing to the difference in their analysis approach, which assessed the relationship between rate of liver function decline (rather than extent of dysfunction at baseline) and patient outcomes.
The relationship between baseline liver function and PFS was evaluated for the active treatment arms of nine of the eligible publications (Fig. 3). As for median OS, there was a possible trend towards shorter median PFS in patients with poorer baseline liver function, but the signal was weaker than for OS (Fig. 3 vs. Fig. 2, respectively). Across the six trials that reported median PFS by ALBI grade [23,34,35,39,40], median PFS across subgroups ranged from 2.8 months to 8.8 months for the ALBI 1 subgroups and from 3.2 months to 5.6 months for the ALBI 2 subgroups (Fig. 3a). Across the trials that reported median PFS by Child-Pugh score, median PFS ranges were Child-Pugh A, 4.2-4.4 months; Child-Pugh B, 2.1-2.6 months; Child-Pugh 5, 3.7-7.3 months; and Child-Pugh 6, 2.7-7.4 months (Fig. 3b).
The signal for shorter median PFS in patients with poorer baseline liver function was apparent for the cabozantinib arms of the CELESTIAL trial [40] and for the lenvatinib arm of the REFLECT trial [35] but was not evident for the sorafenib arm of REFLECT (Fig. 3a). Considering sorafenib specifically, there was also no apparent relationship between baseline liver function and median PFS duration in the sorafenib arm of the phase 3 IMbrave150 trial [39], but there was a possible signal in the sorafenib arm of the phase 3 SUN 1170 trial [23] (Fig. 3a) and a stronger signal in two phase 2 trials by Abou-Alfa et al. [24] and Pressiani et al. [30] (Fig. 3b).
A trend towards shorter median PFS in patients with poorer baseline liver function was also visible in the REACH and REACH-2 trials of ramucirumab [37,38] and in the combination atezolizumab plus bevacizumab arm of the IMbrave150 trial [39] (Fig. 3a). No trend between poorer baseline liver function and shorter median PFS was evident in the KEYNOTE-240 trial of pembrolizumab [34] (Fig. 3a).

Efficacy: Predictive Data
Data pertaining to the predictive potential of liver function assessments (i.e., those reporting survival outcomes for the experimental treatment vs. control arm, stratified by baseline liver function) were available from seven trials (five trials with placebo control arms [25,29,33,34,40] and two with active control arms [35,39]). Predictive analyses were based on descriptive assessment of potential trends between HR for the experimental versus control arms within each individual trial; no statistical tests were undertaken. In all seven trials, the CIs around the HRs were wide and largely overlapping.
Despite the modest signal for greater treatment benefit in patients with more preserved liver function, the evidence to support liver function as a predictor of systemic treatment response was less strong than for its role as a prognostic variable. In the predictive analyses, the directional benefit of the HRs for the preserved liver function and liver dysfunction subgroups was consistent and mirrored that for the overall populations, both for the endpoint of death (OS, Fig. 4) and for disease progression (PFS, Fig. 5). Taken together, these data suggest that patients with more preserved liver function may derive modestly greater benefit from systemic therapy, but overall, baseline liver function did not appear to have substantial predictive value.

Overall Survival
Consistent, yet modest, differences in OS HRs for subgroups defined by baseline ALBI grade were seen in the data from the CELESTIAL trial of cabozantinib [40], the RESORCE trial of regorafenib [33], the REACH-2 trial (and pooled analysis of the REACH/REACH-2 trials) of ramucirumab [25,29], the IMbrave150 trial of atezolizumab plus bevacizumab versus sorafenib [39], and from the KEYNOTE-240 trial of pembrolizumab [34] (Fig. 4). Two publications, Vogel et al. and Brandi et al. [25,35], evaluated the differences in OS HRs according to degree of liver dysfunction using both the ALBI and Child-Pugh systems, providing cursory insight into the relative potential of the two systems to predict OS. In both cases, there was a suggestion that ALBI may provide better predictive discrimination than Child-Pugh categorization, but there was wide uncertainty in the data (Fig. 4; online suppl. Fig. S4a).

Progression-Free Survival
Modestly higher HRs for patients with less (vs. more) preserved liver function were seen in the PFS analyses of five trials: the CELESTIAL trial of cabozantinib [40], the REFLECT trial of lenvatinib versus sorafenib [35], the REACH-2 trial of ramucirumab [25,29], the IMbrave150 trial of atezolizumab plus bevacizumab versus sorafenib [39], and the KEYNOTE-240 trial of pembrolizumab [34] (Fig. 5). In the two publications that evaluated PFS for subgroups defined according to the ALBI and Child-Pugh classification systems (Vogel et al. and Brandi et al. [25,35]), similar to the OS analyses, any modest separation of the HRs according to degree of liver dysfunction was more evident in the ALBI-based analyses (Fig. 5; online suppl. Fig. S4b).

Safety
As illustrated by the results of the prognostic and predictive analyses, patients with more preserved liver function are more likely to have longer time on treatment and, thus, longer time to accrue treatment-emergent and treatment-related adverse events (TEAEs and TRAEs). Relative rates of safety outcomes for different baseline liver function subgroups within the active treatment arms of the included publications were therefore described to explore the potential for liver function assessment to predict on-treatment safety signals, considering particular safety events adjudicated as being treatment related by trial clinicians and/or independent review boards. Overall, 13 eligible trials reported safety outcomes for different baseline liver function subgroups. The safety data for the active treatment arms of the included trial publications are summarized in Tables 1 and 2 and Figure 6; those for the placebo arms are summarized in online supplementary Table S5.
More evident than a relationship between AEs and baseline liver function in the included publications was a potential relationship between treatment discontinuation and baseline liver function (Fig. 6c, d). Rates of treatment discontinuation (whether recorded as treatment related or any, depending on the source publication) were consistently higher in patients with poorer baseline liver function status. This potential signal was consistent across all investigational TKI [30,33,35,39,40], mAb [29,39], and CPI regimens, with the exception of the CheckMate 040 evaluation of nivolumab, in which no difference was seen in discontinuation rates related to TRAEs, potentially owing to lack of power [28].

Conclusions
This systematic literature review was designed to identify and describe data trends in clinical trials of systemic anticancer therapies in patients with aHCC that reported outcomes stratified by patients' baseline liver function. The aim was to explore the potential trends in the data indicative of a role for baseline liver function assessments in predicting not only the disease course of patients with aHCC but also their response to therapy.
Twenty publications were identified that reported trial outcomes stratified by baseline liver function. All 20 publications reported efficacy outcomes; 13 also reported safety outcomes.
Comparisons of efficacy outcomes between subgroups defined by baseline liver function corroborated previous assertions of liver function as a prognostic factor for patients with HCC receiving systemic treatment [5,6]. In both the control and active treatment arms of the included trials, there were consistent trends for shorter survival in patients with greater liver dysfunction at baseline. The signal was stronger for OS than for PFS, which is not unexpected, given that OS estimates can be influenced by post-progression therapy and that survival may be extended by the use of effective post-progression treatment approaches in patients who maintain good liver function after initial systemic therapy. It is important to note that patients may have several reasons for a deterioration in liver function, including progression of underlying liver cirrhosis, treatment-related toxicity, and/or tumour progression. For an individual patient, the exact cause of their liver function deterioration may be multifaceted and challenging to establish. It should also be noted that the risk of death resulting from the natural history of cirrhosis is a potential confounder of PFS outcomes in HCC trials; time to progression may be a better surrogate endpoint for evaluating the benefits of effective drugs [1].
Of greater novelty in this review was the suggestion that liver function assessments may have modest predictive value with respect to systemic treatment response. In a number of the included trials of efficacious aHCC systemic therapies, there was separation of the HRs (for OS and PFS) for different baseline ALBI subgroups, favouring patients with more preserved liver function. However, the HRs consistently indicated treatment benefit (vs. respective comparators) across preserved liver function and liver dysfunction subgroups and across endpoints. These findings mirrored those for the overall populations and suggest that liver function has limited value as a predictive biomarker. Thus, overall, despite a possible signal for greater treatment benefit in patients with more preserved liver function, the data did not suggest substantial utility of liver function assessment for predicting differential treatment response; the direction of OS and PFS HRs (vs. comparator therapies) was consistent across the preserved liver function subgroups, the liver dysfunction subgroups, and the overall populations. There were limited data to inform a comparison of the relative discriminatory ability of the ALBI and Child-Pugh systems, but two publications provided a possible signal for ALBI being more sensitive than Child-Pugh in discerning the modest potential predictive utility of baseline liver function [25,33].
Despite a lack of consistency in AE reporting across the included trials (some reported any AEs, others TEAEs or TRAEs), among patients receiving active treatment, there was generally a trend for higher AE rates in those with greater baseline liver dysfunction. More pronounced than the signals in the AE data was the apparent relationship between severity of liver dysfunction and rates of treatment discontinuation among patients receiving active therapy. By contrast, data from the large, observational GIDEON study of sorafenib suggested that AE incidence was consistent across Child-Pugh subgroups and reported no substantial differences in discontinuations resulting from drug-related AEs in the real-world setting [42]. The prognostic utility of liver function assessments described is in general agreement with the existing literature and with the clinical understanding that patients with better liver function are often able to tolerate more lines of treatment and can experience prolonged survival. This finding, together with the novel suggestion of a possible modest signal for greater (and/or prolonged) benefit of systemic therapy in patients with more preserved liver function at the time of treatment initiation, underscores the clinical relevance of liver function assessments for informing clinical practice. These findings emphasize the importance of close monitoring and management of co-existing liver disease in patients receiving local and systemic treatment and of exploring the reason behind liver function deterioration (be it underlying cirrhosis, tumour progression, and/or treatment toxicity). Additionally, they highlight the value of trying to preserve (or improve) liver function, where feasible, and the relevance of considering liver function when deciding whether to transition patients from local to systemic therapy. An algorithm has recently been proposed that uses liver functional reserve as the starting point for a more systematic approach to decision-making around non-surgical HCC treatments [43].
Within the research context, this review confirms the utility of liver function as a stratification factor in clinical trials of patients with aHCC. Adequately powered analyses of liver-function-defined subgroups could help to refine the optimum assessment approach and allow corroboration (or not) of the potential predictive utility of liver function assessments reported here. In addition, the review suggests value in conducting dedicated safety and efficacy studies in patients with greater extents of hepatic dysfunction due to the potential impact of liver function on treatment response as well as prognosis.
A limitation of the review was its reliance on data from aHCC trials involving populations with relatively preserved liver function, a reflection of the convention for trials of investigational aHCC therapies to restrict recruitment to Child-Pugh A liver disease and to exclude patients with significant liver dysfunction. Regardless, the eligible trials did permit some degree of population stratification by extent of liver function, but the associated outcome analyses were frequently post hoc and generally involved underpowered subgroups. As a result, although only publications reporting data from phase 2 and 3 trials were eligible for inclusion, the majority of papers provided moderate-, low-, or very-low-grade evidence. These quality ratings reflect a number of limitations and areas of potential bias in the source data, such as small sample sizes, lack of randomization, the post hoc nature of many of the analyses, and/or the pooling of data across multiple trials involving populations with differing baseline characteristics. Indeed, all survival estimates had wide associated CIs, which were largely overlapping between subgroups. Some trials did not support a relationship between liver function and disease or treatment outcomes, which may be due, in part, to more favourable safety profiles of individual agents in patients with greater liver dysfunction and to potential differences in the extent of liver function chronicity and irreversibility between the populations of the included trials.
Another limitation of the review is the heterogeneity of the trials included (involving different interventions, control arms, and patient populations), as well as differences in liver function assessment approaches (both at the system and individual-clinician level). For these reasons, direct interpretation of the data should focus on within-trial analyses, and descriptive trends in the data should be considered as hypothesis generating only. No formal statistical analyses were undertaken or were possible.
In conclusion, this review corroborates liver function as a valuable prognostic variable in patients with aHCC and suggests additional, modest utility for predicting extent of benefit from systemic therapy for aHCC. Overall, the observations support the inclusion of liver function assessments with discriminatory potential among the a priori stratification factors used in future HCC trials.

Fig. 1. PRISMA diagram of included and excluded studies in the systematic literature review. PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses.

Table 1. Safety outcomes of patients receiving active treatment, stratified by liver function subgroups defined by ALBI grade

Table 2. Safety outcomes of patients receiving active treatment, stratified by liver function subgroups defined by Child-Pugh score

AESI, adverse event of special interest; ALBI, albumin-bilirubin grade; ALT, alanine aminotransferase; AST, aspartate aminotransferase; CP, Child-Pugh score; CPI, checkpoint inhibitor; GI, gastrointestinal; HFSR, hand-foot skin reaction; LFT, liver function test; mAb, monoclonal antibody; NCT, National Clinical Trial; NR, not reported; PPE, palmar-plantar erythrodysesthesia; TEAE, treatment-emergent adverse event; TKI, tyrosine kinase inhibitor; TRAE, treatment-related adverse event; UMIN, University hospital Medical Information Network.
* Retrospectively analysed data from patients in CELESTIAL whose cirrhosis evolved to Child-Pugh B by week 8 versus overall population (eligible patients had baseline CP A liver function).
† Post hoc exploratory analysis of key efficacy and safety outcomes in patients from REFLECT whose liver function had deteriorated to CP B versus those whose liver function remained CP A in the 8 weeks after randomization.
‡ Stratified results according to ALBI grade and by Child-Pugh score.