Characterizing Surgical and Radiotherapy Outcomes in Non-metastatic High-Risk Prostate Cancer: A Systematic Review and Meta-Analysis

Background: Identifying the optimal management of high-risk non-metastatic prostate cancer (PCa) is an important public health concern, given the large burden of this disease. We performed a meta-analysis of studies comparing PCa-specific mortality (CSM) among men diagnosed with high-risk non-metastatic PCa who were treated with primary radiotherapy (RT) or radical prostatectomy (RP).
Methods: Medline and Embase were searched for articles published between January 1, 2005, and February 11, 2020. After title and abstract screening, two authors independently reviewed full-text articles for inclusion. Data were abstracted, and a modified version of the Newcastle-Ottawa Scale, involving a comprehensive list of confounding variables, was used to assess the risk of bias.
Results: Fifteen studies involving 131,392 patients were included. No difference in adjusted CSM in RT relative to RP was shown (hazard ratio, 1.02 [95% confidence interval: 0.84, 1.25]). Increased CSM was found in a subgroup analysis comparing external beam radiation therapy (EBRT) with RP (1.35 [1.10, 1.68]), whereas EBRT combined with brachytherapy (BT) versus RP showed lower CSM (0.68 [0.48, 0.95]). All studies demonstrated a high risk of bias, as none fully adjusted for all confounding variables.
Conclusion: We found no difference in CSM between men diagnosed with non-metastatic high-risk PCa and treated with RP or RT; however, this is likely explained by increased CSM in men treated with EBRT and decreased CSM in men treated with EBRT + BT relative to RP. The high risk of bias in all studies identifies the need for better data collection and confounding control in PCa research.


Introduction
Prostate cancer (PCa) was the second most frequently diagnosed cancer and the fifth leading cause of cancer death worldwide as of 2018 [1]. High-risk PCa, defined by a clinical stage ≥ T3, Gleason score of 8-10, or prostate-specific antigen (PSA) > 20 ng/ml at the time of diagnosis [2], accounts for approximately one-quarter of all PCa diagnoses but is responsible for a disproportionately large share of PCa-specific mortality (CSM) [3]. Optimal selection and sequencing of therapy for high-risk non-metastatic PCa, such as the choice between radical prostatectomy (RP) and radical radiotherapy (RT), remain an area of intense academic and clinical debate [4]. Unfortunately, no randomized controlled trials (RCTs) on this topic have been completed due to the low patient and provider equipoise surrounding RP and RT, especially in North America [5,6]. As such, investigations comparing RP and RT outcomes have mostly been performed using non-randomized data. In the absence of RCTs, meta-analyses that summarize high-quality non-randomized data can inform treatment decisions for physicians and policymakers.
Previous meta-analyses comparing mortality outcomes between patients diagnosed with PCa and treated with RP or RT involved studies of older treatment approaches, which differ greatly from current standards of care [7]. Publications included in these meta-analyses have since been updated to include longer follow-up periods of more contemporary RT approaches such as dose-escalation protocols for external beam radiation therapy (EBRT), use of brachytherapy boost (BT), and adjuvant androgen deprivation therapy (ADT) [8][9][10][11], which may lead to better oncological outcomes for men diagnosed with high-risk non-metastatic PCa [12][13][14]. Although a more recent meta-analysis has been conducted [15], numerous errors were made, limiting the utility of the aggregated effect estimates for use in clinical practice. For instance, multiple effect estimates were generated from overlapping data [9][16][17][18][19][20][21][22], causing some patient data to exert undue influence on the aggregate effect estimates, and a study investigating low-risk PCa was included [10]. Moreover, the authors aggregated studies involving patients diagnosed with non-metastatic and nodal metastatic high-risk PCa [19], which have heterogeneous disease trajectories and ultimately call for different management approaches that are not comparable [23].
The objective of this study was to compare the relative rates of CSM and all-cause mortality (ACM) between men diagnosed with high-risk non-metastatic PCa and treated with RP or RT as their primary treatment modality.

Research question
The primary and secondary objectives of the study were to summarize the relative CSM and ACM, respectively, of patients diagnosed with non-metastatic high-risk PCa treated primarily with either RP or RT.

Protocol and search strategy
The systematic review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [24]. The review protocol has been registered in the International Prospective Register of Systematic Reviews (PROSPERO) (registration number: CRD42020150710). The search strategy is provided in Appendix 1. Studies were included in our analysis if they were published between January 1, 2005, and February 11, 2020, to limit attention to analyses of more contemporary treatment periods. Only full-text articles published in English in a peer-reviewed journal were considered.
We included only cohort studies in our review since case-control studies typically cannot provide hazard ratios. Furthermore, previous RCTs were excluded because they included too few men diagnosed with non-metastatic high-risk PCa to support valid inferences [25]. Editorials, letters to the editor, commentaries, guidelines, and review articles were also excluded.
We included studies that reported on men of any age diagnosed with non-metastatic high-risk PCa, according to the National Comprehensive Cancer Network (clinical stage ≥ T3, Gleason score of 8-10, or prostate-specific antigen > 20 ng/ml) [2] or D'Amico criteria (clinical stage ≥ T2c, Gleason score 8-10, or prostate-specific antigen > 20 ng/ml) who were treated with either primary RP or RT [26]. All common forms of RP (e.g., open retropubic, laparoscopic, and robotic) and RT (e.g., conformal external beam, intensity-modulated, brachytherapy, or combination of radiotherapy modalities with curative intent) were considered. Studies assessing adjuvant or salvage therapies as the primary objective were excluded. We included only studies that provided a hazard ratio for CSM or ACM that was adjusted for confounding. Studies reporting on surrogate outcome measures such as biochemical progression were excluded since the definitions of these outcomes differ between RP and RT.

Article review
In the first phase of the project, DG reviewed titles and abstracts to discard non-relevant and duplicate citations. In the second phase, DG and HC independently reviewed the full texts of the remaining studies to determine eligibility for inclusion based on pre-determined criteria, and GBR settled discrepancies on the inclusion or exclusion of individual records. When more than one publication used the same patient population, the most relevant, updated, and complete publication was selected. The study flow is outlined in Figure 1.

Data extraction and risk of bias assessment
A data extraction form was completed for each study as outlined in Appendix 2. We modified the Newcastle-Ottawa Scale to include a comprehensive list of items identifying confounding variables (see Appendix 3). Confounding variables included those relating to tumor characteristics (baseline PSA, Gleason score, and clinical stage), age, comorbidity status, year of diagnosis or treatment, study center (if multiple), and at least one demographic characteristic (e.g., education, income, rural or urban residence). This list was reviewed and approved by both a radiation oncologist (GR) and a uro-oncologist (JC).

Publication bias
We assessed publication bias using funnel plots and the Egger test. Hazard ratios from included studies were plotted as a function of their standard error in relation to the aggregate effect estimate generated through random-effects models. Residual values were also estimated using mixed-effects models to account for heterogeneity due to moderator variables (RT approach for CSM and ACM, and age for ACM) in order to improve interpretation of funnel plots for the assessment of publication bias.
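As a rough illustration of the idea behind the Egger test, the sketch below implements the classical Egger regression, in which each study's standardized effect (log hazard ratio divided by its standard error) is regressed on its precision (1 / standard error); an intercept far from zero suggests funnel-plot asymmetry. This is a minimal sketch under stated assumptions, not the authors' analysis: the exact variant of the test they ran is not stated, the function name `egger_regression` is hypothetical, and the input values are invented for illustration.

```python
import math


def egger_regression(log_effects, std_errors):
    """Classical Egger regression (illustrative sketch).

    Regresses standardized effects (effect / SE) on precision (1 / SE) by
    ordinary least squares and returns (intercept, intercept SE, t statistic).
    """
    if len(log_effects) != len(std_errors) or len(log_effects) < 3:
        raise ValueError("need at least three studies with matching inputs")

    z = [y / s for y, s in zip(log_effects, std_errors)]  # standardized effects
    x = [1.0 / s for s in std_errors]                      # precisions
    n = len(z)

    mean_x = sum(x) / n
    mean_z = sum(z) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all standard errors are identical")
    sxz = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z))

    slope = sxz / sxx
    intercept = mean_z - slope * mean_x

    # Residual variance with n - 2 degrees of freedom.
    residuals = [zi - (intercept + slope * xi) for xi, zi in zip(x, z)]
    s2 = sum(r ** 2 for r in residuals) / (n - 2)
    se_intercept = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))

    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t_stat


if __name__ == "__main__":
    # Invented log hazard ratios and standard errors, for illustration only.
    log_hrs = [0.30, 0.10, -0.05, 0.22, 0.02]
    ses = [0.25, 0.12, 0.08, 0.20, 0.10]
    b0, se_b0, t = egger_regression(log_hrs, ses)
    print(f"Egger intercept = {b0:.3f} (SE {se_b0:.3f}), t = {t:.2f}")
```

The t statistic would be compared against a t distribution with n - 2 degrees of freedom to obtain a p-value; that step is omitted here to keep the sketch free of external dependencies.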

Assessment of heterogeneity
The Q-test was performed to identify significant heterogeneity in treatment effect estimates, using the DerSimonian-Laird method, and quantified through the I² statistic [27].
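For reference, the standard textbook definitions of these quantities can be written as follows. The notation is an assumption introduced here rather than taken from the study: $y_i$ is the log hazard ratio of study $i$, $s_i$ its standard error, and $k$ the number of studies.

```latex
\begin{align}
  w_i &= \frac{1}{s_i^2}, \qquad
  \hat{\theta}_F = \frac{\sum_{i=1}^{k} w_i y_i}{\sum_{i=1}^{k} w_i}, \\
  Q &= \sum_{i=1}^{k} w_i \left( y_i - \hat{\theta}_F \right)^2, \\
  \hat{\tau}^2_{DL} &= \max\!\left( 0,\;
      \frac{Q - (k - 1)}{\sum_{i} w_i - \sum_{i} w_i^2 \big/ \sum_{i} w_i} \right), \\
  I^2 &= \max\!\left( 0,\; \frac{Q - (k - 1)}{Q} \right) \times 100\%.
\end{align}
```

Under the null hypothesis of homogeneity, $Q$ approximately follows a $\chi^2$ distribution with $k - 1$ degrees of freedom, and $I^2$ expresses the share of total variation in effect estimates attributable to between-study heterogeneity rather than chance.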

Statistical analysis
General study information, PCa treatment and endpoint information, and methodological information were categorized into tables using frequency or proportions for categorical variables, medians or means for continuous variables, and descriptive terms for other variables where appropriate.
The meta-analysis was performed in R statistical software (x64, version 3.3.2; R Foundation for Statistical Computing, Vienna, Austria) with the "metafor" package (version 1.9-9) [28]. The primary meta-analysis comparing CSM between RP and RT was carried out using inverse variance-weighted random-effects models. We then performed a series of univariable meta-regressions to explore sources of heterogeneity. Input variables included treatment era (examined as a binary variable with values of 1 and 0 for values above and below the median year of diagnosis, respectively), approach to RT (EBRT with or without brachytherapy boost), length of follow-up (examined as a binary variable with values of 1 and 0 for values above and below the median, respectively), geographical location (the United States versus other), and age (examined as a binary variable with values of 1 and 0 for values above and below the median, respectively). Insufficient data were available to explore the effect of RT dose, RP approach (i.e., open, laparoscopic, robotic), the proportion receiving systemic therapy (i.e., ADT, chemotherapy, and adjuvant RT), and type of EBRT (i.e., 3D conformal, IMRT, etc.). All statistical tests were two-sided, with p < 0.05 considered statistically significant.
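To make the pooling step concrete, the following is a minimal Python sketch of inverse variance-weighted random-effects pooling of hazard ratios. It is not the authors' R code: it assumes the DerSimonian-Laird estimator of between-study variance (the exact metafor settings are not stated), recovers each standard error from the reported 95% confidence interval on the log scale, and uses invented study values. The function names `log_hr_and_se` and `random_effects_pool` are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_hr_and_se(hr, ci_low, ci_high):
    """Convert a hazard ratio and its 95% CI to (log HR, standard error)."""
    return math.log(hr), (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)


def random_effects_pool(studies):
    """Pool (hr, ci_low, ci_high) tuples with a DerSimonian-Laird model."""
    ys, ses = zip(*(log_hr_and_se(*s) for s in studies))
    k = len(ys)

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / se ** 2 for se in ses]
    y_fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)

    # Cochran's Q and the DerSimonian-Laird between-study variance.
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, ys))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each study's variance.
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    y_re = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)
    se_re = math.sqrt(1.0 / sum(w_re))

    return {
        "hr": math.exp(y_re),
        "ci_low": math.exp(y_re - Z_95 * se_re),
        "ci_high": math.exp(y_re + Z_95 * se_re),
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }


if __name__ == "__main__":
    # Invented adjusted hazard ratios (RT vs. RP), for illustration only.
    example = [(1.40, 1.05, 1.87), (0.85, 0.60, 1.20), (1.10, 0.95, 1.27)]
    result = random_effects_pool(example)
    print(f"Pooled HR = {result['hr']:.2f} "
          f"[{result['ci_low']:.2f}, {result['ci_high']:.2f}], "
          f"I2 = {result['i2']:.0f}%")
```

A subgroup analysis of the kind described above could then be approximated by calling `random_effects_pool` separately on the studies in each category of a binary moderator, for example those above and below the median year of diagnosis.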

Results
Fifteen studies involving 131,932 total patients were identified for inclusion. The article selection flowchart is outlined in Figure 1. Table 1 shows the characteristics of individual studies. Four studies compared treatment groups from a single institution, another four studies compared groups from different institutions, another five studies used national registries to compare treatment groups, and two studies made comparisons across multiple institutions. Patient characteristics varied across studies due to variations in inclusion and exclusion criteria. In general, RT patients were older, had a greater number of comorbidities, and had poorer prognostic characteristics. Median follow-up varied substantially between studies and treatment groups. Treatment details were scarcely reported for the RP group, while details regarding RT dose, the proportion receiving ADT, and whether EBRT was performed in conjunction with BT were provided in most studies.

The overall risk of bias was high for all studies (Table 2) as none adjusted for all potential confounders. Most studies had a low risk of bias for the 'selection' section other than those comparing the treatment groups from tertiary centers. The 'comparability' section varied due to variation in covariate control. All studies controlled for age; most studies provided adequate control for tumor characteristics (i.e., PSA, clinical stage, and Gleason score) (14/15), while fewer studies controlled for comorbidities (8/15), demographic characteristics (5/15), and study center (8/15). Finally, most studies did not have a sufficient median follow-up, leading to a score of 2/3 for the 'outcome' section for 13/15 studies. There was no indication of publication bias. The Egger test for publication bias was not statistically significant (p = 0.21 for CSM and 0.88 for ACM; Figure 2).

Prostate cancer-specific mortality
Ten studies with 88,026 patients were included in the primary meta-analysis for CSM. The resulting adjusted hazard ratio [95% confidence interval] was 1.02 [0.84, 1.25] with substantial heterogeneity (I² = 69%) as shown in Figure 3A. Subgroup analysis revealed a significant effect by the RT approach (p < 0.0001; Table 3): CSM was increased for EBRT relative to RP (1.35 [1.10, 1.68]) and decreased for EBRT combined with BT relative to RP (0.68 [0.48, 0.95]). This was also associated with decreased, though still substantial, heterogeneity (I² = 59% and 47%, respectively). The remaining subgroup analyses did not differ notably from the primary analysis.

All-cause mortality
Eight studies with 116,975 patients were included in the secondary meta-analysis for ACM. The resulting adjusted HR [95%CI] was 1.23 [0.93, 1.61] with substantial heterogeneity (I² = 94%) as shown in Figure 3B. Subgroup analysis revealed a significant effect by the RT approach (p = 0.02; Table 3). Both subgroup analyses were associated with substantial heterogeneity (I² = 90% and 89%, respectively). Subgroup analysis by median age also revealed a significant effect (p < 0.0001).

Discussion
Our aggregate effect estimates for adjusted CSM showed no statistically significant differences between RP and RT for high-risk non-metastatic PCa patients. Subgroup analysis revealed a significantly increased incidence of CSM among men treated with EBRT ± ADT relative to the RP group and a decreased incidence of CSM among men treated with EBRT + BT ± ADT relative to the RP group. This is consistent with the results of the ASCENDE-RT trial (androgen suppression combined with elective nodal and dose-escalated radiation therapy), in which an increased incidence of biochemical failure was found among men diagnosed with intermediate- and high-risk non-metastatic PCa and treated with dose-escalation RT protocols using EBRT alone compared with those using combination EBRT + BT (HR [95%CI]: 2.04 [1.25, 3.33]) [13]. However, biochemical failure is not an accepted surrogate endpoint, and CSM was not significantly different between these groups. The remaining subgroup analyses did not differ from the primary analysis.
Multiple reports indicate that since the early 2000s, the use of BT boost in high-risk patients has declined in the United States [40] and other geographic regions [41]. Interestingly, however, the use of prostate BT boost has increased over the same period in certain European centers and Canada [42,43]. This discrepancy may be attributable to differences in resident exposure and training opportunities, given the steep learning curve associated with administering BT [44][45][46], and to unfavorable reimbursement relative to EBRT in the United States compared with publicly funded healthcare systems [42,47]. Given the CSM benefit associated with BT boost among high-risk patients reported in RCTs and estimated here, we encourage investment in overcoming these obstacles by increasing resident exposure and improving reimbursement models to encourage the use of BT boost.
The HR comparing the relative incidence of CSM between the EBRT ± ADT and RP groups was smaller than that reported in a previous meta-analysis [48]. This difference might be explained by more recent changes in treatment approaches, including the increasing use of dose-escalation protocols and adjuvant ADT paired with RT [40,41], both of which have demonstrated improvements in oncological outcomes, though only the addition of neoadjuvant ADT to RT has demonstrated improvements in CSM [7,12,41].
The analysis of relative ACM between RT and RP also revealed no significant difference between the treatment groups. However, subgroup analysis revealed a significantly increased incidence of ACM in the EBRT ± ADT group relative to the RP group, while there was a non-significant decrease in ACM in the EBRT + BT ± ADT group relative to the RP group. In addition to the CSM benefit afforded by RP and EBRT + BT ± ADT relative to EBRT ± ADT, differences in the cardiopulmonary health required to undergo the general anesthesia needed for RP and BT, together with the lack of control for comorbidities in many of the included studies, might contribute to the observed differences. Studies conducted among younger age groups demonstrated an increased incidence of ACM in the RT relative to the RP group. Finally, a tendency toward increased incidence of ACM in the RT relative to the RP group was also noted among studies conducted only in the United States. However, this is likely explained by the greater proportion of comparisons with RP involving EBRT ± ADT instead of EBRT + BT ± ADT among studies performed in the United States versus other geographic locations.
Overall, the risk of bias was deemed high for all studies due to the partial control of confounding variables. This stands in contrast with a previous meta-analysis by Wallis et al., who found a low to moderate risk of bias for all studies included in their comparison of ACM and CSM between patients who underwent RT and RP. Interestingly, four studies used in both analyses were rated by Wallis et al. as having perfect comparability between RT and RP groups, yet some of these studies did not control for study center [37][38][39], year of diagnosis [35,37,38], or demographic characteristics [38]. Since patients undergoing RT are more likely to be older, have poorer prognostic characteristics, and have sociodemographic characteristics that are associated with poorer CSM and ACM [11,20,29], we anticipate that these unaccounted-for biases lead to overestimation of CSM and ACM in the RT group relative to the RP group. However, the discrepancy in such baseline characteristics appears more prominent among those undergoing EBRT ± ADT than among those undergoing EBRT + BT ± ADT, who are more similar to patients undergoing RP [11,20]. As such, collecting information on these variables and properly controlling for them are crucial when estimating relative treatment effects between groups to more accurately inform treatment decisions.
Our study has certain limitations. There was a high level of heterogeneity in effect estimates. Although this was substantially reduced through subgroup analyses comparing RP with EBRT ± ADT and EBRT + BT ± ADT, and among comparisons involving younger populations, heterogeneity remained high and was not accounted for by additional subgroup analyses. Unfortunately, information surrounding treatment details such as RT dose, type of EBRT (i.e., 3D conformal, IMRT, etc.), use of adjunct therapies, and surgeon experience, which might account for a large proportion of this heterogeneity, was missing in many of the studies.
The aggregated effect estimates provided in this study can be used to inform clinical decisions in conjunction with evidence surrounding quality of life outcomes. Given the relatively small difference in CSM between treatment approaches, other factors such as patient preferences, patient health (i.e., comorbidities), and treatment factors (e.g., operative risk and prostate volume for BT) should be considered when forming treatment decisions. This should occur through a shared decision-making process involving the patient and the treating urologists and radiation oncologists to optimize satisfaction with patient outcomes.

Conclusions
We identified no significant difference in the relative rate of CSM between patients diagnosed with high-risk non-metastatic PCa and treated with RP or RT. However, there was a significant subgroup effect with the use of EBRT + BT ± ADT, highlighting the necessity of differentiating RT with or without BT in future comparative effectiveness studies. The high risk of bias in all studies reviewed emphasizes the need for better control of all potentially confounding variables to provide higher quality non-randomized evidence. This is especially important because RCTs are unlikely to be feasible in this patient population.

Appendix 3: Modified Newcastle-Ottawa scale for risk of bias assessment
Items having the potential to bias the relationship between treatment modality (i.e., radical prostatectomy (RP) or radiation therapy (RT)) and outcomes of interest (i.e., cancer-specific or overall survival).

Selection
1. Representativeness of the exposed cohort
a. 1 point for data representing the general population (i.e., in terms of socioeconomic and demographic characteristics)
b. 0 points if data is not representative or indicated (e.g., selected group of users like nurses, volunteers, insured, safety-net hospitals, secondary data from other clinical populations, etc.)
2. Representativeness of the non-exposed cohort
a. 1 point if drawn from the same community as the [...]
a. 1 point if no subjects lost to follow-up or those lost are unlikely to introduce bias (i.e., number lost ≤ 20% or description of those lost suggested no different from those followed)
b. 0 points if follow-up rate < 80% and no description of those lost or if no statement was made
8. Was follow-up long enough for outcomes to occur?
a. 1 point if median follow-up was ≥10 years, as 10-year cancer-specific survival is estimated to be 88% in patients diagnosed with high-risk PCa undergoing multimodal treatment [3].
Thresholds for converting to low, moderate, and high risks of bias:
This scoring system is adapted from the Newcastle-Ottawa Scale. We gave more weight to Item 5 as these confounding variables have demonstrated a substantial impact on comparisons of overall and cause-specific mortality between RP and RT in prostate cancer research [4].