Dose-Volume Predictors of Radiation Pneumonitis After Lung Stereotactic Body Radiation Therapy (SBRT): Implications for Practice and Trial Design

Background and purpose Recently published HyTEC report summarized lung toxicity data and proposed guidelines of mean lung dose (MLD) <8 Gy and normal lung receiving at least 20 Gy, V20Gy<10-15% to avoid lung toxicity. Support for preferred use of a particular dosimetric parameter has been limited. We performed a detailed dose-volume analysis of data on radiation pneumonitis (RP) following lung stereotactic body radiation therapy (SBRT) to search for parameters showing the strongest correlation with RP. Materials and methods Two patient cohorts (primary and metastatic lung tumor patients) from previously reported studies were analyzed. Total number of patients was 96, and incidence of grade ≥2 RP was 13.5% (13/96). Fitting to the logistic function was performed to investigate correlation between incidence of RP and reported dosimetric and volumetric parameters. Another independent cohort was used to explore correlation between dosimetric parameters. Results Among normal lung parameters (MLD and reported Vx), only MLD consistently showed significant correlation with incidence of RP. Gross tumor volume (GTV), internal target volume, planning target volume (PTV), and minimum dose covering 95% of GTV or PTV did not show statistical significance. A significant correlation between reported Vx and MLD was observed in all cohorts. Conclusions In considering tumor- and target-specific (e.g., GTV, PTV) and normal lung-specific (e.g., MLD, Vx) metrics, MLD was the only parameter that consistently correlated with incidence of RP across both cohorts. Because SBRT planning constraints allow small normal lung volumes to receive high doses, utility of MLD is not obvious. The parallel structure of lung is one possible explanation, but correlation between dosimetric parameters obscures elucidation of the preferred or mechanistically based parameter to guide radiotherapy planning.


Introduction
It has been 10 years since the QUANTEC (Quantitative Analyses of Normal Tissue Effects in the Clinic) reports have been published [1]. This effort has been followed by forming two more "TEC" working groups, HyTEC (Hypofractionated/High Dose per fraction), which is near completion with a number of papers published [2][3][4], and PENTEC (Pediatric Normal Tissue Effects in the Clinic), which is currently in progress [5]. Each working group has been assigned a similar set of tasks: to summarize available literature, to model normal tissue complication probability (NTCP) based upon pooled published data, to provide dosimetric recommendations, and to inform future studies. One of the main challenges has been to find and justify preferred metrics associated with outcomes. This task is particularly challenging not only because the published literature is inconsistent with respect to the scales used to grade toxicities as well as dosimetric parameters analyzed, but also because patient-specific data, which would allow for pooling more granular data, are rarely provided [2]. Whenever possible, hypothesis-generating studies to search for preferred metrics are needed to justify radiotherapy planning guidelines.
Stereotactic body radiation therapy (SBRT) with ablative doses has become the standard of care for patients with medically inoperable early-stage non-small cell lung cancer (NSCLC) [6]. Furthermore, the indications for SBRT have been expanded as an option for patients with early-stage, operable NSCLC [7,8] and lung metastases [9]. Despite advances in SBRT planning and delivery, achieving the desired prescription dose must be balanced against the risk of complications, with lung toxicity being the major limiting factor [2]. Radiation pneumonitis (RP) has been reported in numerous lung SBRT studies [10], with incidence of grade 2 or higher toxicity varying from approximately 10% [11,12] to as high as 30% [13,14].
Robust dose-volume predictors of RP are critical in guiding clinical practice and informing lung SBRT clinical trials. Various dosimetric parameters have been tested for correlation with RP, in particular mean lung dose (MLD) [15], or dose-volume cut-offs, Vx (volume of normal lung receiving at least dose 'x' Gy) [16]. Commonly, statistical analyses presented in the literature only test if the values of the parameter of choice were different among patients with RP vs without RP, or if a dichotomy can be achieved based on a cut-off value of the parameter of choice. While these observations provide useful data to help guide treatment planning, they do not describe the actual normal tissue dose-volume response. To date, only a few publications have reported correlation, or lack of, between the incidence of pulmonary toxicity and lung SBRT dose-volume parameters, e.g., MLD and V20 [10,17,18]. Numerous studies with a goal to explore incidence of RP as a function of various dose metrics have been published for conventional fractionation [19].
Use of Vx cut-offs is appealing because this metric can be readily taken from the dose-volume histogram (DVH). However, as was pointed out in the QUANTEC report on lung toxicity, Vx cut-offs may not be universal and the most predictive values of x may depend on the treatment technique [19]. Also, because of inter-correlation among Vx parameters [20], statistics do not allow to determine the preferred single predictor which connects with the etiology of toxicity in a meaningful manner. Parameters strongly correlated with such a predictor also become predictors. Hypothesis-generating studies are a pre-requisite to select the preferred metric to guide radiotherapy planning. In this paper, based on the authors' experience with the HyTEC effort, we present detailed dose-volume analyses of data on RP following lung SBRT from two discrete patient cohorts. We explored the incidence of RP as a function of various parameters, specifically normal lung dose-volume metrics and tumor-related metrics. In addition, we tested for correlation between such predictors in the two cohorts and an independently obtained dataset.

Patients
Two patient cohorts from previously reported studies were used, referenced below as cohorts A [21] and B [22]. These studies were selected because they provided detailed patient-specific data in a tabulated form. In brief, cohort A consisted of 66 (44 male, 22 female) patients with 71 primary (51) or metastatic (20) lung tumors, with planning target volume (PTV) ranging from 9.0 to 100.8 cc, who were treated with SBRT with prescribed doses of 40-60 Gy in 5-10 fractions over 5-12 days. Median age was 80 years (58-88 years). Prescription policies were based on the proportion of the prescribed dose covering 80% of the PTV; the median prescribed dose was 50 Gy in five fractions. Dose calculation was performed using pencil beam convolution with the Batho method used to account for tissue inhomogeneities. The authors analyzed clinical factors, specifically pulmonary emphysema, tumor location and subclinical interstitial lung disease (ILD) for association with RP (see discussed below). Cohort B consisted of 25 patients (21 male, 4 female) with primary (16), recurrent (2) or metastatic (7) lung tumors, PTV ranging from 7.5 to 239.4 cm 3 , who were treated with SBRT using a prescribed dose of 48 Gy in 4-8 fractions in 6-7 fields over 5-8 days. Median age was 77 years (73.8 ± years). Four patients presented with emphysema, three with ILD and one with chronic obstructive pulmonary disease (COPD). Planning goal was that 95% of PTV receives at least 90% of the prescribed dose. Collapsed cone convolution method was used for dose calculation. Detailed patient data have been extracted for further analysis. The normal lung parameters tabulated for cohort A were the % volumes receiving at least 5, 10, 20 and 30 Gy (V 5 ,V 10 , V 20 , V 30 ) and the MLD and for cohort B, the % volumes receiving at least 20, 40 and 45 Gy (V 20 , V 40 , V 45 ), absolute volume in cc receiving at least 24 and 48 Gy (V cc24 and V cc48 ), and MLD. Whenever possible, subsets of patients were selected to test for significance of a parameter.
To examine potential correlations between dose-volume parameters in an independent data set, 33 consecutive patients treated with lung SBRT at the University of California San Diego (UCSD) in 2017-2018 were selected. Planning data were extracted from the Eclipse treatment planning system (Varian, Palo Alto, CA). PTV prescription dose was 40-56 Gy in 3-5 fractions. Volumetric-modulated arc therapy (VMAT) with 2-3 coplanar arcs was used, with a planning goal that at least 95% of the planning target volume (PTV) receives 100% of the prescription dose. Analytical anisotropic algorithm, accounting for tissue inhomogeneities, was used to calculate the dose. Gated treatment was used for 12 patients. In planning 4DCT data were sorted into 10 phases, 0-90 with an increment of 10, where 0 corresponds to inhale, and 50 to exhale. Phases 30-70 or 40-60 centered around exhalation were used to minimize effect of tumor motion. The remaining 21 patients received Gate 100, which means all phases from inhale to exhale are used in planning and dose delivery, and patient is monitored during treatment for breathing irregularities, for example, deep inhale or cough. Normal lung volume was defined as bilateral lung minus internal target volume (ITV). MLD, V 5 , V 13 and V 20 -parameters commonly used in toxicity studies following SBRT [2] -were extracted. Institutional ethics board approval was obtained to extract, analyze and present the data.

Dose-volume response analysis
Reported dose-volume parameters for normal lung and tumor-related parameters were analyzed for correlation with incidence of RP grade 2 or higher (grade 2+) and grade 3 or higher (grade 3+) using the Common Terminology Criteria for Adverse Events v5.0 (CTCAE). Specifically, we analyzed MLD and reported V x for normal lung dose-volume parameters as well as gross tumor volume (GTV), ITV, and PTV for tumorrelated parameters. For each parameter, fitting to the logistic function was performed; for example, if V 10 was explored, the function was: where γ 50 is the normalized slope and V 10,50 is the value of V10 at which 50% of patients develop RP.
Maximum likelihood method was used to obtain model parameter values, γ 50 and V 10,50 . Likelihood-profile method was used to calculate 95% confidence intervals (CI) [23]. A likelihood ratio test was used to establish if the observed dependence significantly (p < 0.05) improves on the null hypothesis assuming no dependence. To calculate likelihood value for the null hypothesis incidence was set to the value averaged over the cohort. Spearman test was used to check for correlation between dose-volume parameters. All tests were two-tailed and p < 0.05 was considered statistically significant.

Results
Total number of patients in the study was 96, 13 of whom developed RP (13.5%). This sample size and toxicity rate was similar to those published in the studies that were summarized in the HyTEC report (see supplemental table in the HyTEC report [2]). Wilcoxon rank sum test showed that MLD was significantly different, p = 0.002, in patients with vs without grade 2+ toxicity, thereby justify further analysis for doseresponse. Table 1 shows the results of fitting the dose-volume data to the logistic function for both grade 2+ and 3+ toxicity in cohorts A and B. None of the tumor or target volume characteristics (GTV, ITV, PTV) showed statistical significance. Among normal lung parameters (MLD and reported V x ), only MLD consistently showed significant correlation with incidence of RP for both cohorts ( Figure 1). Specifically, MLD was statistically significant for both grade 2+ and grade 3+ in cohort A, as well as grade 2+ in cohort B. However, this significance was only marginal for grade 3+ in cohort B, p = 0.058.   Based on the available dose-volume data, values of V x were tested for correlation with incidence of toxicity.
Only V 5 and V 10 in cohort A resulted in a statistically significant V x -response dependence (  MLD data from cohorts A and B were combined and statistically significant dose-response was obtained for both grade 2+ and grade 3+ toxicity. Notably, when fitting was performed separately for cohorts A and B, the values of D 50 were very close, albeit with broad CI ( Table 1). Figure 3 shows dose-response curves and combined data. Model parameters are shown in the bottom row of Table 1. Gy where a 0.5 Gy size was used. Table 2 shows the correlations between the quantity in the "Parameter" column with normal lung MLD and V 20 ; the latter has been reported most broadly in the literature [10]. Spearman's rank-order correlation coefficients showed that both V 5 and V 10 strongly correlate with MLD in cohort A, Rs = 0.936 and 0.966, respectively. Overall, there was statistically significant correlation between all Vx as a percent volume and MLD, as well as between V x and V 20 . Correlation was weaker for tumor-specific parameters and insignificant for PTV D 95 , cohort B. The correlations were also tested in the UCSD independent validation cohort. Figure 4 shows strong correlation between V 5 , V 13 ,V 20 and MLD for the UCSD cohort with large and significant Spearman's rank-order correlation coefficients, (Rs ≥ 0.931, p < 0.001). Effect of possible confounders, specifically pulmonary emphysema, tumor location (upper/middle vs lower) and ILD was studied in the original report for the cohort A [21]. ILD was the only factor significantly associated with RP. In the cohort B [22] one patient had COPD, four had emphysema, tumor location was peripheral in 21 of 25 patients and peripheral/central in remaining four. This is insufficient to perform statistical analyses to assess effect of possible confounders.

Cohort A [21] Cohort B [22]
Parameter Incidence of radiation-induced lung toxicity as a function of MLD was separately assessed for patients with and without underlying interstitial lung disease (ILD

Discussion
Developing evidence-based dose-volume metrics to generate guidelines for normal tissue sparing is a challenging task assigned to working groups, such as QUANTEC [1], HyTEC [2,3] or PENTEC [5]. Selected metrics have to be justified on statistical and mechanistic merits and hypothesis-generating studies taking advantage of detailed patient-specific data provide this justification. Currently, both Vx and MLD are commonly used dosimetric correlates for lung toxicity following SBRT [2,19]. Supplementary Table 1 in the HyTEC report on lung toxicity following SBRT [2] shows 16 publications where lung toxicity correlated with either V 5 , V 10 , V 13 , V 20 or MLD, with V 20 and MLD being the most popular choices attempted in 15 publications.
In considering tumor-and target-specific (e.g., GTV, PTV) and normal lung-specific (e.g., MLD, V x ) metrics, our results showed that MLD was the only parameter that consistently correlated with incidence of RP across both cohorts A and B, as well as in the combined cohort. Proponents for MLD argue that regional lung function is compromised at any dose, with the underlying assumption that local injury is linearly dependent on dose [24]. Local dose-response for perfusion loss has been demonstrated in SBRT patients [25]. Studies of changes in regional lung perfusion as a function of conventional fractionated doses have shown that this dependence may follow a sigmoid relationship, with no loss of perfusion below 10-15 Gy and a plateau at about 55 Gy [24]. Although this deviates from the linear dependence theory, the authors deemed MLD as a reasonable surrogate for overall response parameter, which is a sum of regional injuries. This implies mechanistically-based use of MLD in SBRT and emphasizes the need to understand the local lung function (ventilation and perfusion) before SBRT. In theory, if there is a threshold dose beyond which function is 100% compromised, it will be lower for SBRT because using Radiation Biologically Effective Dose (BED) Calculator, 60Gy/30 has BED(3) of 100Gy, which is the same BED(3) of 31.9Gy/5, assuming α/β = 3Gy.
Those who advocate for Vx argue that lung regions are incapacitated after receiving above a certain threshold dose and that toxicity is avoided if a minimum total lung volume is undamaged. Unfortunately, data for conventional fractionation argues against the existence of a single, or even narrowly distributed threshold. Indeed, the QUANTEC report showed a broad range in threshold doses (V 5 to V 40 ) correlating to incidence of RP while also noting two important findings: 1) different V x parameters correlate with one another and MLD; 2) use of V x cut-offs has to be noticed as they may not be transferrable between radiotherapy techniques [19]. Supporting this first finding, we also saw significant correlation between V 5 , V 13 ,V 20 and MLD, p < 0.001, in our independent cohort. While this makes it impossible based on statistical analysis alone to single out the best predictor, it also means that, at least in a single institution setting, using MLD is as good as any other metric. Advanced statistical methods, for example Lasso or elastic nets regularized regression allow us to address the problem of co-linearity. This complex problem which requires large sets of detailed data has been gaining attention in the literature.
Robustness of MLD may be questioned because dose alone is insufficient to describe biological effect and accounting for dose-per-fraction may be needed, in particular for normal tissues. Notably, in this study physical dose without corrections for dose-per-fraction such as biologically equivalent dose in 2 Gy fractions (EQD2), was used for both the dose in V x and MLD. MLD was the strongest predictor despite a broad range in the number of fractions in the two cohorts. This provides the simplest way to guide treatment planning but also raises a question as to whether this is also a consequence of statistical correlation. Specifically, if MLD defined as physical dose and mean dose calculated following conversion into EQD2 are correlated, finding preferred metric will be challenging. In this study we were restricted to what was reported. Conversion of MLD to mean EQD2 is non-linear and cannot be performed without full DVH data. This is a limitation in our study and conversion to common fractionation, for example EQD2 is encouraged.
Variations of the volume cut-off approach, specifically absolute volume spared, have been put forward [26] and these may provide connection with a mechanistic nature of development of RP. For example, lung dosethresholds used in the RTOG 0915 trial required that less than 1500 and 1000 cc of total lung (minus PTV) receive a specified dose, to minimize risks of pulmonary function decline and pneumonitis, respectively. Similar thresholds were included in the American Association of Physicists in Medicine Task Group 101 report, which they describe as "minimum critical volume below threshold" [26]. Spared volume style constraints have been used for other parallel tissues, such as liver [3], though further investigation of their validity is needed.
Considering biological mechanisms, published studies propose various pro-inflammatory cytokines that contribute to the acute inflammation and late fibrosis seen in radiation pneumonitis. This inflammatory cascade has several well-recognized triggers including DNA double-strand breaks, endothelial cell damage, free radical damage, upregulation of leukocyte adhesion molecules and platelet derived growth factor (PDGF) isoforms, activation of type II pneumocytes, and hypoxia [27,28]. The existence of several cascade triggers supports the theory that inflammation follows a linearly dose dependent response rather than the incapacitation hypothesis underlying the V x approach. On this basis, use of MLD is supported as it fundamentally parallels the biological inflammatory dose-response.
Decline in pulmonary function after SBRT is a potential risk, though the extent to which this occurs is unclear. As some studies suggest no or minimal decline in pulmonary function, and tumor progression can be life threatening, baseline pulmonary functional status may not be a major factor in decision making with respect to SBRT dose-fractionation. With some studies suggesting a progressive decline in diffusion capacity, risk of pulmonary function decline should be discussed with the patient, though such changes can occur as part of the natural course of chronic obstructive pulmonary disease (COPD). In patients without COPD, diffusion capacity could potentially improve after SBRT. In contrast, patients with interstitial lung disease, regardless of measured pulmonary function, are at high risk of life-threatening toxicity after SBRT [2], and these risks must be weighed against potential benefits. Deferring treatment, or use of more stringent lung dosimetry should be considered.
Patient-and disease-specific parameters, for example age, sex, tumor location, performance status, may also play a role in development of RP [2]. Effect of age, sex, tumor location and performance status were investigated for correlation with RP in one of the source publications [21]. None of these factors were statistically significant with respect to predicting RP risks. Notably, another study [29] demonstrated that the incidence of RP can be minimized if patients are screened for radiographic evidence of ILD [15,22] and high values of biological markers (KL-6 and SP-D) before treatment. Increased treatment-related toxicity in patients with co-existing interstitial lung disease who receive SBRT has been noted in a comprehensive review [30]. Small sample size as well as relatively low toxicity rates were further limitations of our study. This, unfortunately, is a common problem with outcomes data reporting. Detailed patient-specific data are typically not available for large cohorts.

Conclusions
Our findings show that there is a strong statistical basis for using MLD as a predictor of RP within single institutions. These findings are also supported by mechanistic considerations. The relatively simple MLD metric has implications for clinical practice and lung SBRT clinical trials, especially in combination with immunotherapy (trials NCT03867175 and NCT03436056). While validation is required with data from different institutions, MLD is promising for use across institutions and in cooperative group trials. Limiting bilateral MLD <8 Gy was recommended in the HyTEC report for fractionated SBRT to ensure safety. Our study provides further justification for using MLD to predict RP risks. Competing Vx and critical volume (i.e., spared lung) approaches may have promise, but need further validation. Our main conclusion is simply that MLD may have merit for SBRT, and it should at least be considered along with the more common V x and critical volume metrics in future studies.

Additional Information Disclosures
Human subjects: All authors have confirmed that this study did not involve human participants or tissue. Animal subjects: All authors have confirmed that this study did not involve animal subjects or tissue.

Conflicts of interest:
In compliance with the ICMJE uniform disclosure form, all authors declare the following: Payment/services info: All authors have declared that no financial support was received from any organization for the submitted work. Financial relationships: All authors have declared that they have no financial relationships at present or within the previous three years with any organizations that might have an interest in the submitted work. Other relationships: All authors have declared that there are no other relationships or activities that could appear to have influenced the submitted work.