Evaluation of Automated Treatment Planning and Organ Dose Prediction for Lung Stereotactic Body Radiotherapy

Purpose: To evaluate whether the auto-planning (AP) module can achieve clinically acceptable treatment plans for lung stereotactic body radiotherapy (SBRT) and to evaluate the effectiveness of a dose prediction model. Methods: Twenty lung SBRT cases planned manually with 50 Gy in five fractions were replanned using the Pinnacle (Philips Radiation Oncology Systems, Fitchburg, WI) AP module according to the dose constraint tables from the Radiation Therapy Oncology Group (RTOG) 0813 protocol. Doses to the organs at risk (OARs) were compared between the manual and AP plans. Using a dose prediction model from a commercial product, PlanIQ (Sun Nuclear Corporation, Melbourne, FL), we also compared OAR doses from AP plans with predicted doses. Results: All manual and AP plans achieved clinically required dose coverage to the target volumes. The AP plans achieved equal or better OAR sparing when compared to the manual plans, most noticeably in the maximum doses of the spinal cord, ipsilateral brachial plexus, esophagus, and trachea. Predicted doses to the heart, esophagus, and trachea were highly correlated with the doses of these OARs from the AP plans, with the highest correlation coefficients of 0.911, 0.823, and 0.803, respectively. Conclusion: Auto-planning for lung SBRT improved OAR sparing while keeping the same dose coverage to the tumor. The dose prediction model can provide useful planning dose guidance.


Introduction
Beyond accuracy in dose calculation, computer-aided optimization and automation aim to improve planning efficiency, consistency, and quality. Tools such as auto-planning, RapidPlan (Varian Medical Systems, Palo Alto, CA), and multi-criteria optimization (MCO) have been developed by different vendors and implemented for clinical use. Large variations in plan quality are observed in multi-institutional studies [1], which implies that plan quality may be limited by planners' experience and expertise. With increased automation in treatment planning, the resultant plan quality is less dependent on user experience, while planning efficiency and consistency can be improved.
The utilization of lung stereotactic body radiotherapy (SBRT) has increased steadily over the past two decades [11][12][13][14][15][16][17]. SBRT with a minimum biologically effective dose (BED) of 100 Gy has been shown to be safe and effective at curing stage I non-small-cell lung cancer in medically inoperable patients [15]. At such high BED and fractional doses, the standardization of treatment planning is critical to the treatment quality. To maintain consistent plan quality, many institutions have implemented a peer-review process for SBRT plans, adopting a dosimetry audit process used in multicenter clinical trials [18]. The limitation of a plan quality peer review or audit is that it is only as good as the dosimetric metrics established for a specific disease presentation, and those metrics are not patient-specific. Knowledge-based models and artificial intelligence (AI)-based predictions are being introduced to address this challenge, but practical solutions are not readily available for general planners and medical physicists. In planning radiotherapy for some cancers, for example those of the head and neck, the large number of OARs also makes knowledge-based or AI-based predictions challenging because of the possible trade-offs among these OARs. When planning lung SBRT cases, however, the number of OARs is relatively modest and the relationship between them is less complex, which makes consideration of potential OAR trade-offs simpler. For these reasons, planning automation and accurate plan quality prediction may be easier to achieve. The present study seeks to evaluate the AP module in planning for lung SBRT and to compare its dosimetric results to the plan predictions from a commercially available product. These dose predictions could serve two purposes: (a) to provide input objectives to the AP module and (b) to validate the quality of AP plans.
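As a worked check of the BED threshold mentioned above, the sketch below evaluates the standard linear-quadratic BED formula for the 50 Gy in five fractions prescription used in this study. The tumor α/β of 10 Gy and the function name `bed` are assumptions made for illustration; the text does not state which α/β value was used.

```python
def bed(total_dose_gy, fractions, alpha_beta_gy=10.0):
    """Biologically effective dose from the linear-quadratic model.

    BED = n * d * (1 + d / (alpha/beta)), where d is the dose per fraction.
    alpha_beta_gy = 10 Gy is an ASSUMED tumor value, not taken from the text.
    """
    if fractions <= 0 or alpha_beta_gy <= 0:
        raise ValueError("fractions and alpha/beta must be positive")
    dose_per_fraction = total_dose_gy / fractions
    return total_dose_gy * (1.0 + dose_per_fraction / alpha_beta_gy)


# 50 Gy in 5 fractions: 50 * (1 + 10 / 10) = 100 Gy
print(bed(50.0, 5))
```

Under this assumption, the study prescription sits exactly at the 100 Gy BED level cited above.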

Materials And Methods
Twenty patients treated with definitive, manually planned intensity-modulated radiation therapy (IMRT) or volumetric modulated arc therapy (VMAT) for lung SBRT at our institution from 2014 to 2015 were randomly selected and replanned with the AP module without manual adjustment. The AP module mimics the manual planning process: it separates overlapped contours, creates tuning structures, adjusts hot and cold spots, and optimizes the plan iteratively [8]. Both the original manual plans and the AP replans were prescribed 50 Gy in five fractions according to the Radiation Therapy Oncology Group (RTOG) 0813 protocol and planned using the Pinnacle treatment planning system (version 9.10, Philips Radiation Oncology Systems, Fitchburg, WI). A set of AP parameters was determined based on the RTOG 0813 dose constraints and phantom testing. The AP parameter settings and the planning objectives for the central and peripheral tumors are listed in Tables 1 and 2. Among the planning structures, "Ring" was a 4-cm expansion of the planning target volume (PTV) minus the 2-cm expansion. The AP replans, without further manual adjustment, were qualitatively judged by a physician based on the conformality, the sharpness of dose fall-off (isodose lines at 2 cm beyond the edge of the PTV), and verification against the RTOG 0813 constraints (Table 3). The time from initiating the AP process to arriving at an acceptable plan was recorded.
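The "Ring" structure described above is a shell: every point that lies inside the 4-cm expansion of the PTV but outside the 2-cm expansion. The sketch below illustrates that definition on a small 2-D pixel grid using brute-force Euclidean distances. It is only a geometric illustration under simplifying assumptions (2-D, isotropic expansion, square pixels); the function name `ring_mask` is hypothetical and this is not how the treatment planning system builds the structure.

```python
import math


def ring_mask(ptv_mask, pixel_cm, inner_cm=2.0, outer_cm=4.0):
    """Mark pixels whose distance to the nearest PTV pixel is in (inner_cm, outer_cm].

    ptv_mask is a 2-D list of truthy/falsy values; pixel_cm is the pixel size.
    This equals (outer expansion of the PTV) minus (inner expansion of the PTV).
    """
    rows, cols = len(ptv_mask), len(ptv_mask[0])
    ptv_pixels = [(r, c) for r in range(rows) for c in range(cols) if ptv_mask[r][c]]
    if not ptv_pixels:
        raise ValueError("PTV mask is empty")
    ring = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Distance from this pixel to the closest PTV pixel, in cm.
            dist_cm = pixel_cm * min(math.hypot(r - pr, c - pc) for pr, pc in ptv_pixels)
            ring[r][c] = inner_cm < dist_cm <= outer_cm
    return ring


# Toy example: a one-pixel "PTV" at the center of an 11 x 11 grid of 1-cm pixels.
ptv = [[False] * 11 for _ in range(11)]
ptv[5][5] = True
ring = ring_mask(ptv, pixel_cm=1.0)
print(ring[5][8], ring[5][6])  # 3 cm away is in the ring; 1 cm away is not
```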

AP parameter settings (table fragment): Max iterations: 50; Engine type: Biological.
PlanIQ (Sun Nuclear Corporation, Melbourne, FL) is a commercial product that predicts the achievable OAR sparing, or feasibility. The targets are assumed to receive a uniform 100% of the prescription dose; this assumption is clinically unachievable and is not used to predict target coverage. The OAR sparing prediction and the dose fall-off outside the targets are calculated using the heterogeneous patient dataset, taking into account the high-gradient (penumbra-driven) and low-gradient (percent depth dose [PDD] and scatter-driven) dose spreading [19]. The PlanIQ predictions are assigned f-values (feasibility factors), which indicate the feasibility of achieving the predicted OAR sparing. The f-value ranges from 0 (unachievable) to 1 (easily achievable). The AP dosimetric endpoints were compared with the predicted values from PlanIQ, and correlations between the AP doses and predictions were tested.
The Mann-Whitney U test [20] was used to compare the AP planning time between central and peripheral tumors and between VMAT and IMRT techniques. The Wilcoxon signed-rank test [21] was used to compare the PTV coverage and OAR sparing for each pair of manual and AP plans. Spearman's rank correlation [22] was used to describe the correlation between the PlanIQ predictions and AP plans.
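To make the correlation measure concrete, the sketch below computes Spearman's rank correlation in plain Python: both samples are converted to ranks (ties receive the average rank) and the Pearson correlation of the ranks is returned. The function names and the example numbers are hypothetical illustrations, not the study's data or the software used by the authors.

```python
import math


def average_ranks(values):
    """Return 1-based ranks of values, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Made-up predicted vs. achieved doses (Gy), for illustration only.
predicted = [5.0, 8.0, 12.0, 20.0, 15.0]
achieved = [6.1, 7.9, 14.2, 16.0, 19.5]
print(round(spearman_rho(predicted, achieved), 3))
```

In practice a statistics package would normally be used instead, for example `scipy.stats.spearmanr`, which also reports a p-value.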

Results
Of the 20 patients, 10 had central and the other 10 had peripheral tumors. The median tumor size was 3.55 cm (range: 0.9-6.8 cm, Table 4). The median time for the AP treatment planning process was 17 minutes per plan (range: 10-40 minutes). As shown in Figure 1, the median AP time for central vs. peripheral tumors was 20 vs. 15 minutes (p = 0.0521), and for IMRT vs. VMAT was 15 vs. 20 minutes (p = 0.0185). Per physician judgment based on the target coverage, OAR sparing, and three-dimensional isodose distributions, the quality of the AP plans relative to the manual plans was "better" in 15%, "equally acceptable" in 80%, and "worse" in 5% of cases.
Figure 1. P-values are calculated using the Mann-Whitney U test. VMAT: volumetric modulated arc therapy; IMRT: intensity-modulated radiation therapy; AP: auto-planning.
Figure 2 depicts the dosimetric comparison between the AP and manual plans. All AP and manual plans achieved clinically required target coverage; at least 95% of the PTV received 100% of the prescription dose.
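The coverage criterion above (at least 95% of the PTV receiving 100% of the prescription dose) can be checked from a list of PTV voxel doses. The sketch below is a minimal illustration that assumes equal-volume voxels; the function names and the example dose values are hypothetical.

```python
def fraction_receiving(doses_gy, threshold_gy):
    """Fraction of voxels with dose >= threshold_gy (equal-volume voxels assumed)."""
    if not doses_gy:
        raise ValueError("dose list is empty")
    return sum(1 for d in doses_gy if d >= threshold_gy) / len(doses_gy)


def meets_ptv_coverage(ptv_doses_gy, prescription_gy=50.0, min_fraction=0.95):
    """True if at least min_fraction of the PTV receives the prescription dose."""
    return fraction_receiving(ptv_doses_gy, prescription_gy) >= min_fraction


# Made-up example: 96 of 100 PTV voxels reach the 50 Gy prescription.
covered = [51.0] * 96 + [48.0] * 4
under_covered = [51.0] * 94 + [48.0] * 6
print(meets_ptv_coverage(covered), meets_ptv_coverage(under_covered))
```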
The median values for the variables compared between manual and AP plans are listed in Table 5.
The dose endpoints for OARs were compared between the AP plans and PlanIQ predictions with the feasibility factor, f, set to 0, 0.1, and 0.5, respectively. Figure 5 shows the PlanIQ predictions plotted against the AP parameters with a reference line y = x. The reference line represents the ideal prediction. The correlation between the PlanIQ and AP dosimetric parameters was tested using the Spearman rank-order correlation (Table 6). The spinal cord Dmax had a moderate correlation, while other OARs had high correlations.

Discussion
The AP treatment planning time for IMRT was significantly shorter than that for VMAT, which is similar to our experience with manual planning. Step-and-shoot IMRT plans typically have fewer control points, and thus shorter calculation times, than VMAT plans, and gantry speed is not a constraint for IMRT optimization. Although the difference was not significant, the AP treatment planning time was shorter for peripheral targets than for central targets, which appears intuitive as central targets are closer to more OARs. AP treatment planning time also depended on other factors, such as tumor size, computational power, dose grid size, and resolution. While the total planning time was not compared between manual and AP plans, the AP plan quality achieved clinical acceptance with minimal human intervention, thus saving planners' time and improving efficiency. Creemers et al. have noted that AP reduces the planners' "hands-on time" by 75% [3]. In clinical treatment planning, efficiency and quality are not independent. Planners often work on multiple plans simultaneously with given deadlines. After clinical acceptance criteria are met, further manual optimization may not be feasible because of limited time and resources. By reducing the "hands-on time," better plan quality may also be achieved.
A recent study [23] by Lu et al. compared plan quality for four sites using three different advanced planning tools, including AP. They showed that AP could improve plan quality, but the statistical power was limited by the small sample size (five patients for each site). Our study showed that the AP plans maintained the PTV coverage and significantly improved the conformity index (CI). Doses were also reduced in AP plans for all seven OARs, and four of the seven reductions reached statistical significance (p < 0.05).
While AP mimics planners in progressively optimizing IMRT and VMAT plans, the process is not closed-loop automation. PlanIQ is designed to predict feasible DVHs to help guide the AP setup; it also provides an initial plan quality check when the AP process finishes. Unlike knowledge-based planning, PlanIQ uses the geometric relationship between the target and OARs and calculates the feasible DVHs without any dependence on prior treatment planning knowledge, which avoids the potential propagation of skewed data. In this work, the AP results and PlanIQ predictions had strong correlations for six of seven OARs, which indicated that PlanIQ could be used as an AP plan quality checker. Other studies have also shown that using PlanIQ predictions as planning guidance could improve plan quality [24].

Conclusions
Auto-planning in lung SBRT improved OAR sparing while keeping the same dose coverage to the tumor. Of the tested AP replans, 95% were at least equally acceptable compared to the manual plans. The OAR dose predictions correlated strongly with the AP dosimetric endpoints on Dmax of the heart, esophagus, trachea, proximal bronchial tree (PBT), and ipsilateral brachial plexus, as well as the whole lung V20Gy. AP is a reliable strategy to improve lung SBRT planning quality and efficiency, and the prediction tool may offer additional automation and quality assurance.