Current Available Computer-Aided Detection Catches Cancer but Requires a Human Operator

Introduction: This study intends to show that the current widely used computer-aided detection (CAD) may be helpful, but it is not an adequate replacement for the human input required to interpret mammograms accurately. However, this is not to discredit CAD’s ability but to further encourage the adoption of artificial intelligence-based algorithms into the toolset of radiologists. Methods: This study will use Hologic (Marlborough, MA, USA) and General Electric (Boston, MA, USA) CAD-read images from patients categorized as Breast Imaging Reporting and Data System (BI-RADS) 6 from 2019 to 2020. In addition, patient information will be pulled from our institution’s electronic medical record to confirm the findings seen in the pathology report and the radiology read. Results: Data from a total of 24 female breast cancer patients from January 31st 2019 to April 31st 2020 were gathered from our institution’s electronic medical record, with restrictions in patient numbers due to coronavirus disease 2019 (COVID-19). Within our patient population, CAD was shown to misidentify breast cancer to a statistically significant degree, while radiologist interpretation still proved to be the most effective tool. Conclusion: Despite a low sample size due to COVID-19, this study found that CAD did have significant difficulty in differentiating benign vs. malignant lesions. CAD should not be ignored, but it is not specific enough. Although CAD often marks cancer, it also marks several areas that are not cancer. CAD is currently best used as an additional tool for the radiologist.


Introduction
As society becomes more advanced, so does the technology that accompanies it; as software and the fields of artificial intelligence and machine learning evolve, so must society. The most significant point against automation's progress is the idea that it can replace the complex human role in the process [1,2]. In diagnostic radiology, there is a common distrust of Computer-Aided Detection (CAD), with the underlying reasons ranging from its somewhat recent addition to the field in the early 2000s to CAD's seeming inability to consistently provide a benefit to the radiologist [3,4]. This study intends to illustrate that current CAD is not an adequate replacement for the human input required to interpret mammograms accurately, and in doing so to decrease the anxiety that both current and future radiologists may have regarding the field's security. Among women in the United States, breast cancer remains the second most diagnosed type of cancer after skin cancer, with this statistic holding when extrapolated to the worldwide population [5-8]. As such, one cannot neglect the importance of routine mammography screening, since the early detection of breast cancer is associated with better outcomes [5,6]. However, each mammogram taken requires a radiologist who not only reads it but does so with high accuracy and at a high volume. CAD-based mammography systems have seen extensive use in the United States following their introduction in 2001; however, CAD has since received mixed approval from radiologists regarding its performance [3,4]. One of the more common concerns is that CAD may adversely affect radiologist performance and increase the number of unnecessary biopsies while providing no significant improvement in cancer detection rates [3-9].
However, while CAD has seen noticeable improvements in accuracy, it could be argued that it remains in its infancy 20 years after its introduction. With that in mind, should CAD be partnered with a radiologist to emulate a double read, the overall theoretical cost could remain low [1,10,11]. This would allow the overseeing radiologist to take on an increased volume [1,10,11]. If efficiently implemented into the healthcare system, this increase in volume could accelerate the decrease in breast cancer mortality by expanding access to high-quality preventive care at a fraction of the cost [5,6,8,12,13].

Materials And Methods
Throughout this study, patient information and diagnostic data were gathered after the patient visit. To be eligible, a patient had to be over the age of 18 and had to have received either craniocaudal (CC) or mediolateral oblique (MLO) mammography during the visit. The patient also had to have been categorized under the Breast Imaging Reporting and Data System (BI-RADS) as category 6, biopsy-proven malignancy, between January 1st, 2019, and December 31st, 2020. A patient's data were not eligible for analysis if the patient lacked proper mammogram/breast imaging, if the CAD data for the patient's imaging could not be accessed, or if the pathology report was negative for breast cancer (a negative report being classified as BI-RADS category 1 or 2). If the patient met all inclusion criteria and none of the exclusion criteria, their first name, last name, and medical record number (MRN) were recorded within a master datasheet for safekeeping. The University of Texas Medical Branch Institutional Review Board issued approval IRB #20-0018.
The number of patients included in the study depended mainly on the volume seen between January 1st, 2019 and December 31st, 2020; however, a sample size between 50 and 100 was predicted based on past trends. Due to the time-limited storage of the CAD data within the institution's system, only the most recent CAD reads were used throughout this study. Mammogram imaging and CAD reads were obtained and viewed via the Hologic "SecurView® DX" workstation (Hologic, Marlborough, MA, USA) and the included proprietary software. Because of limitations to the CAD data storage, the CAD annotated images of the most recent CAD reads were exported to a password-encrypted disk drive that remained locked within the institution's Department of Radiology. Images were used to record CAD data and were deidentified and disposed of upon completion of the study. Once the CAD reads had been saved, a tally of all CAD markers was recorded onto a master datasheet.
Variables that were recorded onto the master datasheet include patient's name, MRN, patient identifier code, CAD software used (Hologic or General Electric [Boston, MA, USA]), left and right CC CAD marks, left and right MLO view CAD marks, what was detected in CC CAD, what was detected in MLO CAD, breast density (almost entirely fat, scattered fibroglandular, heterogeneously dense, and extremely dense), radiologic findings of biopsy-proven cancer (mass, calcification, or asymmetry), and the cancer type by pathology report. For each CC and MLO view, the number of "CAD markings" was recorded. The columns "Cancer CC Detected" and "Cancer MLO Detected" were both used to record which of five findings applied to the left and right breast: Cancer Hit, Cancer Missed, Cancer Missed w/ Benign Findings, Benign Findings, or No Findings. The two CAD systems used at the institution were the Hologic and General Electric iterations of the CAD algorithm. The Hologic CAD software markings asterisk, triangle, and cross represent masses, calcification clusters, and a combination mass + calcification cluster, respectively. The General Electric CAD software markings circle and square represent a mass and a calcification cluster, respectively. Breast density and the final pathology report findings were extracted from the institution's Epic EMR. Once data gathering had concluded, all variables recorded aside from the patient name and MRN were transferred to a separate data analysis datasheet for statistical analysis. A unique study identifier was used in the data analysis datasheet to help ensure patient privacy and confidentiality.
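The marker legend and per-view tally described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the view labels (`LCC`, `RCC`, `LMLO`, `RMLO`), the function name `tally_marks`, and the example patient are hypothetical and are not taken from the study's datasheet.

```python
from collections import Counter

# Marker-to-finding legend for each vendor, as described in the text.
MARKER_LEGEND = {
    "Hologic": {
        "asterisk": "mass",
        "triangle": "calcification cluster",
        "cross": "mass + calcification cluster",
    },
    "General Electric": {
        "circle": "mass",
        "square": "calcification cluster",
    },
}


def tally_marks(vendor, marks_by_view):
    """Count CAD marks per view and per finding type for one patient.

    marks_by_view maps a view label to the list of marker shapes seen on it.
    """
    legend = MARKER_LEGEND[vendor]
    per_view = {view: len(marks) for view, marks in marks_by_view.items()}
    per_finding = Counter(
        legend[shape] for marks in marks_by_view.values() for shape in marks
    )
    return per_view, dict(per_finding)


# Hypothetical patient (illustrative only, not study data).
example = {
    "LCC": ["asterisk", "triangle"],
    "RCC": [],
    "LMLO": ["asterisk"],
    "RMLO": ["cross"],
}
per_view, per_finding = tally_marks("Hologic", example)
print(per_view)
print(per_finding)
```

Keeping the legend separate from the tally means a marker shape that does not belong to the stated vendor raises a `KeyError` instead of being silently counted.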

Results
Data from 24 female breast cancer patients from January 31st, 2019 to April 31st, 2020, were gathered from the institution's Epic electronic medical record. While the original protocol called for this study to continue through the entirety of 2020, it was decided to end the study prematurely due to the COVID-19 pandemic that began in the spring of 2020.
Of the 24 patients who met inclusion criteria and underwent CAD imaging, radiologists were able to detect all cancers, and CAD was able to detect most of them. CAD missed the cancer in one of the two views (CC and MLO) for three patients. These data are shown in Table 1, including pathology findings and CAD markings for each of the patients. Breast density also plays a large part in CAD readings. It was found that CAD tends to have difficulty differentiating between malignant and benign findings in dense breast tissue. In dense breasts (heterogeneously dense or extremely dense), CAD had an average total of 5.33 markings, compared with 4.73 in fatty breasts (fatty and scattered fibroglandular densities), as shown in Table 2. An example of dense breasts is shown in Figure 1, and an example of fatty breasts is shown in Figure 2.
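The comparison of average CAD markings between density groups can be sketched as follows. This is a minimal illustration, assuming each patient is reduced to a (density category, total CAD marks) pair; the function name `mean_marks_by_group` and the records are hypothetical and do not reproduce the values in Table 2.

```python
from statistics import mean

# Grouping of the four density categories into the two groups used in the text.
DENSE = {"heterogeneously dense", "extremely dense"}
FATTY = {"almost entirely fat", "scattered fibroglandular"}


def mean_marks_by_group(records):
    """Return the mean total CAD marks for dense and fatty breasts.

    records is an iterable of (density_category, total_marks) pairs.
    """
    groups = {"dense": [], "fatty": []}
    for density, total_marks in records:
        if density in DENSE:
            groups["dense"].append(total_marks)
        elif density in FATTY:
            groups["fatty"].append(total_marks)
        else:
            raise ValueError(f"unknown density category: {density!r}")
    # Skip empty groups so mean() is never called on an empty list.
    return {name: round(mean(vals), 2) for name, vals in groups.items() if vals}


# Hypothetical records (illustrative only, not study data).
records = [
    ("extremely dense", 6),
    ("heterogeneously dense", 5),
    ("scattered fibroglandular", 4),
    ("almost entirely fat", 3),
    ("scattered fibroglandular", 5),
]
result = mean_marks_by_group(records)
print(result)
```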

Discussion
Artificial intelligence-based algorithms that incorporate machine learning and deep learning are drawing a lot of interest amongst radiologists and the public. However, those algorithms are not yet widely adopted by breast imaging centers across the nation, and the current widely used CAD technology is still too unrefined. Artificial intelligence-based algorithms improving upon the current CAD would require further research, in addition to further investigation into how their implementation in a clinical setting affects patient care. McKinney et al. support this by noting that clinical trials are needed for AI; however, they believe that AI, once refined, has a future role in aiding early detection of breast cancer [1,7,14,15]. McKinney et al.'s [15] study illustrated that CAD is capable of excellent results in controlled testing, but the opposite may occur in real-world scenarios due to various factors including, but not limited to, radiologists ignoring or misusing CAD because of the high frequency of marks that show no signs of cancer [3,4,7,14], which results in a large number of callbacks [7,14].
Regarding CAD's effect on radiologists' reads, Du-Crow et al. [1] showed that observer sensitivity was significantly greater for markers detected by CAD compared to the same markers without CAD (t(51) = 6.56, p < 0.001), with no significant difference between no-CAD and CAD conditions on markers that were not detected by CAD. This shows that CAD may play a significant role as a tool in augmenting the radiologist's ability. The data show that while CAD's algorithm can detect a potential instance of breast cancer, CAD is not currently able to differentiate between benign and malignant lesions. Due to this lack of differentiation, CAD marks many benign findings that, to the untrained eye, could be falsely reported as cancer requiring further workup. In working up these cases, unnecessary harm and financial burden may be placed upon the patient through an increase in unnecessary biopsies and all the additional imaging and procedural costs needed to confirm that the findings are non-cancerous [9]. Winch et al. [9] concluded that false positives leading to unnecessary biopsy and further testing could cause patients to worry and develop a fear of cancer until the results are returned and discussed. In addition, the legal implications of a CAD-only system in a hospital, including who would hold liability, have not been discussed and would not have a clear-cut answer in current policy.
Overall, the relative infancy of CAD, the diversity of real-world settings, and the program's inexperience with these demographics have allowed conflicting information regarding the accuracy of CAD to be published. For example, Lehman et al. [3,4] demonstrated no improvement in diagnostic accuracy with CAD in a large sample of data from the US Breast Cancer Surveillance Consortium registry of clinical mammography interpretations. However, another study of a digital breast tomosynthesis CAD system found that it detected a high proportion of breast cancers that presented as masses or microcalcification clusters (89%, 99 of 111) with an acceptable false-positive rate.
While it may be impossible to truly understand whether these conflicting views on CAD are caused by varying patient population or differing technology, further studies are certainly needed to explore the interactions of radiologists and CAD systems [16].
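As a worked check of the detection figure quoted above (99 of 111 cancers), the sketch below computes the detection rate. It also adds a Wilson 95% confidence interval, which is an assumption introduced here for illustration and is not reported in the source; the function names are hypothetical.

```python
import math


def detection_rate(detected, total):
    """Proportion of cancers detected (per-lesion sensitivity)."""
    if total <= 0 or not 0 <= detected <= total:
        raise ValueError("need 0 <= detected <= total and total > 0")
    return detected / total


def wilson_interval(detected, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives about 95%)."""
    p = detection_rate(detected, total)
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


rate = detection_rate(99, 111)
low, high = wilson_interval(99, 111)
print(f"detection rate: {rate:.1%}")  # 89.2%, reported as 89% in the text
print(f"approximate 95% interval: {low:.1%} to {high:.1%}")
```

With only 111 lesions, the interval around the point estimate is fairly wide, which is one reason a single study's sensitivity figure is hard to compare directly with registry-scale results.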

Conclusions
Technology will continue to advance, and artificial intelligence will improve. However, machines replacing humans is unlikely because studies similar to this one will likely continue to show that human plus machine is greater than machine alone. Thus, it is more likely that the technology will add a tool aiding radiologist interpretation rather than replace the radiologist.
Study limitations include our sample size, which may decrease the significance and generalizability of our results; limitations in the amount of CAD data retrieved; differences in detection algorithms between CAD systems; and a potential decrease in generalizability due to differences in the patient population. Of note, this study had to end prematurely due to the impact of COVID-19 on elective procedures during the spring, summer, and fall of 2020. This limited the number of patients in the study; a larger sample could have provided more support for the significant findings observed in our 24 patients. It may be challenging to accurately compare the various CAD systems used by most institutions; however, it may be possible to alleviate the limitations of sample size, patient generalizability, and CAD generalizability by creating a multi-institution partnership designed around comparing the CAD systems across the varying institutions. A study on this scale may be difficult to organize because of logistical challenges and the potential differences in how the different CAD systems report results.

Additional Information Disclosures
Human subjects: Consent was obtained by all participants in this study. UTMB Institutional Review Board issued approval IRB # 20-0018. The UTMB Institutional Review Board (IRB) reviewed the above-referenced research project and determined this request met the criteria for exemption from review by the IRB in accordance with the 45 CFR 46.104. This determination was made on 02-Mar-2020. Further review of this project by the IRB is not required unless the protocol changes in the use of human subjects. In that case, the project must be resubmitted to the IRB for review. Please inform the IRB when this research project is completed. If you have any questions, please do not hesitate to contact the IRB office via email at IRB@utmb.edu. Animal subjects: All authors have confirmed that this study did not involve animal subjects or tissue. Conflicts of interest: In compliance with the ICMJE uniform disclosure form, all authors declare the following: Payment/services info: All authors have declared that no financial support was received from any organization for the submitted work. Financial relationships: All authors have declared that they have no financial relationships at present or within the previous three years with any organizations that might have an interest in the submitted work. Other relationships: All authors have declared that there are no other relationships or activities that could appear to have influenced the submitted work.