Demographics and Outcomes of Spine Surgery in Octogenarians and Nonagenarians: A Comparison of the National Inpatient Sample, MarketScan and National Surgical Quality Improvement Program Databases

Introduction Despite the increasing use of national databases to conduct spine research, questions remain regarding their validity and consistency. This study tested for similarity and inter-database reliability in reported measures among three commonly used national databases. Methods International Classification of Diseases, 9th edition (ICD-9) codes were used to identify elderly patients (80-100 years) who underwent spine surgery in the Truven Health Analytics MarketScan® claims database, the National (Nationwide) Inpatient Sample (NIS) discharge database, and the National Surgical Quality Improvement Program (NSQIP) database (2006-2016). Patient baseline characteristics, comorbid status, insurance enrollment, and outcomes were queried and compared. Results We analyzed 15,105 MarketScan, 40,854 NIS, and 7,682 NSQIP patients aged 80 to 100 years (median, 82 years) who underwent spine surgery during the study period. A majority of patients in both MarketScan and NIS were insured by Medicare (97% vs. 94%). Patients in MarketScan had lower comorbidity scores (comorbidity, 0-2) compared to those in the NIS and NSQIP databases. The most common diagnosis was spinal stenosis in the MarketScan (54.4%), NIS (54.6%), and NSQIP databases (65.2%). Fusion was the most common procedure performed in the MarketScan (48.9%) and NIS databases (46.2%), whereas decompression (laminectomy/laminotomy) was the most common procedure in the NSQIP database (51.84%). In-hospital complications (any) were 6.5% in the MarketScan cohort, 5.3% in the NIS cohort, and 2.02% in the NSQIP cohort. In terms of 30-day complications (any), the MarketScan database reported a higher complication rate (12.7%) compared to the NSQIP database (5.08%). In-hospital mortality was slightly higher in the NIS database (0.32%) compared to the MarketScan (0.21%) and NSQIP databases (0.2%). The MarketScan and NIS databases showed an increased risk of complications with increasing age, whereas NIS and NSQIP showed increasing complications with a higher number of comorbidities. Male sex was associated with a higher complication rate at 30 days post-discharge in the MarketScan and NSQIP databases. Conclusions Patients in the NSQIP and NIS databases had more comorbidities; patients in the MarketScan database had the highest number of perioperative and 30-day post-discharge complications and the highest number of fusion procedures performed. Patients in the NSQIP database had the lowest number of fusion procedures and the lowest complication rates. As databases gain popularity in spine surgery, clinicians and reviewers should be cautious in generalizing results to whole populations and pay close attention to the population represented by the data from which statistical significance was derived.


Introduction
Researchers must overcome a multitude of barriers to document a sufficiently large cohort to study rare diseases and procedures [1]. National databases allow expedited investigation of widespread trends and demographics for clinical interpretation [1][2][3][4]. Retrospective analysis of focused cohorts provides clinicians with opportunities to understand their patient population comprehensively and implement care delivery strategies to improve outcomes. This is a crucial step in preventing medical errors and improving the quality of care [1,5]. As database studies gain traction among researchers, it is essential to ensure external validity by understanding the key characteristics and composition of the databases before confidently generalizing the results to the clinical population, a task made difficult by the limited granularity of clinical detail [6].
One such focused population that remains difficult to study prospectively is the elderly who undergo spine procedures. Wary surgeons refrain from operating on this population due to the prevalence of comorbidities and frailty [7,8]. However, certain advances, including minimally invasive surgery and enhanced recovery after surgery (ERAS), have drastically improved procedural outcomes. A possible solution is to retrospectively extract outcomes from databases to assess the viability of spine surgery in the elderly population. The present study reports the differences among three commonly used databases, MarketScan, the National Inpatient Sample (NIS), and the National Surgical Quality Improvement Program (NSQIP), with regard to patient demographics, complications, and outcomes following spine surgery in octogenarians and nonagenarians.

Data sources
The Truven Health Analytics MarketScan® claims database collects participant information from Commercial Claims and Encounters, Medicare Supplemental and Coordination of Benefits, and Medicaid databases. Insurance enrollment, inpatient and outpatient utility, and claims and costs are provided and organized based on 150 payers in the US from employer-based plans [9]. A neurology/neurosurgery custom dataset obtained from MarketScan spanning 2000-2012 was used. Medicare in MarketScan is Medicare Supplemental (also called Medigap). These patients are those on Medicare who can afford supplemental insurance to cover some services that Medicare does not cover.
The Healthcare Cost and Utilization Project (HCUP) NIS is the largest all-payer inpatient database and collates patient discharge information on all inpatient admissions in non-federal US hospitals. A stratified random sampling technique of the hospitals and patients produces a representative 20% subsample, which can be generalized to the American medical community [10]. The Elixhauser comorbidity data were added to NIS in 1998, which allows risk adjustments to be calculated through the database [11]. We extracted a custom dataset spanning 2000 to 2012 [10]. The NIS data for this study were adapted from Drazin et al. with permission [12].
The National Surgical Quality Improvement Program (NSQIP) is a well-recognized, nationally validated, outcome-based database introduced by the American College of Surgeons (ACS) to improve the quality of surgical care. Data are extracted from this database using International Classification of Diseases (ICD) 9/10 and Current Procedural Terminology (CPT) codes and include comorbidities and postoperative outcomes.

Data extraction
The study population comprised a retrospective cohort of patients undergoing spine surgery procedures for spinal stenosis in the Truven Health Analytics MarketScan® database, NIS database, and NSQIP database from 2006-2016. Patient extraction was performed using the International Classification of Diseases, 9th edition (ICD-9) coding system (for all databases) and the Current Procedural Terminology, 4th edition (CPT-4) (for MarketScan only). MarketScan is a longitudinal database. For this study, the first hospitalization satisfying the extraction conditions was used for patient characteristics and most outcomes. Patient baseline characteristics included age, gender, comorbid status, insurance type, and primary procedure. Outcome measures included in-hospital complication and mortality risks, length of stay (LOS), and stratified in-hospital complication risks. Multivariable analysis assessed the association of baseline and patient characteristics with perioperative complications.
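As an illustration of the index-hospitalization rule described above, the following minimal Python sketch selects, for each patient in a longitudinal dataset, the earliest admission that meets the extraction criteria. The record layout, patient identifiers, and dates are hypothetical and are not taken from MarketScan; the study's actual extraction code is not described in the text.

```python
from datetime import date

# Hypothetical claim records: (patient_id, admission_date, meets_criteria).
# Field names and values are illustrative only, not the MarketScan schema.
records = [
    ("P1", date(2008, 5, 2), True),
    ("P1", date(2007, 3, 14), True),
    ("P1", date(2006, 1, 9), False),   # earlier stay that fails the criteria
    ("P2", date(2010, 11, 30), True),
]

def first_qualifying_admission(rows):
    """Return {patient_id: earliest admission date meeting the criteria}."""
    index = {}
    for patient_id, admitted, qualifies in rows:
        if not qualifies:
            continue  # ignore stays that do not satisfy the extraction conditions
        if patient_id not in index or admitted < index[patient_id]:
            index[patient_id] = admitted
    return index

index_admissions = first_qualifying_admission(records)
```

Each patient then contributes a single index admission, so repeat hospitalizations in a longitudinal source are not counted as separate patients.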

Statistical analysis
Patient characteristics were summarized using means and standard deviation (for continuous variables) and counts and percentage (for categorical variables). Differences were considered significant if p<0.0001. Each outcome (mortality, complications, and length of stay), within each database, was analyzed in a multivariable analysis including four variables (age at diagnosis, gender, comorbid state, and procedural type). Results were presented in terms of odds ratio (OR) or relative risk (RR) with associated 95% confidence interval.
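For readers less familiar with how an odds ratio and its 95% confidence interval are obtained, the sketch below computes an unadjusted odds ratio from a 2x2 table with a Wald interval on the log scale. This is a simplification: the study reports multivariable estimates, whereas this example is unadjusted, and the counts and the function name `odds_ratio_ci` are hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald 95% CI for a 2x2 table.

    a: exposed with event,    b: exposed without event
    c: unexposed with event,  d: unexposed without event
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical counts: complications among fusion vs. decompression patients.
or_est, ci_low, ci_high = odds_ratio_ci(a=60, b=940, c=40, d=960)
```

An interval that excludes 1 corresponds to a statistically significant association at the chosen level; a multivariable model would additionally adjust for age, gender, comorbid state, and procedure type.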

Results
A total of 63,641 octogenarians and nonagenarians who underwent spinal decompression, discectomy, or fusion surgery for spinal stenosis were identified from all the databases. The baseline patient characteristics and procedure outcomes were compared between the 15,105 MarketScan, 40,854 NIS, and 7,682 NSQIP patients. The calculated odds ratios of experiencing a perioperative complication during the index hospitalization are presented in Table 1.

Length of hospital stay, complications, and mortality
The median length of hospital stay was similar across the cohorts (3 days), with an IQR of 2-4 days in the MarketScan, 2-5 days in the NIS, and 1-4 days in the NSQIP database. In-hospital complications (any) were 6.5% in the MarketScan cohort, 5.5% in the NIS cohort, and 2% in the NSQIP cohort. The most common complication was acute renal injury followed by pneumonia in both the MarketScan and NIS databases, whereas pneumonia followed by deep vein thrombosis (DVT) were the most common complications in the NSQIP database. In terms of 30-day complications (any), the MarketScan database reported a higher complication rate (12.7%) compared to the NSQIP database (5.08%). Pneumonia (3.53%) was the most common 30-day complication in the MarketScan database, whereas surgical site infection (1.58%) was the most common in the NSQIP database. In-hospital mortality was slightly higher in the NIS database (0.32%) compared to the MarketScan (0.21%) and NSQIP databases (0.2%) (Table 5).

Discussion
Incorporation of national databases into research has substantially increased in the past few years [2,3,6,13,14]. Although these large sample sizes offer researchers opportunities to investigate rare diseases, the statistically significant results are nevertheless susceptible to type I errors, or false-positive results [1]. Therefore, it is essential to understand the observational and retrospective nature of the database and its sample populations prior to generalizing its outcomes to the total population when given statistically significant results. To our knowledge, this study is the first to compare outcomes and demographics of elderly patients undergoing spine surgery in three commonly used databases.

Demographics and outcomes
In comparing the MarketScan, NIS, and NSQIP databases, our study found several differences between the cohorts. Compared to the NIS and NSQIP cohorts, the MarketScan cohort was healthier, possibly owing to a larger group of participants with fewer documented comorbidities. Although decompression was the primary procedure for nearly half the patients in all three databases, a slightly larger proportion of the MarketScan cohort underwent fusion compared to the NIS and NSQIP cohorts. The outcomes of each database reflect its cohort composition. The mortality rates between the sampled populations were not significantly different, which could be due to the overall low mortality rates of spine surgery. In addition, higher rates of complications have been associated with patients undergoing fusion surgery, especially in the elderly, and with those affected by a higher number of comorbidities [15][16][17][18][19][20]. This could explain why fusion surgery was performed more frequently in the healthier MarketScan population compared to the sicker NIS population with a higher number of comorbidities.

Differences in the national administrative databases
Non-uniform methodology across these databases can make it difficult to generalize results and draw clinical significance. Crucial differences can arise from each database's sampling methods. The Truven Health Analytics MarketScan® database compiles its samples from claims of employees, Medicare-eligible retirees, early retirees, Consolidated Omnibus Budget Reconciliation Act (COBRA) participants, and their dependents enrolled through large US corporations in the private sector [9,21]. In contrast, HCUP NIS collects a stratified systematic sample from all HCUP hospitals, which is equivalent to 20% of all discharges from community hospitals in the United States [10,14]. Based on the method used to collect the cohort sample, NIS is most likely representative of national means and the US population. However, NIS contains information related to hospital discharges only. MarketScan readily offers outpatient visit information, allowing a better understanding of longitudinal aspects. Because MarketScan collates participants insured through large US corporations, its sample may be limited to specific geographic or socioeconomic groups and may not be as representative of, or comparable to, the general US population [21]. NSQIP, in contrast, is a nationally validated program put forward by the American College of Surgeons (ACS) that aims to improve the quality of surgical care by providing tools to participating hospitals.
Overall, while it is not surprising that participants of advanced age are predominantly enrolled in Medicare, discerning discrepant trends allows patients to choose clinically and economically sound providers and to anticipate healthcare costs. An arsenal of comprehensive variables is necessary to streamline patients' experiences and outcomes [22]. Because it collates participant data only from US corporations, MarketScan is theoretically unable to present a cohort that is characteristic of the whole US population. Nonetheless, studies examining the quality of NIS data found discrepancies when comparing results derived from patient charts with administrative data from ICD-9 billing codes [1,2,23]. Furthermore, billing codes vary with the interpretation and accuracy of the coder (trained vs. naïve) as well as with external political and economic pressures, leading to variability in the application of different codes for a similar procedure in different databases [1].
Since these databases have numerous overlapping variables, and no single database contains all variables, a multiple-database approach may help compensate for their respective weaknesses. Buckland et al. showed that national databases such as NIS and NSQIP did not capture a similar patient population when compared to a physician-managed database (PMD) in patients undergoing surgery for adult spinal deformity [24]. This difference can be attributed to the referral pattern and selection bias in the PMD cohort. Similarly, Bohl et al. showed that the NIS and NSQIP databases gave different results (complications and comorbidities) in patients with hip fractures [25]. In concordance with these studies, we found that 30-day post-discharge complications varied significantly between the MarketScan (12.77%) and NSQIP databases (5.1%).
According to comorbidity scores alone, NIS and NSQIP patients were less healthy than their MarketScan counterparts. In our study, we used the Elixhauser comorbidity index for analysis in all three databases. Nonetheless, it is essential to question the comorbidity indices implemented for the analysis, as not all comorbidities are weighted equally by each index. The algorithm of the Elixhauser comorbidity index was developed to predict inpatient outcomes in hospitalized patients based on their acute and chronic conditions [11,26]. It has been demonstrated to predict in-hospital mortality with respect to disease burden, especially after 30 days of hospitalization [27]. In contrast, the Charlson comorbidity index was designed to predict one-year mortality based on a patient's comorbidities [28]. While both calculations are commonly used to discriminate future mortality outcomes, Menendez et al. reported that the Elixhauser comorbidity method outperformed the Charlson index with regard to predicting inpatient outcomes specifically after orthopedic surgery [29]. Thus, inclusion and exclusion criteria for pertinent variables of candidate databases should be deliberated to identify the database that best fits the study aims.

Differences yet similarity among databases
It is important, however, to note that despite vastly different sample sizes, demographics, and collection methods, the primary and secondary results from the databases are not substantially different. The large cohort sizes provide a means to obtain statistical significance that highlights minor differences, but these differences may not be clinically relevant. Additionally, clinicians not infrequently pay too much attention to p-values and forget to vet generalizability. Although minor differences are highlighted due to the power of the study, the results of these databases are broadly consistent with one another, suggesting precise results despite differing acquisition methods. Nonetheless, we caution clinicians against generalizing the results of database studies. Although the databases should theoretically represent the population of the country through their sampling methods, generalizing these data to the total population may not be accurate due to the retrospective and observational nature of database studies, especially considering changing practices and advancing minimally invasive technologies. While the owners of a database may promise internal validity, we must be wary of assigning external validity to the total patient population.
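The point that large cohorts can render small differences statistically significant can be sketched with a pooled two-proportion z-test. The example below holds a hypothetical one-percentage-point difference in complication rates fixed and changes only the cohort size; the rates, cohort sizes, and function name are assumptions for illustration and are not study data.

```python
import math

def two_proportion_p_value(p1, n1, p2, n2):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical complication rates of 6.0% vs. 5.0%, held fixed.
p_small = two_proportion_p_value(0.06, 500, 0.05, 500)
p_large = two_proportion_p_value(0.06, 50_000, 0.05, 50_000)
```

With 500 patients per group the difference is not significant at conventional thresholds, whereas with 50,000 per group the same difference falls below p<0.0001. The effect size is identical in both cases, so significance alone says little about clinical relevance.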

Limitations and strengths
This study has several limitations. First, the accuracy of our results depends largely on the accuracy and consistency of the reported diagnosis and procedure codes. Second, the inability to match patients between these three databases limits our ability to explain several of the discovered outcome discrepancies. Specific patient profiles would allow analysis of adherence to evidence-based medicine and hospital guidelines, especially standards that vary with geographical location [30]. One such finding is the difference in stratified post-operative complications between the three databases.
Although the most common specific complications were alike in the three databases, it is difficult to ascertain the discrepancies without additional granular clinical details.
Notably, MarketScan and NSQIP can track patient data after the perioperative period. In contrast, NIS was limited to information accumulated during the immediate inpatient stay, which precluded longitudinal comparison across all three databases. Moreover, because neither NIS nor MarketScan was designed to collect spine- or orthopedic-specific data, this study was limited to the available variables. Reported improvements in quality of life and activities of daily living following procedures would provide valuable insight into the changes required to improve care delivery outcomes. As all databases offer different groups of patient characteristics and differ widely in their sample collection, we remain cognizant of the limited generalizability of the comparison of results and databases.

Conclusions
Even though the results of the three commonly used databases were not completely different, suggesting some consistency despite differing sampling methods, this study captures the discrepancies in the demographics of spine surgery. These disparities drive the variations observed in preoperative comorbid status and in inpatient and long-term adverse events. Overall, it appears that patients in the NSQIP and NIS databases had more comorbidities, while patients in the MarketScan database had the highest number of perioperative and 30-day post-discharge complications and the highest number of fusion procedures performed. Patients in the NSQIP database had the lowest number of fusion procedures and the lowest complication rates. Thus, researchers should be wary of generalizing results from sample populations to total populations when using retrospective, observational database study designs. Future studies may additionally benefit from multiple-database approaches to compensate for the weaknesses of the primary database.

National Inpatient Sample and National Surgical Quality Improvement Program Databases
International Classification of Diseases character 1 (i.e. E, J, N) refers to medical or surgical category designation and character 2 refers to body system, character 3 (i.e. 0B, 0G) refers to root operation.

Additional Information Disclosures
Human subjects: All authors have confirmed that this study did not involve human participants or tissue. Animal subjects: All authors have confirmed that this study did not involve animal subjects or tissue.

Conflicts of interest:
In compliance with the ICMJE uniform disclosure form, all authors declare the following: Payment/services info: All authors have declared that no financial support was received from any organization for the submitted work. Financial relationships: All authors have declared that they have no financial relationships at present or within the previous three years with any organizations that might have an interest in the submitted work. Other relationships: All authors have declared that there are no other relationships or activities that could appear to have influenced the submitted work.