In-Hospital Mortality and Prediction in an Urban U.S. Population With COVID-19

Coronavirus disease 2019 (COVID-19) has touched every aspect of society, and as the pandemic continues around the globe, many of the clinical factors that influence the disease course remain unclear. A risk stratification model that estimates in-hospital mortality, as defined in this study, is a useful clinical decision-making tool. The study was performed at Robert Wood Johnson University Hospital (RWJUH) in New Brunswick, New Jersey, USA. Data were extracted from our electronic medical records on 44 variables that included demographic, clinical, laboratory, treatment, and mortality information. We used the least absolute shrinkage and selection operator regression with corrected Akaike’s information criterion to identify a subset of variables that yielded the smallest estimated prediction error for the risk of in-hospital mortality. During the study period, 808 COVID-19 patients were admitted to RWJUH. The sample was limited to patients with at least one confirmed in-house positive nasopharyngeal swab COVID-19 test. Pregnant patients and those who were transferred to our facility were excluded. Patients who were under observation or were discharged from the emergency room were also excluded. A total of 403 patients had complete values for all variables and were eligible for the study. We identified significant clinical, laboratory, and radiologic variables determining severe outcomes and mortality. An in-hospital mortality risk calculator was created from the factors found to be significant in this cohort: abnormal CT scan or chest X-ray, chronic kidney disease, age, white blood cell count, platelet count, alanine aminotransferase, and aspartate transaminase. The model had a sensitivity, specificity, and negative predictive value of 82%, 72%, and 93%, respectively. While numerous reports from around the globe have helped outline the pandemic, demographic factors vary widely. This study is more applicable to an urban, highly diverse population in the United States.


Introduction
The pandemic caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [1], previously known as the 2019 novel coronavirus (2019-nCoV) [2], has wreaked havoc. Clinicians, scientists, data scientists, vaccine experts, public policy specialists, and others, as well as the highest levels of governments globally are focused on COVID-19 (coronavirus disease 2019) [3] as it has touched every aspect of society.
Coronaviruses are enveloped, positive-sense single-stranded RNA viruses that are classified together on the basis of the crown or halo-like appearance of the spike envelope glycoproteins [4]. The name is derived from the Latin word corona, which means crown. To date, seven human coronaviruses have been identified, and based on the published information, SARS-CoV-2 is the third zoonotic human coronavirus of the century [5]. This new agent causes symptoms ranging from a dry cough to dyspnea to a syndrome with protean manifestations including severe respiratory distress, thrombotic conditions, and other clinical problems that are still being identified [6,7].
While the first cases of COVID-19 were reported from Wuhan, China [6], it was identified in the United States (US) by the end of January 2020, initially in Washington State [8]. By the end of July of 2020, there were more than four million cases in the US with over 150,000 fatalities [9]. As the pandemic continues around the globe, many of the clinical factors that influence the disease course remain unclear. In addition, the recently available research associated with risk factors and disease severity comes from centers that lack ethnic and racial diversity [10,11]. Understanding the clinical risk factors from multi-ethnic populations to determine disease severity and outcomes is needed to improve patient management.
Early identification of hospitalized COVID-19 patients at higher risk of mortality may help ensure proper clinical care and increased survival. Liang et al. developed a clinical risk score to predict critical illness in patients hospitalized with COVID-19 [12]. While the score was developed using data on 1,590 Chinese patients, the average age of admitted patients was 48.9 years and an estimated 74.9% of all hospitalized patients reported no comorbidities. In contrast, the average age of hospitalized COVID-19 patients in the US is 61 years, with an estimated hypertension prevalence of 43.5% [13]. Hence, the risk score developed by Liang et al. may not fully represent the clinical experience of hospitalized patients in the US.
To address the research gap, this study analyzed a racially and ethnically diverse adult, inpatient, laboratory-confirmed COVID-19 population at Rutgers Robert Wood Johnson University Hospital (RWJUH), a 965-bed university hospital in New Brunswick, New Jersey, USA. The patient population at RWJUH is more representative of urban areas in the US. Abstracted laboratory, demographic, and clinical data that were found to be significant were used to develop a risk stratification model to determine mortality risk.

Data source and sample selection
We conducted a cohort study of COVID-19 patients using RWJUH electronic medical records (EMR) under an IRB-approved protocol. The study included all adult (≥18 years old) COVID-19 patients who were admitted to RWJUH between January 1, 2020, and April 30, 2020. As per the Centers for Disease Control and Prevention (CDC) guidelines, we identified COVID-19 cases using our EMR with the International Classification of Disease, 10th Revision, Clinical Modification (ICD-10-CM) code B97.29 for hospital discharges between January 1, 2020, and March 31, 2020, and ICD-10-CM code U07.1 for discharges that took place thereafter (n=808). The sample was limited to patients with at least one confirmed positive nasopharyngeal swab SARS-CoV-2 test at our facility (n=593). Subsequently, pregnant patients and those who were transferred were excluded. The study also excluded 72 patients who were under observation and were never admitted and an additional 45 patients who were discharged from the emergency room. A total of 403 patients with a confirmed SARS-CoV-2 test and a complete data set with variables of interest who were admitted to RWJUH inpatient services were identified (Figure 1).

Study variables
Data were extracted manually from the EMR (SCM, Allscripts) and included demographic, clinical, laboratory, radiological, in-hospital treatments, and mortality data. Relevant comorbidities were identified. Medication lists at admission were reconciled.
All data were checked and reviewed by three different physician reviewers. The study's main outcome was in-hospital mortality for patients admitted with COVID-19. All mortality data were acquired from the EMR and were confirmed through medical chart reviews. A total of 44 variables were considered for the predictive model, including data on patients' demographics, clinical characteristics, imaging findings, and laboratory results that were collected at admission. Patients' demographics included gender, race/ethnicity, and age. Clinical characteristics considered for the predictive model included body mass index (BMI) and presenting symptoms such as fever, cough, dyspnea, anosmia, diarrhea, nausea, emesis, anorexia, malaise, and altered mental status (AMS). Prior medication use was also incorporated, including proton-pump inhibitors (PPIs), nonsteroidal anti-inflammatory drugs (NSAIDs), angiotensin-converting enzyme (ACE) inhibitors or angiotensin II receptor blockers (ARBs), insulin or oral hypoglycemics, oral steroids, calcium channel blockers (CCBs), statins, and beta-blockers.
Patients' medical histories, which included the total number and type of comorbidities, co-infections, and hospital readmission status within 30 days prior to current admission, were gathered (Table 1). Imaging findings included results from both chest X-rays (CXRs) and computed tomography (CT) scans. We considered the following laboratory findings: white blood cells (WBCs), mean corpuscular volume (MCV), platelet count, blood urea nitrogen (BUN), creatinine (Cr), bicarbonate, albumin, total bilirubin (T Bili), alanine aminotransferase (ALT), and aspartate transaminase (AST). We also reviewed results for neutrophils, lymphocytes, D-dimer, C-reactive protein (CRP), and electrocardiography. However, those variables were not considered for the predictive model because of the high proportion of patients with missing values.

Statistical analysis
Patient and hospitalization characteristics for those with versus those without in-hospital mortality were presented as numbers and percentages for categorical variables and as means and standard deviations for continuous variables. Chi-square and Student's t-tests were used to characterize the study sample according to mortality status. We quantified both crude and race-, gender-, and age-adjusted means for all laboratory findings by mortality status.
All patients with non-missing values were included in the development of the in-hospital mortality prediction model. Least absolute shrinkage and selection operator (LASSO) regression was used for variable selection and predictive model construction. The LASSO method constrains the regression coefficients by shrinking their values toward zero using a shrinkage parameter. In LASSO, the shrinkage parameter λ is applied to the sum of the absolute values of the regression coefficients (the L1 norm). As λ increases, the values of the regression coefficients shrink toward zero, and some become exactly zero, which removes the corresponding variables from the model. We used the LASSO regression with corrected Akaike's information criterion to identify a subset of the 44 study variables that yields the smallest estimated prediction error for the risk of in-hospital mortality. In turn, the identified variables were included in logistic regression models to determine the subset of predictive variables that were statistically significant. The accuracy of the predictive model was evaluated using the area under the receiver operating characteristic curve (AUC). The final set of predictive variables was used to estimate the probability of in-hospital mortality. The optimal cut-off point was then determined to classify COVID-19 patients as with or without high risk of in-hospital mortality. To estimate this optimal cut-off point, the point on the receiver operating characteristic (ROC) curve closest to the ideal prediction point was used (i.e., the point where sensitivity = 1 and specificity = 1).
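Written out, the penalized criterion described above takes the following general form. This is a sketch that assumes a logistic (binomial) likelihood for the binary mortality outcome; the symbols n (patients), p (candidate variables, 44 here), y_i, x_ij, β_0, and β_j are notation introduced for illustration, and the exact objective implemented by the software may differ.

```latex
\min_{\beta_0,\,\beta}\;
  -\sum_{i=1}^{n}\Bigl[\, y_i\,\eta_i - \log\bigl(1 + e^{\eta_i}\bigr) \Bigr]
  \;+\; \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert,
\qquad
\eta_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij}
```

The first term is the negative log-likelihood and the second is the L1 penalty; setting λ = 0 recovers ordinary (unpenalized) logistic regression.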

Sensitivity analysis
The AUC of the predictive model was examined using the leave-one-out cross-validation method. To determine how frequently each of the 44 variables was selected, LASSO regression with the Schwarz Bayesian Information Criterion (SBC) for variable selection was applied to 10,000 bootstrap resamples. In turn, we examined the estimated selection frequency of the predictive variables selected in the final model as a measure of effect importance. A two-sided p-value below 0.05 was considered statistically significant. 95% confidence intervals (CIs) were reported when applicable. All analyses were performed using SAS 9.4 software (SAS Institute, Cary, NC, USA).
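The bootstrap selection-frequency idea can be illustrated with a short Python sketch (the analyses in this study were run in SAS, so this is not the authors' code). The function name `selection_frequency` and the `select` callback are hypothetical; `select` stands in for the LASSO-with-SBC step and can be any routine that returns the variable names chosen on a given resample.

```python
import random
from collections import Counter


def selection_frequency(rows, select, n_boot=1000, seed=0):
    """Fraction of bootstrap resamples in which each variable is selected.

    rows   : list of observations, in whatever form `select` understands
    select : hypothetical callback; takes a resampled list of rows and
             returns an iterable of selected variable names (a stand-in
             for the LASSO-with-SBC step)
    Variables that are never selected do not appear in the result.
    """
    rng = random.Random(seed)
    n = len(rows)
    counts = Counter()
    for _ in range(n_boot):
        # Draw n rows with replacement to form one bootstrap resample.
        resample = [rows[rng.randrange(n)] for _ in range(n)]
        # Count each variable at most once per resample.
        counts.update(set(select(resample)))
    return {name: counts[name] / n_boot for name in counts}
```

A variable selected in every resample has frequency 1.0, while one that enters the model only for some resamples has a frequency between 0 and 1, which is how the percentages reported in the results are read.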

Sample characteristics
Of the COVID-19 patients admitted to RWJUH during the study period, 403 had non-missing values on all of the variables used in the predictive model selection process (Figure 1). At admission, patients with in-hospital mortality were more likely to have a history of chronic kidney disease (CKD), malignancy, hypertension, congestive heart failure (CHF), myocardial infarction (MI)/coronary artery disease (CAD), dementia, cerebrovascular disease, seizures, and chronic obstructive pulmonary disease (COPD). The average length of stay was 1.5 days longer for those with versus those without in-hospital mortality. Patients who died during their hospital stay were more likely to be admitted to the ICU and be placed on mechanical ventilation than those who were discharged alive.
Patients' laboratory findings by mortality status are shown in Table 2. Patients with in-hospital mortality were generally characterized by abnormal laboratory results. Specifically, higher average values were seen for BUN (33.1 mg/dL vs. 20.6 mg/dL; p<0.0001), T Bili (0.62 mg/dL vs. 0.51 mg/dL; p=0.0305), and CRP (14.7 mg/dL vs. 12.0 mg/dL; p=0.023) for those with, relative to those without, in-hospital mortality. In contrast, platelet count and albumin were significantly lower in those who died in-hospital than in patients who were discharged alive. After adjustment for age, gender, and race, values for platelet count, BUN, albumin, and CRP remained significantly different between the two groups.

Table 2: Laboratory findings by in-hospital mortality status (unadjusted and adjusted a)
Note: Data are presented as mean (SD).
a Adjusted for race, gender, and age.
b From t-test for the comparisons between those with versus without in-hospital mortality.
c From linear regression for the comparisons between those with versus without in-hospital mortality.

Predictive model selection
A total of 44 variables were included in the model selection process using LASSO regression. The LASSO regression selected 21 variables using corrected Akaike's information criterion. Variables selected for predicting in-hospital mortality included gender, race, age, co-infections, readmission within the past 30 days, abnormal CT scan or CXR, medical history including CKD, malignancy, CHF, cerebrovascular disease, dementia, prior PPI, NSAID, or beta-blocker use, WBCs, platelet count, albumin, T Bili, ALT, and AST. Using the 21 LASSO selected variables yielded an AUC of 0.85 (95% CI: 0.81-0.89) for predicting in-hospital mortality.
Of the 21 variables selected by the LASSO method, only abnormal CT scan or CXR, CKD, age, WBCs, platelet count, ALT, and AST remained significant predictors of COVID-19 in-hospital mortality using logistic regression models (Table 3). As a result, these seven variables were included in the COVID-19 in-hospital mortality prediction model. The AUC for predicting in-hospital mortality using these seven variables was 0.82 (95% CI: 0.78-0.87) (Figure 2). Using the shortest distance between the ROC curve and the ideal prediction point, the optimal cut-off for the probability of COVID-19 in-hospital mortality was 0.229. At this cut-off point, the model has a sensitivity, specificity, and negative predictive value of 82%, 72%, and 93%, respectively (Figure 3).
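The cut-off rule and the three reported performance measures can be sketched in Python as follows. This is an illustrative reimplementation under stated assumptions, not the SAS code used in the study: the function names are hypothetical, a predicted probability at or above the cut-off is assumed to be classified as high risk, and both outcome classes are assumed to be present in the data.

```python
import math


def confusion_metrics(y_true, probs, cutoff):
    """Sensitivity, specificity, and NPV when prob >= cutoff means high risk."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, probs):
        predicted_high = p >= cutoff
        if y == 1 and predicted_high:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_high:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # NPV is undefined if no patient is classified as low risk.
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return sensitivity, specificity, npv


def closest_to_ideal_cutoff(y_true, probs):
    """Cut-off whose ROC point is nearest to sensitivity = 1, specificity = 1."""
    best_cutoff, best_dist = None, math.inf
    for cutoff in sorted(set(probs)):
        sens, spec, _ = confusion_metrics(y_true, probs, cutoff)
        # Euclidean distance to the ideal point in ROC space.
        dist = math.hypot(1 - sens, 1 - spec)
        if dist < best_dist:
            best_cutoff, best_dist = cutoff, dist
    return best_cutoff
```

Each observed predicted probability is tried as a candidate cut-off, so the search is exhaustive over the empirical ROC curve; when two candidates are equally close, the smaller cut-off is kept.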

Sensitivity analysis
Using the leave-one-out cross-validation method, the AUC for predicting in-hospital mortality was 0.80 (95% CI: 0.75-0.85). Results from LASSO regression with the SBC for variable selection, applied to 10,000 bootstrap resamples, were consistent with our variable selection. Age was selected in 90.8% of the LASSO-identified models, whereas abnormal CT scan or CXR, CKD, WBCs, platelet count, ALT, and AST were selected in 50.0%, 58.3%, 46.6%, 90.2%, 72.5%, and 90.2% of the models, respectively.
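One common way to obtain a leave-one-out cross-validated AUC is to refit the model with each patient held out, score that patient with the refitted model, and compute a single AUC over the pooled held-out scores. The Python sketch below shows this under the assumption that the pooled approach was used; `fit` and `predict` are hypothetical callbacks standing in for the logistic regression model, and `X` and `y` are plain Python lists.

```python
def auc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def loocv_auc(X, y, fit, predict):
    """Leave-one-out AUC: each row is scored by a model fitted without it.

    fit(X_train, y_train) -> model     (hypothetical callback)
    predict(model, x)     -> score     (hypothetical callback)
    """
    held_out_scores = []
    for i in range(len(X)):
        # Train on every row except row i.
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = fit(X_train, y_train)
        held_out_scores.append(predict(model, X[i]))
    return auc(y, held_out_scores)
```

Because each patient is scored by a model that never saw that patient, the resulting AUC is typically somewhat lower than the apparent AUC computed on the training data, which matches the direction of the difference reported here (0.80 versus 0.82).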

Discussion
COVID-19 has rapidly become a leading focus of medical care in the United States and globally. A study from New York City of 1,150 hospitalized adults, of whom 257 (22%) were critically ill, showed older age, chronic cardiac disease, COPD, higher serum levels of interleukin-6, and D-dimer to be associated independently with mortality [14]. A meta-analysis from China of 8,697 patients showed the most commonly experienced symptoms were fever and cough [15]. The International Severe Acute Respiratory and Emerging Infections Consortium World Health Organization (ISARIC WHO) Clinical Characterization Protocol of 20,133 patients in the United Kingdom showed a four-day median duration between onset of symptoms and hospitalization [16]. In that study, the most common comorbidities were chronic cardiac disease, uncomplicated diabetes, non-asthmatic chronic pulmonary disease, and CKD. Independent risk factors for mortality were increasing age, male sex, and obesity. Older males may have a higher case fatality rate than females, perhaps due to differential expression of ACE2 receptors and TMPRSS2, a serine protease needed for spike protein priming [17]. Additional important factors may be sex hormone-driven innate and adaptive immune responses and immunoaging [18].
There is mounting evidence that the GI tract and the liver are also targets for viral entry [19]. The ACE2 receptor has been confirmed as an entry receptor [20]. The spike glycoprotein (S-protein) is instrumental in virus attachment and receptor recognition [21]. ACE2 receptors are expressed in multiple areas of the GI tract, including the esophagus, ileum, and colon, as well as in cholangiocytes [22]. In addition to nausea, vomiting, anorexia, and diarrhea as manifestations of GI involvement, liver enzyme abnormalities have been noted frequently [23]. Interestingly, our study found that GI symptoms of diarrhea and nausea had an inverse relationship with mortality (Table 1).
There have been variations in some of the findings reported in the literature. Mortality rates may be affected by variations in national healthcare delivery systems [24,25], variations in inpatient population demographic factors such as race and ethnicity [26], and variations in socioeconomic conditions [27]. The US has a multiethnic, multi-racial population in most large urban areas. This study offers further information in this context.
Our study reflected demographic factors more characteristic of the diversity of the urban US (Table 1). An estimated 33% of the patients admitted were Hispanic, 32.8% were White, 13.4% were Asian, and 10.9% were African-American; 62% were male and 38% were female. Risk factors affecting mortality included older age and AMS, whereas there was an inverse relationship with presenting symptoms of diarrhea or nausea. In our cohort, those who died had a significantly higher use of NSAIDs (10% vs. 3.84%, p=0.024) and statins (42.2% vs. 30.4%, p=0.035). Many studies have reported PPI use as a risk factor [28,29], and although our data show a similar trend, it did not reach statistical significance.
Important comorbidities increasing risk of mortality included CKD, history of malignancy, hypertension, CHF, and MI/CAD, as well as the presence of dementia, cerebrovascular disease, history of seizures, and presence of COPD. We did not find diabetes to be associated with increased risk. Some of these findings may have been reflective of our patient population; i.e., the mean age of Caucasians who died was significantly higher than that of Hispanics and African-Americans and, therefore, perhaps reflected increased comorbidity burden. Interestingly, BMI seemed not to be a factor in mortality risk. Understandably, ICU admission, mechanical ventilation, and length of stay were correlated with mortality (Table 1).
As shown in Table 2 From the analyses, a multivariable logistic regression model for predicting in-hospital mortality was developed with variables and model coefficients (Table 3). Additionally, a score calculator was developed (Figure 3). The seven variables used in the model were abnormal CXR or CT findings (yes or no), CKD (yes or no), age in years, WBCs, platelet count, ALT, and AST. This study has limitations due to the relatively small sample size and its retrospective nature. Follow-up data on outpatients and their recovery were not available as the cohort was limited to inpatients. There may be an underestimate of mortality due to the limited time period studied. However, the strengths include information on a diverse patient population reflective of urban hospitals in the US, in contrast to many other parts of the world, and, therefore, generalizability to such settings.
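A calculator of this kind evaluates the logistic model for one patient and compares the result with the cut-off. The Python sketch below shows the mechanics only: the function names are hypothetical, the intercept and coefficients must be supplied by the user from Table 3 (they are not reproduced in this text), and treating a probability at or above the reported cut-off of 0.229 as high risk is an assumption.

```python
import math


def mortality_probability(intercept, coefs, patient):
    """Logistic model probability: 1 / (1 + exp(-(b0 + sum of b_j * x_j))).

    coefs   : dict mapping variable name -> coefficient (to be taken from Table 3)
    patient : dict mapping the same variable names -> patient values,
              with yes/no variables coded as 1/0
    """
    linear_predictor = intercept + sum(coefs[name] * patient[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def is_high_risk(probability, cutoff=0.229):
    """Classify as high risk when the probability reaches the cut-off (assumed >=)."""
    return probability >= cutoff
```

Keeping the coefficients as an argument rather than hard-coding them makes the same function usable if the model is re-estimated on another cohort.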
The analyses in this paper add to the factors that help define in-hospital mortality based on initial laboratory values and comorbidities. These factors include increasing age, presenting symptoms of diarrhea and nausea, NSAID and statin use, AMS, CKD, history of malignancy, COPD, dementia and cerebrovascular disease, and history of seizures. Laboratory values of concern include lower admission platelet counts and lower albumin values, as well as higher BUN and CRP levels.

Conclusions
The treatment for COVID-19 is rapidly evolving, and the mortality rates and outpatient care will improve accordingly. Simple medications such as dexamethasone are already making a difference. The data in this cohort study are helpful in defining the epidemiology of this pandemic. Furthermore, the calculator in this article may be of benefit in the interim in terms of triage decisions. This is of vital importance, as has been mentioned, due to capacity constraints and the resulting ethical implications and decisions.

Additional Information Disclosures
Human subjects: Consent was obtained by all participants in this study. New Brunswick Health Sciences IRB issued approval Pro2020000870. Animal subjects: All authors have confirmed that this study did not involve animal subjects or tissue. Conflicts of interest: In compliance with the ICMJE uniform disclosure form, all authors declare the following: Payment/services info: All authors have declared that no financial support was received from any organization for the submitted work. Financial relationships: All authors have declared that they have no financial relationships at present or within the previous three years with any organizations that might have an interest in the submitted work. Other relationships: All authors have declared that there are no other relationships or activities that could appear to have influenced the submitted work.