Predictors of Performance on the United States Medical Licensing Examination Step 2 Clinical Knowledge: A Systematic Literature Review

In February 2020, the governing bodies of the United States Medical Licensing Examination (USMLE) announced the decision to change Step 1 score reporting from a three-digit system to pass/fail designation. Previous studies theorized that Step 2 Clinical Knowledge (CK) will become the numerical standard by which residency directors can quickly sort through program applicants. The goal of this study is to review prior research and identify significant factors associated with Step 2 CK outcomes. A systematic literature search on PubMed, Web of Science, Scopus, and ERIC that included articles published between 2005 and 2015 was conducted using the keywords “USMLE,” “Step 2 CK,” “score,” “success,” and “predictors.” After screening the initial search yield of 3,239 articles, 52 articles were included for this review. Positively correlated factors included Step 1 score, clinical block grades, Comprehensive Clinical Science Self-Assessment (CCSSA), Comprehensive Clinical Science Examination (CCSE), and volunteerism. Factors such as clerkship sequence and pass/fail grading failed to correlate with Step 2 CK. Medical College Admission Test (MCAT) score (p < 0.01) and undergraduate grade point average (GPA) (p = 0.01) positively correlated, while age displayed a negative correlation. Additionally, females typically scored higher on Step 2 CK than their male peers. The study findings suggest that continuous learning and academic success throughout medical school positively influence eventual Step 2 CK scoring. Performance on USMLE practice examinations, Step 1, and clinical evaluations serve as positive predictors for Step 2 CK scores. Interestingly, changing answers and spending more time on each question during the examination were associated with higher scores.


Introduction And Background
The United States Medical Licensing Examination (USMLE) consists of a series of required examinations for medical practice in the United States. Step 1 assesses a student's ability to apply basic science principles, and Step 2 Clinical Knowledge (CK) assesses the ability to apply medical knowledge to patient care situations in a clinical setting [1]. Passing grades for both examinations are required for medical school graduation and progression into residency [1,2].
In February 2020, the governing bodies of the USMLE, the Federation of State Medical Board (FSMB), and the National Board of Medical Examiners (NBME) announced the decision to change Step 1 score reporting from a three-digit system (1-300) to a pass/fail designation [1]. This change was designated to be implemented on January 1, 2022. With the loss of one objective method for residency distinction, it may occur that Step 2 CK will become a more important metric that residency directors will use to quickly evaluate program applicants.
It is important to identify and examine factors that have an impact on the medical students' performance in the Step 2 CK examination for both medical institutions and students. The goal of this study is to review prior research and identify significant factors positively and negatively associated with Step 2 CK outcomes. Our findings can help medical students, instructors, and medical schools ascertain the variables most likely to ensure successful performance on Step 2 CK.
This article was previously presented as a poster presentation at the 2021 APA 2021 Annual Meeting on May 2, 2021.

Methods
A systematic literature search was conducted using PubMed, Web of Science, Scopus, and ERIC. The keywords were a combination of the following: USMLE, Step-2 CK, score, success, and predictors. Our criteria included articles published within the last 15 years (2005-2020), with the most recent publication in January 2020. Additional criteria specified that selected articles must focus on USMLE Step 2 CK outcomes and include allopathic medical schools located in the United States. Duplicates and nonscientific papers were also removed ( Figure 1). Each publication was reviewed independently and summarized in a separate excel table that was later synthesized to the abovementioned PRISMA flowchart. We compared findings to filter out articles based on the exclusion criteria and resolve inconsistencies. We discuss the impact of biases after analyzing the final 52 articles and took them into consideration when evaluating the relationships between variables. Variables from the articles were categorized into either modifiable or unmodifiable factors.

FIGURE 1: PRISMA flowchart
Variables that can be altered during attendance in medical school until the first Step 2 CK attempt were considered as modifiable factors. These were further divided into individual and institutional modifiable variables. Individual factors include stress, Step 1 score, and clinical block grades. Institutional variables include clerkship sequence, faculty-to-student ratio, and pass/fail grading. Unmodifiable variables occur prior to enrollment into medical school and cannot be modified before the first Step 2 CK attempt. These variables include age, gender, race, and Medical College Admission Test (MCAT) score.

Review Results
The initial literature search yielded 3,239 articles, which were then narrowed down to 155 articles after reading the title and abstract. All articles that did not meet the criteria outlined in the Methods section were excluded. This exclusion resulted in 52 articles that were reviewed and agreed upon by all authors of this study. Most of the included articles (47) focused on modifiable factors as compared to unmodifiable factors (23). There was some overlap between the above articles.
Many of the included studies examined variables such as MCAT score, undergraduate grade point average (GPA), Step 1 score, and clinical grades. These variables were similar in that they all were strongly correlated to Step 2 CK scoring. There were also several unique variables that will be highlighted below. The above common variables, in addition to demographic variables, were controlled in many of these studies. At one California medical school, students with objective socioeconomic disadvantage (SED) and subjective selfdesignated disadvantage (SDA) had lower mean Step 2 CK scores compared to their peers [3]. However, SED itself did not have a discernible effect on Step 2 CK scoring [3]. Pass/fail preclinical curriculum did not have a significant effect on Step 2 CK scoring [4,5]. Students participating in extracurricular activities, such as free clinic volunteering, were found to have higher Step 2 CK scores compared to their peers [6]. Students involved in peer-led tutoring had significantly different (p < 0.001) Step 2 CK scores, but confounding variables were not well controlled [7]. When preparing for the examination, one study found that a larger time gap between the end of clinical rotations and taking Step 2 CK (defined as lag time) was negatively correlated to examination scores [8]. When taking the examination, time per question [9] and changing answers [10] also had a positive correlation to the Step 2 CK score. Interestingly enough, another study found that self-reported stress during the third year did not correlate with Step 2 CK outcomes [11]. A detailed summary of results can be found in the Appendix ( Table 1).

Discussion
This review has examined numerous research studies on multiple factors that have been correlated to performance in Step 2 CK. The objective of this review was to identify factors that could be addressed within the timeframe of students' medical school education, and to this end, we have divided the reported factors into modifiable factors and unmodifiable factors. While there are many studies on these variables, we have purposely decided to keep this brief to focus on modifiable factors. These modifiable factors were further grouped into variables that can be addressed on the institutional level, e.g., curriculum, and individual variables, such as individual performance in practice examinations.

Modifiable Factors
Modifiable factors can be further separated into institutional and individual variables. Institutional variables involve factors that the medical school administration can change, such as faculty and curriculum characteristics. Individual variables involve factors that the medical student can alter or consider prior to their first Step 2 CK attempt.
Individual variables: These include USMLE practice examinations, USMLE Step 1, medical school assessments, Step 2 CK test day strategies, extracurricular activities, psychological variables, time spent in medical school, lag time, and study tools.
For USMLE practice examinations, several official practice examinations are available to students preparing for both Step 1 and Step 2 CK examinations. The Comprehensive Basic Science Examination (CBSE) is offered to students preparing for Step 1. CBSE score was positively correlated to Step 2 CK [12]. However, its use as a tool to gauge Step 2 CK competency would not have added value because it is a tool for Step 1 preparation.
The two practice examinations specific to Step 2 CK are the Comprehensive Clinical Science Examination (CCSE) and the Comprehensive Clinical Science Self-Assessment (CCSSA). The CCSSA, a self-assessment web-administered examination, is a good predictor for Step 2 CK scoring [13]. In particular, low CCSSA scores often indicated the danger of Step 2 CK failure [13]. The CCSE was also a significant positive predictor of Step 2 CK scoring [14,15]. In fact, a CCSE score of greater than 90 corresponded with a near 100% probability of passing Step 2 CK [14].
According to research, the predominant opinion is that the Step 1 score has a strong positive correlation with the Step 2 CK score [15,16]. There was also a significant correlation (r = 0.684, p ≤ 0.0001) between scoring higher than 208 on Step 1 and passing Step 2 CK on the first attempt [15]. In addition, for every 10point increase in Step 1, a two-point increase in Step 2 CK was observed (p < 0.001) [12]. One study suggested that this positive relationship was stronger for males than for females [17]. The timing of the test is also an important variable to consider. One study found a unique relationship in that "students with lower MCAT scores performed better on Step 2 CK when Step 1 was after clerkships" [8]. However, in general, the researchers found that there was no significant relationship between the timing of the Step 1 test and Step 2 CK [8].
Medical school assessments included preclinical GPA, Objective Structured Clinical Examination (OSCE) evaluations, clinical block grades, and clinical NBME examination grades. Preclinical GPA had a small but positive impact on both Step 2 CK scoring and board certification [10,18]. Second-and third-year OSCE scores also had a weak correlation to Step 2 CK scoring [19]. In particular, scoring well in the differential diagnosis and identification of abnormalities skill subcomponents had a positive correlation with Step 2 CK scoring [20].
More impactful were the clinical block and NBME examination grades. Clinical block grades had a positive correlation (r = 0.517, p < 0.01) to Step 2 CK scoring [21]. Clinical NBME examination grades also had the same positive correlation (r = 0.77, p < 0.001) [22]. For example, every one-point increase on the surgical NBME led to an increase in the odds of passing Step 2 CK by 1.2 times [21]. In addition, failing and multiple attempts on the OB/GYN NBME were negatively correlated to Step 2 CK scoring (p = 0.008) [23].

Regarding
Step 2 CK test day strategies, there seems to be a positive relationship with changing answers on Step 2 CK and score [24]. In one study, 68% of students in the sample changed their answer on at least one item, and out of that 68%, 45% of the examinees increased their scores, leading to an overall improvement [24]. Furthermore, the researchers argued that more proficient examinees are more likely to review more items and are more likely to change a wrong answer to the right answer [24]. Students spending more time per question often had higher Step 2 CK scores [9]. Extracurricular activities such as volunteering, tutoring, and club leadership are often completed by medical school students to appear more competitive for residency applications. Interestingly, volunteering, such as that at a student-run free clinic, had shown to have a positive correlation to Step 2 CK scoring [6,25]. On the other hand, leadership training courses did not seem to have a significant impact on Step 2 CK scoring [26].
With respect to psychological variables, self-reported stress during the third year was not correlated to Step 2 CK score [11]. There was a negative relationship between self-designated disadvantage (SDA) and Step 2 CK [3]. SDA is classified by the AAMC through a yes/no answer on the American Medical College Application Service (AMCAS) application to the question: "Do you wish to be considered a disadvantaged applicant by any of your designated medical schools that may consider such factors (social, economic or educational)?" [27]. While the AMCAS does not define disadvantage, previous studies consider it a "proxy for psychological drivers of academic performance" [3].
A longer time spent in medical school, defined by more than four years, but not including additional programs, was seen to have a negative correlation to Step 2 CK scoring [28]. In addition, recent matriculants from the years 1978-1991 were more likely to pass NBME parts 1 and 2 (which later became Step 2 CK of the USMLE sequence) [29].
Lag time was defined by Jurich et al. in a 2020 study as the time between the end of core clerkships and the first Step 2 CK attempt. Their results showed that Step 2 CK scores declined with an increasing lag time [8]. Students with longer lag time often took Step 1 and Step 2 CK after clerkships. Students taking Step 1 after clerkships and therefore delaying their Step 2 date (lag time: ~200 days) performed worse on Step 2 CK compared to students that took Step 1 before clerkships (lag time: ~100 days) [8].
Specific study tools, such as using mechanistic case diagrams, a form of concept mapping, can help with the integration of knowledge connected to clinical reasoning. For some students, this technique may also help increase Step 2 CK scores [30].
Institutional variables: These variables comprise factors that the institutional can change, such as preclinical characteristics, clerkship characteristics, faculty characteristics, and institutional characteristics.
Preclinical characteristics are defined as attributes of the academic curriculum for an institution, such as courses and grading prior to clinical training. Over the last decade, many medical school administrations have decided to adopt a pass/fail grading scheme as opposed to a tiered categorical grading system. Some students may worry that admission to a medical institution with the tiered categorical grading scheme will be detrimental. We found that a pass/fail grading curriculum had no significant impact on Step 2 CK scoring [4,5]. However, individual preclinical examination grades did demonstrate a small effect size on Step 2 CK scoring [16].
It seems that school-specific curriculum choices also did not have a significant impact on Step 2 CK scoring. Take the example of an important first-year course -anatomy. The manner of anatomical instruction, standalone versus integrated and dissection versus dissection/prosection, did not have a significant impact on Step 2 CK scoring [31].
Studies have shown that there is not a significant relationship between the clerkship sequence and Step 2 CK scoring [32,33]. One study found that IM clerkship characteristics and community-based medicine were not significantly associated with mean Step 2 CK scores [34]. However, variables such as seeing more patients in a day during the third year were positively correlated to Step 2 CK scoring (R 2 = 0.47, p < 0.01) [34]. One study also found that students completing IM and then surgery clerkship had higher surgery subject examination scores and that a one-point increase in the surgery NBME subject examination score increased the odds of passing Step 2 CK by 1.2 times [21].
A reintroduction of basic science fundamentals also seemed to have a positive effect on Step 2 CK scoring. Medical students participating in a third-year basic science course at the University of South Carolina School of Medicine scored four points higher on Step 2 CK than their classmates that did not take the course [35].
The faculty characteristics that held relevance to Step 2 CK scoring were preclinical faculty-to-student ratio and National Institute of Health (NIH) funding statistics. Full-time faculty-to-student ratios (r = 0.35, p < 0.0004), total NIH (r = 0.46, p < 0.0001), and per faculty NIH funding (r = 0.35, p < 0.0005) were positively correlated with Step 2 CK score [36].
Variables within the category institutional characteristics include interview style, curriculum and educational policies, private versus public institution, and availability of a BA/MD program.
Studies found that private medical school students have a higher average Step 2 CK score compared to students at public medical schools [36,37]. Another study found no significant relationship between an institution's educational policy and curriculum with Step 2 CK score [38]. Participation in a BA/MD accelerated program was also not significantly associated with Step 2 CK scoring [39].
Institutions often use interviews to determine best-fit students for their prospective medical school class. These interviews may also shed light on future USMLE performance. A study found that Multiple Mini Interview (MMI) scores were associated with higher mean Step 2 CK scores. They found that a single standard deviation increase in MMI score led to a 1.25-point increase in Step 2 CK [40]. No such relationship was found for traditional interviews.

Unmodifiable Factors
Unmodifiable factors are factors determined prior to enrollment into medical school and cannot be modified before the first Step 2 CK attempt.
Medical College Admission Test (MCAT): Many studies have shown that the MCAT score was a strong predictor for Step 2 CK score [36,41]. Two studies (p = 0.04 and p < 0.001) showed that a positive relationship exists when using the older style of MCAT scoring in regard to Step 2 CK performance [42]. Studies have shown that the specific individual sections -biological sciences (BS), physical sciences (PS), and verbal reasoning (VR) -of the MCAT have relevance to the Step 2 CK performance. One study found that the strongest MCAT section predictor was the BS section (r = 0.18, p = 0.001) [10], while another noted that there was a positive and significantly correlated relationship to the MCAT VR section (p = 0.037) [12]. Yet another study found that all three sections of the MCAT positively correlated with Step 2 CK performance (p < 0.001) [43]. Specifically, Step 2 CK score increased 2.819 points for a one-point increase on BS, 0.822-point increase on PS, and 1.238-point increase on VR [43].
Furthermore, there is a negative correlation between MCAT attempts and Step 2 CK performance, such that having more attempts led to lower performance on the Step 2 CK examination (r = -0.182, p = 0.000) [23]. There also seems to be a negative correlation between the time taken during the MCAT and Step 2 CK performance given that students using extra time on the MCAT examination had lower pass rates for Step 2 CK (p < 0.001) [44].
Undergraduate GPA: Both undergraduate total GPA [36] and science GPA [10] were shown to have a positive correlation to Step 2 CK score and pass rate. When comparing both metrics, science GPA was shown to have a more significant correlation to Step 2 CK score than total GPA [10].
Demographics: Variables within this category include age, race/underrepresented minority (URM), gender, socioeconomic disadvantage (SED), and language/English as a second language (ESL).
Many studies found that older students often performed worse on Step 2 CK [10,45]. One study from The University of Toledo College of Medicine (cohort 1998-2004) found that medical students younger than 22 had an average Step 2 CK score of 220, those between 23 and 25 had an average score of 214.7, and those older than 26 had an average score of 206.5 [10]. It should be noted that the passing score for Step 2 CK was 170 in 1998 and 182 in 2004 [46].
There also seems to be a significant correlation between gender and Step 2 CK performance [17,45]. Studies have shown that women outperform men on Step 2 CK and are also more likely to pass on the initial attempt. One study developed a model that predicted women to have a score 0.34 points higher than men [45], and a study at the Meharry Medical College found that male students scored eight points less than female students [12]. Interestingly enough, studies have shown that men outperform women on Step 1 [9,38].
There also exists a relationship between race, particularly URM, and Step 2 CK scoring. One study mentioned that there is a significant association specifically with African Americans and Step 2 CK but did not explore the nature of that association [35]. However, another study involving 818 students at The University of Toledo College of Medicine did find a negative correlation, showing that African Americans tend to score lower on the Step 2 CK examination compared to their peers (p = 0.001) [10]. Some research, however, does seem to question the relationship, if any, between race and Step 2 CK performance [23]. Speaking English as the primary language seems to have a positive impact on Step 2 CK performance [9], and test-takers who have English as their secondary language often scored lower on average than their peers [45].

Conclusions
Our findings suggest that academic success starting from the undergraduate level and continuing on to medical school has a positive influence on eventual Step 2 CK scoring. Particularly important factors included the Step 1 score, USMLE practice examinations (CCSSA and CCSE), and clinical evaluations (NBME, clinical block grades, etc.). Students can also modify their behaviors during the examination; this may improve performance by increasing the time used per question and reducing the fear associated with changing answers. Interestingly, institutional characteristics such as a pass/fail versus traditional preclinical grading system did not influence Step 2 CK scoring. This should ease medical students' concerns about their program's specific grading attributes. Table 1 provides a more detailed analysis of the articles used for this paper's synthesis. This review has multiple limitations. First, many of the included factors were evaluated in a few studies, and data were not conclusive. Such factors include campus assignment, overall clerkship sequence, URM status, private versus public medical institution, lag time, stress, and leadership qualities. Some of the above variables showed statistical insignificance, while others were only discussed infrequently. Furthermore, in some of the studies, the confounding variables were not well controlled for by the study investigators. Due to the historically significant emphasis placed on Step 1, there have been many studies looking into specific study tools to increase examination performance. On the other hand, we did not find current research evaluating specific tools that positively correlate with the Step 2 CK score. Further research is needed to maximize performance on this examination and increase the chances for medical institutions to have more successful match outcomes. Respondents reported moderate levels of personal stress related to academic factors (2.0 (0.46)), and teaching and learning factors (1.9 (0.58)). Academic factors posing the severest stress were "doing well on Step 2 CK" (3.02 out of 4). Conversely, doing well on shelf examinations and "difficulty with clinical learning" posed mild stress (1.14-1.30).
Gender was a significant predictor for Step 2 CK (p < 0.001) as male students scored eight points less than female students. MCAT VR score was a positive significant A significant correlation (r = 0.572, p ≤ 0.001) was found between the score in the NBME Medicine CCSE and the score in the USMLE Step 2 CK. There was a significant correlation (r = 0.698, p ≤ 0.001) between the scores in the USMLE Step 1 and the USMLE Step 2 CK. There was a significant correlation (r = 0.684, p ≤ 0.0001) between obtaining a score of 208 or higher in the USMLE Step 1 and subsequently attaining a passing grade on the first take of the USMLE Step 2 CK. Step 1 score Women outperformed men in most Step 2 CK content areas. Specifically, in OB/GYN (observed difference = 7.6), gynecologic disorders (observed difference = 7), and disorders of pregnancy (observed difference = 5.7) with p < 0.01.
Step 1 scores and gender were significant predictors within schools.
Step 1 scores were positively related to Step 2 CK content area scores.
Step 1 scores for men were more associated with Step 2 CK scores than for women. Second-year OSCE score had weak correlations with Step 2 CK score (r = 0.14, p < 0.01).
Third-year OSCE score had weak correlations with Step 2 CK score (r = 0.14, p < 0.01). Additional USMLE Step 2 CK score variance accounted by the second-and third-year OSCE scores beyond that explained by the Step 1 score was minimal (r 2 change = 0.01). Simon et al., 2007 [20] N = 340 OSCE scores and Step 1 score Total OSCE score correlation to Step 2 CK was moderate (r = 0.395, p < 0.001).
Step 1 score accounted for 57.5% of variability. Step 1 score Step 2 CK was associated with sex, campus, and Step 1 score (p < 0.001). Women had higher Step 2 CK scores than men.
Step 1 had the strongest contribution to Step 2 CK score. Of the variance in Step 2 CK scores, 90% was due to student differences. All demographic variables under study were statistically significant.

Conflicts of interest:
In compliance with the ICMJE uniform disclosure form, all authors declare the following: Payment/services info: All authors have declared that no financial support was received from any organization for the submitted work. Financial relationships: All authors have declared that they have no financial relationships at present or within the previous three years with any organizations that might have an interest in the submitted work. Other relationships: All authors have declared that there are no other relationships or activities that could appear to have influenced the submitted work.