Real-Time Peer-to-Peer Observation and Feedback Lead to Improvement in Oral Presentation Skills

Background Oral case presentation is a vital skill in many fields, particularly in medicine, and is taught early in medical school. However, there is a diminished focus on the development of this skill during the clinical years. In this study, we investigated whether the implementation of a formal teaching strategy during the internal medicine clerkship rotation can lead to an improvement in oral presentation skills. Methodology Students received an introductory PowerPoint lecture and watched brief video presentations summarizing the key components of a successful oral presentation. Subsequently, students were asked to evaluate their peers while they presented during morning rounds, using a standardized feedback form, in the first and the second half of their rotation. Using the information gained from the feedback form, students provided verbal feedback to their peers on the quality of their oral presentations. Results A total of 64 students participated in this curriculum at a university-affiliated teaching hospital, and a total of 409 evaluations were completed. The average total score during the first and the second rotation period was 93.0% (standard deviation, SD = 9.8) and 96.9% (SD = 7.1), respectively. An improvement in the total score of 3.7 percentage points was seen in the entire cohort, with an average improvement of 64% (or 1.64 times) in the probability of obtaining a full score during the second rotation. Conclusions Our data show an improvement in scores between collection blocks using this educational strategy. This study suggests that peer-to-peer evaluations helped in the refinement of oral presentation skills.


Introduction
Oral presentation is a skill that requires concerted effort and practice to serve its desired goal. However, it remains an understudied area in teaching curricula [1]. Although there is no universal definition of oral case presentation, some key characteristics of high-quality presentations in internal medicine have been described [2,3]. While most students are aware of the standards for oral presentations, they at times struggle with execution. Students can also have a different perception from their attendings regarding the goal of the oral presentation: while the former view the activity as a data collection task allowing information to be delivered for interpretation, the latter view it as an opportunity to construct an argument for or against specific diagnoses and management [4]. A potential advantage of peer-to-peer feedback is that it is informal and occurs in a nonjudgmental manner, in contrast to attending feedback, where such pressures may be perceived by students.
Furthermore, feedback and expectations regarding oral presentations are at times nonspecific and variable. This lack of explicit expectations, together with varying attending preferences, confuses medical students and delays the acquisition of skills [4]. The available literature supports the notion that most students learn by trial and error, suggesting that transparent expectations should be emphasized [4]. Peer-to-peer feedback for the improvement of oral presentation competencies has been suggested, with potential gains in providing and receiving feedback among students and alleviating some feedback duties for the attending [1]. Among teaching attendings, peer observation has recently been shown to lead to improvement in teaching behaviors [5]. However, there are sparse data assessing the use of student peer-to-peer evaluation of oral presentations carried out during morning rounds. We developed a curriculum for internal medicine clerkship students that frames this skill development as an active learning process, improved through peer-to-peer feedback in real time.
In this study, our primary objective was to examine whether student-to-student observation and feedback can lead to improvement in oral presentation skills in the inpatient setting.

Materials And Methods
This study was conducted at a university-affiliated tertiary care teaching hospital. Our local Institutional Review Board determined that our study constituted quality improvement (QI) educational research and was exempt from full approval. The internal medicine clerkship is an eight-week rotation, with four continuous weeks completed at the study hospital. The curriculum was implemented from April 2018 through December 2018. At the beginning of the rotation, during the orientation session, the students were given a talk titled "Guidance for Improving Oral Case Presentation." This talk was delivered using a PowerPoint presentation prepared by the investigators, along with a showing of two selected educational resource videos on the topic (a poor presentation and a good plan presentation) to highlight the contrast between a well-narrated and a disorganized presentation from the reference material [6]. The presentation aimed to explain and emphasize the elements of a good and successful oral presentation and the common pitfalls thereof. This talk helped introduce the curriculum to the students and set their expectations. Overall, an average of 40-50 minutes was spent on this effort, followed by answering any questions students had. Figures 1-3 show the PowerPoint slides used in the initial orientation [6].

Peer-to-peer evaluation tool and process of peer observation
The students were divided into four groups, which differed by rotation schedule. Group 1 comprised students who were performing their clinical rotation for the first time (least experienced), while Group 4 had experience with oral presentations from previous clinical rotations (most experienced). We based our student peer-to-peer evaluation tool on the domains for oral presentation laid down and validated in earlier studies [3,7]. However, we added a Likert scale to score the tool, ranging from 1 (never) to 5 (always), where 5 was considered the best possible outcome for that particular metric. The following six metrics were evaluated: chief complaint, patient illness history, physical examination, laboratory data, assessment, and plan. In addition, space for free comments was left for each of the metrics, and the total number of comments for each presentation was counted at the end. Figures 4, 5 show the student peer-to-peer evaluation tool used in this study.
During the first two-week rotation, students were given six evaluation forms to assess their peers' presentation performance. Another set of identical evaluations was provided for their second two-week rotation. They were encouraged, but not mandated, to fill out all the forms. Evaluation sets were marked by team number, and then within each team, evaluations were labeled as "Student A" or "Student B." Student pairs assigned one another as "Student A" or "Student B" and kept this designation consistent throughout the four weeks. This was done to maintain the anonymity of the students for the investigators.
Each student would listen to their peer's oral presentation in the morning and fill out the evaluation form at the same time. Once the round was finished, they were expected to provide oral feedback to each other based on the evaluation forms filled out earlier. This feedback activity was neither coached nor witnessed by the teams' teaching attending. Each week, students were expected to complete between one and three sessions of giving and receiving feedback per student. Because student participation in these evaluations was voluntary, the number of returned evaluations per student was not the same, even for the same presenter across rotation periods. Thus, proportions (the obtained score divided by the total possible score for the period) rather than raw scores were used for the outcomes.

Statistical analysis
The abovementioned six metrics, along with the differences between the latter and the initial rotation, were summarized by the mean and standard deviation (SD). The outcomes were summarized overall and by group and criterion. The number of comments was summarized by the median and interquartile range due to the extreme skewness of the distribution. Because the small sample size prevented convergence of a frequentist beta mixed model, Bayesian beta mixed models with random intercepts were fit for each of the six outcomes independently, as well as to assess the differences between rotations after adjusting for group membership. Because some observations had full scores, the Lemon-Squeezer transformation was used to shift the outcome slightly so that full-score observations could be handled by the beta mixed model. This transformation is defined as y' = [y(N - 1) + ½]/N, where y is the outcome of interest and N is the sample size [8]. Using Markov chain Monte Carlo (MCMC) sampling, 10,000 observations were used for the burn-in period, followed by sampling of 20,000 observations. Model convergence was assessed via trace plots and the multivariate potential scale reduction factor (PSRF); by these criteria, all the Bayesian beta mixed models converged. The mean of the conditional posterior distribution of odds ratios (the multiplicative increase in the probability of obtaining a perfect score), along with 95% credible intervals, is provided for each of the outcomes.
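As a minimal sketch of the transformation just described (in Python for illustration; the authors' analysis was performed in R), where `y` is an observed proportion and `n` is the sample size:

```python
def lemon_squeeze(y, n):
    """Shrink a proportion y in [0, 1] slightly toward 0.5 so that
    exact 0s and 1s become admissible for a beta model:
    y' = [y(n - 1) + 1/2] / n."""
    return (y * (n - 1) + 0.5) / n

# A perfect score of 1.0 is pulled just inside the open interval (0, 1):
print(lemon_squeeze(1.0, 100))  # 0.995
print(lemon_squeeze(0.0, 100))  # 0.005
```

Note that a score already at 0.5 is left unchanged, and larger samples shift the extremes by a smaller amount.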
Posterior distributions summarize what we know about a parameter (here, the odds ratios) by combining prior knowledge with information obtained from the data in the current study. Sampling from the posterior distribution (in this case, via MCMC sampling) forms the basis of most practical Bayesian inference today. Thus, the posterior mean is the mean of the (20,000) samples remaining after burn-in. Bayesian credible intervals are analogous to the confidence intervals used in frequentist statistics but have the advantage of a more intuitive interpretation. For a 95% credible interval, one can say: "The probability that the parameter of interest (here, the odds ratio) falls within the credible interval is 95%," rather than "If we were to repeat the study many times, we would expect the parameter to fall within the confidence interval 95% of the time on average." Observations with missing data were omitted from the analysis. The statistical analysis was performed using R version 3.6.1.
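To make the posterior-summary step concrete, here is a hypothetical sketch (plain Python, not the authors' R code) of how a posterior mean and an equal-tailed 95% credible interval are read off a list of post-burn-in MCMC draws:

```python
def posterior_summary(samples, level=0.95):
    """Return the posterior mean and an equal-tailed credible interval,
    taken as the (1 - level)/2 and (1 + level)/2 empirical quantiles
    of the sampled values."""
    s = sorted(samples)
    n = len(s)
    lo = s[int((1 - level) / 2 * (n - 1))]
    hi = s[int((1 + level) / 2 * (n - 1))]
    return sum(s) / n, (lo, hi)

# Toy "posterior" of odds-ratio draws (a real analysis would use the
# 20,000 post-burn-in MCMC samples):
draws = [0.9, 1.1, 1.2, 1.4, 1.5, 1.6, 1.7, 1.8, 2.0, 2.3]
mean, (lo, hi) = posterior_summary(draws)
```

A credible interval computed this way that excludes 1 would correspond to a credible change in the odds of obtaining a full score.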

Results
A total of 64 students were evaluated on their presentations, with each having anywhere between one and six peer evaluation forms filled out per presentation. Table 1 summarizes the six outcomes in each group and the differences between the first and second weeks of rotations. The average total score during the first rotation period was 93.0% (SD = 9.8), and it was 96.9% (SD = 7.1) during the second period. There was an improvement in the total cohort score of 3.7 percentage points (SD = 11.3). During the initial presentation, students on average performed best in "Physical Examination," with a mean score of 96.3% (SD = 7.7). In the latter presentation, students overall performed best on "Plan," with a mean score of 98.2% (SD = 4.5). "Plan" also had the largest improvement overall, at 5.9 percentage points (SD = 10.6), while "Physical Examination" had the smallest improvement overall, at 0.4 percentage points (SD = 9.3). Table 2 provides the mean odds ratios from the posterior distribution, and their 95% credible intervals, for the total score obtained between the two periods overall and for each of the six criteria. Bold indicates that the 95% credible interval for the odds ratio does not contain 1.

Outcome
The odds ratio in a beta regression is interpreted as the multiplicative factor of change in the probability of obtaining a full score in the latter presentation period compared to the first. Hence, there was a noticeable average improvement of 64% (or 1.64 times) overall in the probability of obtaining a full score in the latter presentation period compared to the first. "Patient Illness History," "Assessment," and "Plan" also showed noticeable improvement, with a 49%, 79%, and 63% increase, respectively.
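Strictly speaking, an odds ratio scales the odds, p/(1 - p), rather than the probability itself. As a hypothetical back-of-the-envelope illustration (using the cohort's first-period mean score of 93.0% as a stand-in baseline probability, an assumption made only for this sketch), applying an odds ratio of 1.64 works as follows:

```python
def apply_odds_ratio(p, odds_ratio):
    """Convert probability p to odds, scale the odds by the odds ratio,
    and convert the result back to a probability."""
    odds = p / (1 - p)
    new_odds = odds * odds_ratio
    return new_odds / (1 + new_odds)

# Baseline of 93% scaled by the reported overall OR of 1.64:
print(round(apply_odds_ratio(0.93, 1.64), 3))  # 0.956
```

Because the baseline probability is already high, a 1.64-fold increase in the odds translates into a smaller absolute gain in the probability itself.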
There were different patterns of change in the score when analyzed by group. While "Plan" and "Patient Illness History" consistently improved across the four groups, others, such as "Chief Complaint," "Physical Examination," and "Laboratory Data," did not, as some groups did worse in the latter presentation period. Table 1 shows that the patterns of improvement or worsening are not consistent across groups. Groups 1 and 3 had a very high level of missingness (43.8% and 62.5% missing, respectively), while Group 2 had some missingness (12.5%) and Group 4 had no missing observations. Groups 1 and 3 showed little improvement overall (0.4 and 0.5 percentage points, respectively), while Groups 2 and 4 showed some improvement (6.1 and 5.1 percentage points, respectively). As shown in Table 1, there was little overall difference in the number of comments between the two instances of presentation for all groups. The largest increase in comments was in Group 3, with a median difference of 2.0 comments, while Group 1 had fewer comments in the latter presentation (median difference of 1). While some presentations had over 20 comments, the majority had few to none. Overall, a total of 524 comments were recorded by students.

Discussion
Our study shows performance improvement in the oral presentation when comparing peer-to-peer evaluations in the last two weeks of the rotation to those in the first two weeks. This lends credence to the suggestion that a formal peer-to-peer evaluation curriculum during the internal medicine clerkship may facilitate improvement in oral presentation skills. We designed our curriculum to help students understand that the clinical oral presentation is a true skill that requires continued practice and refinement, and thus should not be left to the traditional "trial and error" process. This was done by promoting peer-to-peer feedback, which remains an underutilized source of feedback in medical school education, although it is being used successfully in nursing education [9].
The improved performance seen could be secondary to active participation by students as they became more mindful of evaluation and feedback. In the process of observing each other, peers become more active listeners rather than "zoning out" when not presenting on rounds, and this active listening may contribute to improved performance. This awareness of being observed by their peers may likewise explain the enhancement in "Patient Illness History," "Assessment," and "Plan." As a learning tool, reflection has been shown to result in the acquisition of new and lasting skills, and we believe our curriculum draws on the same concept [10].
Currently, there is a lack of a good model of the rhetorical and linguistic skills needed to produce and deliver a good presentation [11]. Investigators have attempted a nonconventional curriculum (video recordings) for the acquisition of oral presentation skills by students [1]. When medical students were given