TY - JOUR
T1 - Evaluating the Quality of AI-Written Scenarios for Virtual Oral Surgical Board Preparatory Examination
AU - Panni, Usman Y.
AU - Donald, Christa
AU - Blatnik, Jeffrey A.
AU - Williams, Michael
AU - Yu, Jennifer
AU - Wise, Paul E.
N1 - Publisher Copyright:
© 2025 The Authors
PY - 2025/12
Y1 - 2025/12
N2 - Objective: The objective of this study was to evaluate whether ChatGPT can generate level-appropriate clinical scenarios suitable for use in an oral board preparatory examination (mock oral exam) for senior surgical residents. Design: This was a prospective, blinded study in which AI-written and faculty-written scenarios were reviewed, randomized, and used for testing in a virtual mock oral exam. Both faculty examiners and test-taking residents were blinded to the true authorship of the scenarios. After the examination, participants completed a survey evaluating the complexity of each scenario and their perceptions of its authorship. Setting: The study was conducted at Washington University in St. Louis (WashU), an academic medical center located in St. Louis, Missouri. The participating institutions also included Saint Louis University (SLU). Participants: Study participants included twenty-five senior general surgery residents (PGY4 and PGY5) and twenty faculty examiners from WashU and SLU, who took part in a virtual mock oral examination. Post-exam surveys were completed by both residents and faculty. Results: Faculty rated most AI-written and faculty-written scenarios as "level-appropriate" in terms of both the quality of the text and the degree of complexity. Similarly, when residents were asked to identify the most difficult scenarios, they selected both AI- and faculty-written scenarios at comparable rates. Notably, both faculty and residents struggled to correctly distinguish the origin of the scenarios, with frequent misidentification across both groups. Conclusion: AI-written clinical scenarios were comparable to faculty-written scenarios in terms of complexity and appropriateness for senior surgical residents when used in a virtual mock oral board examination, highlighting the potential utility of AI-based tools in oral board preparation and surgical education.
AB - Objective: The objective of this study was to evaluate whether ChatGPT can generate level-appropriate clinical scenarios suitable for use in an oral board preparatory examination (mock oral exam) for senior surgical residents. Design: This was a prospective, blinded study in which AI-written and faculty-written scenarios were reviewed, randomized, and used for testing in a virtual mock oral exam. Both faculty examiners and test-taking residents were blinded to the true authorship of the scenarios. After the examination, participants completed a survey evaluating the complexity of each scenario and their perceptions of its authorship. Setting: The study was conducted at Washington University in St. Louis (WashU), an academic medical center located in St. Louis, Missouri. The participating institutions also included Saint Louis University (SLU). Participants: Study participants included twenty-five senior general surgery residents (PGY4 and PGY5) and twenty faculty examiners from WashU and SLU, who took part in a virtual mock oral examination. Post-exam surveys were completed by both residents and faculty. Results: Faculty rated most AI-written and faculty-written scenarios as "level-appropriate" in terms of both the quality of the text and the degree of complexity. Similarly, when residents were asked to identify the most difficult scenarios, they selected both AI- and faculty-written scenarios at comparable rates. Notably, both faculty and residents struggled to correctly distinguish the origin of the scenarios, with frequent misidentification across both groups. Conclusion: AI-written clinical scenarios were comparable to faculty-written scenarios in terms of complexity and appropriateness for senior surgical residents when used in a virtual mock oral board examination, highlighting the potential utility of AI-based tools in oral board preparation and surgical education.
KW - ChatGPT
KW - artificial intelligence
KW - certifying examination preparation
KW - resident assessment
KW - scenario generation
KW - surgical education
UR - https://www.scopus.com/pages/publications/105019575773
U2 - 10.1016/j.jsurg.2025.103736
DO - 10.1016/j.jsurg.2025.103736
M3 - Article
C2 - 41125020
AN - SCOPUS:105019575773
SN - 1931-7204
VL - 82
JO - Journal of Surgical Education
JF - Journal of Surgical Education
IS - 12
M1 - 103736
ER -