TY - JOUR
T1 - Evaluation of Rhinoplasty Information from ChatGPT, Gemini, and Claude for Readability and Accuracy
AU - Rossi Meyer, Monica K.
AU - Kandathil, Cherian Kurian
AU - Davis, Seth J.
AU - Durairaj, K. Kay
AU - Patel, Priyesh N.
AU - Pepper, Jon Paul
AU - Spataro, Emily A.
AU - Most, Sam P.
N1 - Publisher Copyright:
© Springer Science+Business Media, LLC, part of Springer Nature and International Society of Aesthetic Plastic Surgery 2024.
PY - 2024
Y1 - 2024
AB - Objective: To assess the readability, accuracy, quality, and completeness of ChatGPT (OpenAI, San Francisco, CA), Gemini (Google, Mountain View, CA), and Claude (Anthropic, San Francisco, CA) responses to common questions about rhinoplasty. Methods: Ten questions commonly encountered in the senior author’s (SPM) rhinoplasty practice were presented to ChatGPT-4, Gemini, and Claude. Seven facial plastic and reconstructive surgeons with experience in rhinoplasty evaluated these responses for accuracy, quality, completeness, relevance, and use of medical jargon on a Likert scale. The responses were also evaluated using several readability indices. Results: ChatGPT achieved significantly higher evaluator scores for accuracy and overall quality, but scored significantly lower on completeness, compared to Gemini and Claude. All three chatbots’ responses to the ten questions were rated as neutral to incomplete. All three chatbots were found to use medical jargon, and their responses scored at a college reading level on the readability indices. Conclusions: Rhinoplasty surgeons should be aware that the medical information provided by chatbot platforms is incomplete and still needs to be scrutinized for accuracy. However, the technology has potential for use in healthcare education if trained on evidence-based recommendations and tuned for improved readability. Level of Evidence V: This journal requires that authors assign a level of evidence to each article. For a full description of these Evidence-Based Medicine ratings, please refer to the Table of Contents or the online Instructions to Authors at www.springer.com/00266.
KW - Artificial intelligence
KW - Large language models
KW - Rhinoplasty
UR - http://www.scopus.com/inward/record.url?scp=85204027647&partnerID=8YFLogxK
DO - 10.1007/s00266-024-04343-0
M3 - Article
C2 - 39285054
AN - SCOPUS:85204027647
SN - 0364-216X
JO - Aesthetic Plastic Surgery
JF - Aesthetic Plastic Surgery
ER -