TY - JOUR
T1 - ChatGPT-3.5 Responses Require Advanced Readability for the General Population and May Not Effectively Supplement Patient-Related Information Provided by the Treating Surgeon Regarding Common Questions About Rotator Cuff Repair
AU - Eng, Emma
AU - Mowers, Colton
AU - Sachdev, Divesh
AU - Yerke-Hansen, Payton
AU - Jackson, Garrett R.
AU - Knapik, Derrick M.
AU - Sabesan, Vani J.
N1 - Publisher Copyright:
© 2024 Arthroscopy Association of North America
PY - 2024
Y1 - 2024
AB - Purpose: To investigate the accuracy of ChatGPT-3.5's responses to frequently asked questions prior to rotator cuff repair surgery. Methods: The 10 most frequently asked questions related to rotator cuff repair were compiled from 4 institutional websites. Questions were then input into ChatGPT-3.5 in a single session. The ChatGPT-3.5 responses were analyzed by 2 orthopaedic surgeons for reliability, quality, and readability using the Journal of the American Medical Association (JAMA) Benchmark criteria, the DISCERN score, and the Flesch-Kincaid Grade Level. Results: The JAMA Benchmark criteria score was 0, indicating the absence of reliable source material citations. The mean Flesch-Kincaid Grade Level was 13.4 (range, 11.2-15.0). The mean DISCERN score was 43.4 (range, 36-51), indicating that the overall quality of the responses was fair. All responses advised that final decisions be made with the treating physician. Conclusions: ChatGPT-3.5 provided substandard patient-related information as a supplement to the recommendations of the treating surgeon regarding common questions about rotator cuff repair surgery. Additionally, the responses lacked reliable source material citations, and their readability was relatively advanced, with a complex language style. Clinical Relevance: The findings of this study suggest that ChatGPT-3.5 may not effectively supplement patient-related information provided by the treating surgeon prior to rotator cuff repair surgery.
UR - http://www.scopus.com/inward/record.url?scp=85196486653&partnerID=8YFLogxK
U2 - 10.1016/j.arthro.2024.05.009
DO - 10.1016/j.arthro.2024.05.009
M3 - Article
C2 - 38777000
AN - SCOPUS:85196486653
SN - 0749-8063
JO - Arthroscopy - Journal of Arthroscopic and Related Surgery
JF - Arthroscopy - Journal of Arthroscopic and Related Surgery
ER -