Appropriateness and Reliability of an Online Artificial Intelligence Platform's Responses to Common Questions Regarding Distal Radius Fractures

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

Purpose: Chat Generative Pre-Trained Transformer (ChatGPT) is a novel artificial intelligence chatbot that is changing the way humans gather information online. The purpose of this study was to investigate ChatGPT's ability to appropriately and reliably answer common questions regarding distal radius fractures.

Methods: Thirty common questions regarding distal radius fractures were presented in an identical manner to the online ChatGPT-3.5 interface three separate times, yielding 90 unique responses because ChatGPT produces an original answer with each query. All responses were graded as “appropriate,” “appropriate but incomplete,” or “inappropriate” by a consensus discussion among three hand surgeon reviewers. The questions were additionally subcategorized into one of four domains based on Bloom's cognitive learning taxonomy, and descriptive statistics were reported.

Results: Seventy of the 90 total responses (78%) produced by ChatGPT were “appropriate,” and 29 of the 30 questions (97%) had at least one response considered appropriate (of the three possible). However, only 17 of the 30 questions (57%) were answered appropriately on all three iterations. The test–retest reliability of ChatGPT was poor with an intraclass correlation coefficient of 0.12. Finally, ChatGPT performed best answering questions requiring lower-order thinking skills (Bloom's levels 1–3) and less well on level 4 questions.

Conclusions: This study found that although ChatGPT has the capability to answer common questions regarding distal radius fractures, caution should be taken before implementing its use, given ChatGPT's inconsistency in providing a complete and accurate response to the same question every time.

Clinical relevance: As the popularity and technology of ChatGPT continue to grow, it is important to understand the potential and limitations of this platform to determine how it may be best implemented to improve patient care.
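
The percentages reported in the Results can be re-derived from the counts given in the abstract. The following is a minimal Python sketch of that arithmetic only; the function and variable names are illustrative and do not come from the study, and the intraclass correlation coefficient is not reproduced because the per-response grades are not listed here.

```python
def percent(count: int, total: int) -> int:
    """Return count/total as a percentage rounded to the nearest whole number."""
    return round(100 * count / total)


# Counts as stated in the abstract (names are illustrative assumptions).
n_questions = 30
n_iterations = 3
n_responses = n_questions * n_iterations  # 30 questions x 3 runs

appropriate_responses = 70           # responses graded "appropriate"
questions_any_appropriate = 29       # questions with at least one appropriate response
questions_all_appropriate = 17       # questions appropriate on all three runs

summary = {
    "appropriate responses": percent(appropriate_responses, n_responses),
    "questions with >= 1 appropriate response": percent(questions_any_appropriate, n_questions),
    "questions appropriate on all runs": percent(questions_all_appropriate, n_questions),
}

for label, value in summary.items():
    print(f"{label}: {value}%")
```

The rounded values agree with the 78%, 97%, and 57% figures reported in the Results.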

Original language: English
Pages (from-to): 91-98
Number of pages: 8
Journal: Journal of Hand Surgery
Volume: 49
Issue number: 2
State: Published - Feb 2024

Keywords

  • Artificial intelligence
  • ChatGPT
  • distal radius fractures
  • patient education
