American Academy of Orthopedic Surgery OrthoInfo provides more readable information regarding rotator cuff injury than ChatGPT

Written on 14/02/2025
by Catherine Hand

J ISAKOS. 2025 Feb 12:100841. doi: 10.1016/j.jisako.2025.100841. Online ahead of print.

ABSTRACT

INTRODUCTION: With over 61% of Americans seeking health information online, the accuracy and readability of this content are critical. AI tools such as ChatGPT have gained popularity as sources of medical information, but concerns remain about their accessibility, especially for individuals with lower literacy levels. This study compares the readability and accuracy of ChatGPT-generated content with information from the American Academy of Orthopaedic Surgeons (AAOS) OrthoInfo website, focusing on rotator cuff injuries.

METHODS: We formulated seven frequently asked questions about rotator cuff injuries, based on the OrthoInfo website, and gathered responses from both ChatGPT-4 and OrthoInfo. Readability was assessed using multiple readability metrics (Flesch-Kincaid, Gunning Fog, Coleman-Liau, SMOG Readability Formula, FORCAST Readability Formula, Fry Graph, Raygor Readability Estimate), while accuracy was evaluated by three independent reviewers. Statistical analysis included t-tests and correlation analysis.
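Several of the metrics listed above are simple formulas over sentence length and syllable counts. As a rough illustration (not the study's actual scoring pipeline), the two Flesch measures reported in the results can be sketched in Python; the syllable counter below is a crude vowel-group heuristic, not the dictionary-based counting that dedicated readability tools use:

```python
import re

def count_syllables(word):
    # Heuristic: count vowel groups, subtract a likely-silent trailing 'e'.
    # Real readability tools use dictionary lookups; this is an approximation.
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1 and not word.endswith(("le", "ee")):
        count -= 1
    return max(count, 1)

def readability(text):
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level) for `text`."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / len(sentences)   # average words per sentence
    spw = syllables / len(words)        # average syllables per word
    # Standard published coefficients for both formulas:
    flesch_ease = 206.835 - 1.015 * wps - 84.6 * spw
    fk_grade = 0.39 * wps + 11.8 * spw - 15.59
    return flesch_ease, fk_grade
```

Longer sentences and more syllables per word lower the Reading Ease score and raise the grade level, which is exactly the pattern the results report for ChatGPT versus OrthoInfo.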

RESULTS: ChatGPT responses required a higher education level to comprehend, with an average grade level of 14.7 versus OrthoInfo's 11.9 (p < 0.01). The Flesch Reading Ease Index likewise indicated that OrthoInfo's content (52.5) was more readable than ChatGPT's (25.9, p < 0.01). Both sources were highly accurate, with ChatGPT scoring slightly lower on the question about further damage to the rotator cuff (p < 0.05).

CONCLUSION: ChatGPT shows promise in delivering accurate health information but may not be suitable for all patients due to its higher complexity. A combination of AI and expert-reviewed, accessible content may enhance patient understanding and health literacy. Future developments should focus on improving AI's adaptability to different literacy levels.

LEVEL OF EVIDENCE: IV.

PMID:39952325 | DOI:10.1016/j.jisako.2025.100841