American Academy of Orthopedic Surgeons' OrthoInfo provides more readable information regarding meniscus injury than ChatGPT-4 while information accuracy is comparable

Posted on 23/02/2025

by Camden Bohn

J ISAKOS. 2025 Feb 21:100843. doi: 10.1016/j.jisako.2025.100843. Online ahead of print.

ABSTRACT

INTRODUCTION: Over 61% of Americans seek health information online, often using artificial intelligence (AI) tools like ChatGPT. However, concerns persist about the readability and accessibility of AI-generated content, especially for individuals with varying health literacy levels. This study compares the readability and accuracy of ChatGPT responses on meniscus injuries with those from the American Academy of Orthopedic Surgeons' OrthoInfo website, which is tailored for patient education. We hypothesize that while ChatGPT offers accurate information, its readability will be lower than that of OrthoInfo.

METHODS: Seven frequently asked questions about meniscus injuries were used to compare responses from ChatGPT-4 and OrthoInfo. Readability was assessed using multiple calculators (Flesch-Kincaid, Gunning Fog, Coleman-Liau, SMOG Readability Formula, FORCAST Readability Formula, Fry Graph, Raygor Readability Estimate), and accuracy was evaluated by three independent reviewers on a 4-point scale. Statistical analysis included independent t-tests to compare readability and accuracy between the two sources.
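As an illustration of two of the measures named above, the following minimal sketch computes the Flesch Reading Ease score and the Flesch-Kincaid grade level from word, sentence, and syllable counts using the standard published formulas. It is not the calculator used in the study: the function names are hypothetical, and the syllable counter is a crude vowel-group heuristic assumed here only for demonstration.

```python
import re


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula; higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid formula; result approximates a US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def count_syllables(word: str) -> int:
    """Assumption: approximate syllables as runs of consecutive vowels (min 1)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def text_counts(text: str) -> tuple[int, int, int]:
    """Return (words, sentences, syllables) using simple regex tokenization."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(words), len(sentences), syllables


if __name__ == "__main__":
    # Hypothetical sample text, not taken from ChatGPT or OrthoInfo.
    sample = "The meniscus is a pad of cartilage in the knee. It can tear during sports."
    w, s, syl = text_counts(sample)
    print(f"Reading ease: {flesch_reading_ease(w, s, syl):.1f}")
    print(f"Grade level:  {flesch_kincaid_grade(w, s, syl):.1f}")
```

Dedicated readability tools use dictionary-based syllable counts and more careful sentence splitting, so their scores will differ somewhat from this heuristic.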

RESULTS: ChatGPT responses required a significantly higher education level to comprehend, with an average reading grade level of 13.8 compared to 9.8 for OrthoInfo (p < 0.01). The Flesch Reading Ease Index also indicated lower readability for ChatGPT (32.0 vs. 59.9, p < 0.01). However, both ChatGPT and OrthoInfo responses were highly accurate, with all but one ChatGPT response receiving the highest accuracy rating of 4. The exception, the response on physical exam findings, was rated less accurate, but the difference was not statistically significant (3.3 vs. 3.6, p = 0.52).

CONCLUSION: While AI-generated responses were accurate, their lower readability made them less accessible than OrthoInfo, which is designed for a broad audience. This study underscores the importance of clear, accessible information on meniscal injuries and suggests that AI tools should incorporate readability metrics to enhance patient comprehension. Despite the potential of AI, resources like OrthoInfo remain essential for effectively communicating health information to the public.

LEVEL OF EVIDENCE: IV.

PMID:39988021 | DOI:10.1016/j.jisako.2025.100843