Accuracy and Appropriateness of Alopecia Areata Information Obtained From ChatGPT

According to a study published in Dermatology, ChatGPT, a tool built on a large language model, provided mostly appropriate information in response to common patient concerns about alopecia areata (AA).

Researchers aimed to assess the appropriateness and accuracy of responses generated by ChatGPT, specifically ChatGPT 3.5 and ChatGPT 4.0, to common patient questions about AA. Patients often turn to various sources for information about AA, including artificial intelligence-based tools like ChatGPT, making it crucial to evaluate the quality of information provided.

The researchers posed 25 common patient questions about AA to ChatGPT 3.5 and ChatGPT 4.0. Attending dermatologists at an academic center then assessed the generated responses for both appropriateness and accuracy. Appropriateness was evaluated in 2 hypothetical contexts: patient-facing general information websites and electronic health record (EHR) message drafts.

The overall accuracy score for responses was 4.41 out of 5, indicating a high level of accuracy. ChatGPT 4.0 generated responses with a slightly higher mean accuracy score (4.53) than ChatGPT 3.5 (4.29). In terms of appropriateness, both ChatGPT versions provided mostly appropriate information, with 100% of responses rated as appropriate for general questions. For questions related to management in an EHR message draft, however, the appropriateness rating was lower, at 79%.

Interestingly, the dermatologists largely preferred the responses generated by ChatGPT 4.0 over ChatGPT 3.5, indicating an improvement in response quality with the newer iteration.

Reviewer agreement, as measured by Fleiss' κ coefficient, was moderate across all questions, with a 53.7% agreement rate, indicating a reasonable level of consensus among the dermatologists.
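
For readers unfamiliar with the statistic, Fleiss' κ compares the agreement observed among several raters with the agreement expected by chance. The sketch below is illustrative only: the function name `fleiss_kappa` and the sample ratings are hypothetical, are not taken from the study, and do not reproduce the authors' analysis.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Compute Fleiss' kappa.

    ratings: list of items; each item is a list of category labels,
             one label per rater (every item needs the same rater count).
    categories: list of all possible category labels.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    # counts[i][j] = number of raters who assigned item i to category j
    counts = [[Counter(item)[c] for c in categories] for item in ratings]

    # Per-item agreement: share of rater pairs that agree on the item
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_items) / n_items

    # Chance agreement from the overall proportion of each category
    total = n_items * n_raters
    p_cats = [sum(row[j] for row in counts) / total
              for j in range(len(categories))]
    p_expected = sum(p * p for p in p_cats)

    if p_expected == 1:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical example: 3 raters label 4 responses (made-up data)
sample = [
    ["appropriate", "appropriate", "appropriate"],
    ["appropriate", "appropriate", "inappropriate"],
    ["inappropriate", "inappropriate", "inappropriate"],
    ["appropriate", "inappropriate", "inappropriate"],
]
kappa = fleiss_kappa(sample, ["appropriate", "inappropriate"])
print(round(kappa, 3))
```

A value of 1 means perfect agreement, while values near 0 mean agreement no better than chance.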

“While not all responses were accurate, the trend toward improvement with newer iterations suggests potential future utility for patients and dermatologists,” the authors concluded.

Reference
O'Hagan R, Kim RH, Abittan BJ, Caldas S, Ungar J, Ungar B. Trends in accuracy and appropriateness of alopecia areata information obtained from a popular online large language model, ChatGPT. Dermatology. Published online September 18, 2023. doi:10.1159/000534005

 

© 2023 HMP Global. All Rights Reserved.
Any views and opinions expressed are those of the author(s) and/or participants and do not necessarily reflect the views, policy, or position of The Dermatologist or HMP Global, their employees, and affiliates. 
