Dr. GPT will see you now: the ability of large language model-linked chatbots to provide colorectal cancer screening recommendations / Huo, Bright; Mckechnie, Tyler; Ortenzi, Monica; Lee, Yung; Antoniou, Stavros; Mayol, Julio; Ahmed, Hassaan; Boudreau, Vanessa; Ramji, Karim; Eskicioglu, Cagla. - In: HEALTH AND TECHNOLOGY. - ISSN 2190-7188. - 4:3(2024), pp. 463-469. [10.1007/s12553-024-00836-9]
Dr. GPT will see you now: the ability of large language model-linked chatbots to provide colorectal cancer screening recommendations
Monica Ortenzi
2024-01-01
Abstract
Purpose: This study assessed the performance of LLM-linked chatbots in providing accurate advice for colorectal cancer screening to both clinicians and patients. Methods: We created standardized prompts for nine patient cases varying by age and family history to query ChatGPT, Bing Chat, Google Bard, and Claude 2 for screening recommendations to clinicians. Chatbots were asked to specify which screening test was indicated and the frequency of interval screening. Separately, the chatbots were queried with lay terminology for screening advice to patients. Clinician and patient advice was compared to guidelines from the United States Preventive Services Task Force (USPSTF), Canadian Cancer Society (CCS), and the U.S. Multi-Society Task Force (USMSTF) on Colorectal Cancer. Results: Based on USPSTF criteria, clinician advice aligned with 3/4 (75.0%), 2/4 (50.0%), 3/4 (75.0%), and 1/4 (25.0%) cases for ChatGPT, Bing Chat, Google Bard, and Claude 2, respectively. With CCS criteria, clinician advice corresponded to 2/4 (50.0%), 2/4 (50.0%), 2/4 (50.0%), and 1/4 (25.0%) cases for ChatGPT, Bing Chat, Google Bard, and Claude 2, respectively. For USMSTF guidelines, clinician advice aligned with 7/9 (77.8%), 5/9 (55.6%), 6/9 (66.7%), and 3/9 (33.3%) cases for ChatGPT, Bing Chat, Google Bard, and Claude 2, respectively. Discordant advice was given to clinicians and patients for 2/9 (22.2%), 3/9 (33.3%), 2/9 (22.2%), and 3/9 (33.3%) cases for ChatGPT, Bing Chat, Google Bard, and Claude 2, respectively. Clinical advice provided by the chatbots stemmed from a range of sources including the American Cancer Society (ACS), USPSTF, USMSTF, and the CCS. Conclusion: LLM-linked chatbots provide colorectal cancer screening recommendations with inconsistent accuracy for both patients and clinicians. Clinicians must educate patients on the pitfalls of using these platforms for health advice.
File: Huo_Dr. GPT-will-see-you-now_2024.pdf
Access: Archive administrators only (View/Open: request a copy)
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 750.48 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.