Dr. GPT will see you now: the ability of large language model-linked chatbots to provide colorectal cancer screening recommendations / Huo, Bright; Mckechnie, Tyler; Ortenzi, Monica; Lee, Yung; Antoniou, Stavros; Mayol, Julio; Ahmed, Hassaan; Boudreau, Vanessa; Ramji, Karim; Eskicioglu, Cagla. - In: HEALTH AND TECHNOLOGY. - ISSN 2190-7188. - 4:3(2024), pp. 463-469. [10.1007/s12553-024-00836-9]

Dr. GPT will see you now: the ability of large language model-linked chatbots to provide colorectal cancer screening recommendations

Monica Ortenzi;
2024-01-01

Abstract

Purpose: This study assessed the performance of LLM-linked chatbots in providing accurate advice for colorectal cancer screening to both clinicians and patients. Methods: We created standardized prompts for nine patient cases varying by age and family history to query ChatGPT, Bing Chat, Google Bard, and Claude 2 for screening recommendations to clinicians. Chatbots were asked to specify which screening test was indicated and the frequency of interval screening. Separately, the chatbots were queried with lay terminology for screening advice to patients. Clinician and patient advice was compared to guidelines from the United States Preventive Services Task Force (USPSTF), Canadian Cancer Society (CCS), and the U.S. Multi-Society Task Force (USMSTF) on Colorectal Cancer. Results: Based on USPSTF criteria, clinician advice aligned with 3/4 (75.0%), 2/4 (50.0%), 3/4 (75.0%), and 1/4 (25.0%) cases for ChatGPT, Bing Chat, Google Bard, and Claude 2, respectively. With CCS criteria, clinician advice corresponded to 2/4 (50.0%), 2/4 (50.0%), 2/4 (50.0%), and 1/4 (25.0%) cases for ChatGPT, Bing Chat, Google Bard, and Claude 2, respectively. For USMSTF guidelines, clinician advice aligned with 7/9 (77.8%), 5/9 (55.6%), 6/9 (66.7%), and 3/9 (33.3%) cases for ChatGPT, Bing Chat, Google Bard, and Claude 2, respectively. Discordant advice was given to clinicians and patients in 2/9 (22.2%), 3/9 (33.3%), 2/9 (22.2%), and 3/9 (33.3%) cases for ChatGPT, Bing Chat, Google Bard, and Claude 2, respectively. Clinical advice provided by the chatbots stemmed from a range of sources, including the American Cancer Society (ACS), USPSTF, USMSTF, and the CCS. Conclusion: LLM-linked chatbots provide colorectal cancer screening recommendations with inconsistent accuracy for both patients and clinicians. Clinicians must educate patients on the pitfalls of using these platforms for health advice.
Files in this record:
File: Huo_Dr. GPT-will-see-you-now_2024.pdf
Access: Archive administrators only
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 750.48 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this record: https://hdl.handle.net/11566/328111
Citations
  • PMC: not available
  • Scopus: 0
  • Web of Science: 0