Relevance and accuracy of ChatGPT-generated NGS reports with treatment recommendations for oncogene-driven NSCLC.

Authors

Zac Hamilton

University of Illinois Chicago, Chicago, IL

Zac Hamilton, Noor Naffakh, Natalie Marie Reizine, Frank Weinberg, Shikha Jain, Vijayakrishna K. Gadi, Christopher Bun, Ryan Huu-Tuan Nguyen

Organizations

University of Illinois Chicago, Chicago, IL; Department of Hematology Oncology, University of Illinois Chicago College of Medicine, Chicago, IL; CancerIQ, Chicago, IL

Research Funding

Other Foundation
RHN is a recipient of the Robert A. Winn Diversity in Clinical Trials Career Development Award, funded by Bristol Myers Squibb Foundation

Background: Next-generation sequencing (NGS) is routine clinical practice in advanced NSCLC. NGS reports are information-dense, and their clinical interpretation remains a challenge. ChatGPT is a large language model (LLM) AI chatbot that generates text in response to user prompts. We sought to assess the clinical relevance and accuracy of ChatGPT-generated NGS reports with first-line (1L) treatment recommendations for NSCLC patients with targetable driver oncogenes.
Methods: Eight driver oncogenes with FDA-approved targeted treatments for 1L stage IV NSCLC were identified in the latest NCCN Clinical Practice Guidelines available to the AI model (version 5, September 2021). The prompt “Create a next-generation sequencing report with a list of first-line treatment options for a patient with stage IV non-small cell lung cancer with an [oncogenic driver].” was run in a separate “new chat” 10 times for each driver oncogene (n=80). Each ChatGPT output was recorded and scored. The Relevance Score (RS) awarded 1 point for every NCCN-preferred option and 0.5 points for each “other recommended” treatment listed in the AI-generated output, divided by the maximum possible score for the driver oncogene. Spurious recommendations were awarded 0 points. The Accuracy Score (AS) is the number of treatment options in a report that are listed in NCCN, divided by the total number of treatments in that report. The percentage of reports listing an NCCN-preferred 1L therapy, the percentage listing a clinical trial as an option, and the character and word counts were also captured.
Results: The average length of the AI-generated NGS reports was 117 words (range: 44–232). The median number of treatments recommended was 5 (range: 3–8). An oncogenic driver-specific preferred 1L treatment was included in 55 reports (68.8%), and a recommendation to explore clinical trials was listed in 43 reports (53.8%). The RS for the total sample was 0.59 (95% CI: 0.52–0.65), and the AS was 46.0% (95% CI: 40.2%–51.8%).
Conclusions: ChatGPT can rapidly generate concise NGS reports with treatment options for NSCLC with driver oncogenes. Recommendation relevance was moderate, and accuracy was limited with high variability across oncogenes. Overall, ChatGPT recommendations were promising given the complexity of the task with no prompting or training provided to the AI. As LLM AI platforms mature, they may generate more relevant and accurate NGS reports, offering a potentially valuable tool for NGS report annotation for clinicians, and increased accessibility for patients.
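The Relevance Score and Accuracy Score described in the Methods can be sketched in a few lines of Python. This is only an illustration of the scoring rules as stated in the abstract, not the authors' actual tooling. The `Guideline` class, the function names, and the placeholder treatment labels (`drug_A`, `drug_X`, and so on) are hypothetical, and treating "listed in NCCN" as the union of preferred and other recommended options is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guideline:
    """Hypothetical container for one oncogene's NCCN 1L options."""
    preferred: frozenset
    other_recommended: frozenset


def relevance_score(listed, guideline):
    """1 point per preferred option, 0.5 per 'other recommended' option,
    divided by the maximum possible score for this oncogene.
    Spurious recommendations earn 0 points."""
    listed = set(listed)
    earned = (1.0 * len(listed & guideline.preferred)
              + 0.5 * len(listed & guideline.other_recommended))
    max_score = (1.0 * len(guideline.preferred)
                 + 0.5 * len(guideline.other_recommended))
    return earned / max_score


def accuracy_score(listed, guideline):
    """Fraction of the report's treatments that appear in the guideline.
    Assumption: 'listed in NCCN' means preferred or other recommended."""
    listed = set(listed)
    if not listed:
        return 0.0
    in_guideline = listed & (guideline.preferred | guideline.other_recommended)
    return len(in_guideline) / len(listed)


# Placeholder labels only; these are not real NCCN treatment lists.
example_guideline = Guideline(
    preferred=frozenset({"drug_A"}),
    other_recommended=frozenset({"drug_B", "drug_C"}),
)
example_report = ["drug_A", "drug_B", "drug_X", "drug_Y"]

example_rs = relevance_score(example_report, example_guideline)
example_as = accuracy_score(example_report, example_guideline)
print(example_rs, example_as)
```

In this toy case the report earns 1.5 of a possible 2.0 relevance points (RS 0.75), while only two of its four listed treatments appear in the guideline (AS 0.5), which shows how a report can score well on relevance and still be diluted by spurious recommendations.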

Oncogene                   RS     Std Dev   AS (%)   Std Dev (%)
EGFR exon 19 del.          0.49   0.20      59.9     18.6
EGFR exon 21 L858R mut.    0.46   0.15      57.0     11.4
ALK rearrangement          0.75   0.13      73.6     21.9
ROS1 rearrangement         0.86   0.13      70.2     17.5
BRAF V600E mut.            0.67   0.00      20.3     2.8
NTRK1/2/3 gene fusion      1.00   0.00      48.7     11.1
METex14 skipping mut.      0.10   0.19      7.3      13.4
RET rearrangement          0.37   0.15      31.2     11.4
Total                      0.59   0.31      46.0     26.6
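As a consistency check on the table, the Total row can be reproduced from the per-oncogene rows. Because each oncogene was run 10 times, an unweighted mean of the eight rows equals the mean over all 80 reports. The sketch below also recomputes the AS confidence interval with a normal approximation (mean ± 1.96 × SD / √n); the abstract does not state how its intervals were calculated, so that formula is an assumption, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean

# Per-oncogene (RS, AS %) values copied from the table.
rows = {
    "EGFR exon 19 del.": (0.49, 59.9),
    "EGFR exon 21 L858R mut.": (0.46, 57.0),
    "ALK rearrangement": (0.75, 73.6),
    "ROS1 rearrangement": (0.86, 70.2),
    "BRAF V600E mut.": (0.67, 20.3),
    "NTRK1/2/3 gene fusion": (1.00, 48.7),
    "METex14 skipping mut.": (0.10, 7.3),
    "RET rearrangement": (0.37, 31.2),
}

# Equal weights are valid because every oncogene contributed 10 reports.
rs_total = mean(rs for rs, _ in rows.values())
as_total = mean(acc for _, acc in rows.values())

# Assumed normal-approximation 95% CI for AS, using the reported
# total mean (46.0), standard deviation (26.6), and n = 80.
n_reports = 80
half_width = 1.96 * 26.6 / sqrt(n_reports)
as_ci = (46.0 - half_width, 46.0 + half_width)

print(round(rs_total, 2), round(as_total, 1))
print(tuple(round(bound, 1) for bound in as_ci))
```

Under these assumptions the recomputed means round to the reported totals, and the AS interval lands on the reported 40.2% to 51.8%. Applying the same formula to the rounded RS standard deviation can differ from the published interval in the last digit, so rounding in the table should be kept in mind.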

Abstract Details

Meeting

2023 ASCO Annual Meeting

Session Type

Poster Session

Session Title

Care Delivery and Regulatory Policy

Track

Care Delivery and Quality Care

Sub Track

Clinical Informatics/Advanced Algorithms/Machine Learning

Citation

J Clin Oncol 41, 2023 (suppl 16; abstr 1555)

DOI

10.1200/JCO.2023.41.16_suppl.1555

Abstract #

1555

Poster Bd #

149
