Improving serious illness conversations in oncology: A machine learning approach that integrates natural language processing for mortality prediction.

Authors

Prathamesh Parchure, Marcos Vargas, Livingston Graham, Min-heng Wang, Ksenia O. Gorbenko, Madhu Mazumdar, Cardinale B. Smith, Arash Kia

Organizations

Institute for Healthcare Delivery Science, Icahn School of Medicine at Mount Sinai, New York, NY; SUNY Downstate Health Sciences University, Brooklyn, NY; Touro College of Osteopathic Medicine, Middletown, NY; Icahn School of Medicine at Mount Sinai, New York, NY; Institute for Healthcare Delivery Science, Tisch Cancer Institute, Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY

Research Funding

U.S. National Institutes of Health

Background: Machine learning (ML)-based mortality prediction tools in oncology can optimize clinical decisions and prompt end-of-life care discussions. Patients with advanced cancer who have engaged in Goals of Care (GoC) conversations report improved quality of life and better care alignment. However, oncologists often make overly optimistic prognoses and miss timely GoC discussions. Clinical notes are a valuable source of information, but processing and extracting data from them is time-consuming and labor-intensive. To address this issue, we developed an ML application that ingests clinical notes and structured data from electronic health records (EHRs) to generate a 180-day mortality risk, prompting oncologists to hold GoC conversations.

Methods: A predictive ML model was developed using data from cancer patients aged 21 and above who were diagnosed between January 2016 and December 2021. Data were collected from several sources, including cancer and death registries and the EHR. A clinical profile was created for each patient by analyzing structured data together with unstructured data from ambulatory progress notes. The model used Spark-NLP for preprocessing, applying word2vec embeddings and pre-trained named entity recognition (NER) models to extract information on diseases, symptoms, procedures, treatments, and medications. Feature engineering techniques were used to select the best NLP features, which were then combined with structured data. The model was trained on 894 patients using a random forest classifier with 10-fold cross-validation and tested on a separate set of 43,274 patients. Performance was evaluated using ROC AUC, PR AUC, and F1 score.

Results: After fine-tuning, the best model showed a ROC AUC of 0.88 on the training set and 0.75 on the test set. At a threshold of 0.44, the model achieved balanced performance, with a sensitivity of 0.70 and a specificity of 0.71 on the test set.

Conclusions: Our team pioneered the development of an automated multi-modality pipeline that combines unstructured real-world data with structured data, allowing for training and testing of a fusion model. This automation opens the door to scaling and dissemination to enhance mortality prediction. Future work will involve qualitative analysis of implementation and acceptance in clinical practice.
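
As an illustration of the threshold-based metrics reported in the Results, the following is a minimal sketch of how sensitivity and specificity could be computed once predicted risks are dichotomized at a cutoff such as 0.44. It is not the authors' code: the function name `sensitivity_specificity`, the convention that a risk at or above the cutoff counts as high risk, and the toy labels and scores are all assumptions made for this example, and the abstract does not state how the 0.44 threshold was selected.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], risk: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when risk >= threshold is flagged high risk.

    y_true uses 1 for patients who died within the prediction window, 0 otherwise.
    """
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, risk):
        flagged = score >= threshold  # assumed convention: >= cutoff means high risk
        if label == 1:
            if flagged:
                tp += 1  # death correctly flagged
            else:
                fn += 1  # death missed
        else:
            if flagged:
                fp += 1  # survivor incorrectly flagged
            else:
                tn += 1  # survivor correctly not flagged
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy data, not taken from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
risk = [0.91, 0.62, 0.47, 0.30, 0.55, 0.41, 0.38, 0.20, 0.12, 0.05]

sens, spec = sensitivity_specificity(y_true, risk, threshold=0.44)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Raising the cutoff generally trades sensitivity for specificity, so a threshold at which the two are close (as with 0.70 and 0.71 in the Results) represents one possible operating point among many.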

Abstract Details

Meeting

2023 ASCO Quality Care Symposium

Session Type

Poster Session

Session Title

Poster Session B

Track

Health Care Access, Equity, and Disparities; Technology and Innovation in Quality of Care; Palliative and Supportive Care

Sub Track

Use of IT/Analytics to Improve Quality

Citation

JCO Oncol Pract 19, 2023 (suppl 11; abstr 590)

DOI

10.1200/OP.2023.19.11_suppl.590

Abstract #

590

Poster Bd #

N13
