Development and validation of a natural language processing algorithm using electronic health record data to identify patients with breast cancer with low social support.

Authors

Candyce Kroenke

Division of Research, Kaiser Permanente Northern California, Pleasanton, CA

Candyce Kroenke, Rhonda Aoki, Lauren Mammini, David Cronkite, Stacey Alexeeff, Salene M. W. Jones, Lawrence H. Kushi, Shaila Strayhorn, Jessica Mogk, David Mosen, David Carrell

Organizations

Division of Research, Kaiser Permanente Northern California, Pleasanton, CA; Kaiser Permanente Washington Health Research Institute, Seattle, WA; Fred Hutchinson Cancer Center, Seattle, WA; The University of North Carolina Wilmington, Wilmington, NC; Kaiser Permanente Center for Health Research, Portland, OR

Research Funding

National Cancer Institute

Background: Social support is important to the management of breast cancer treatment. Our team has organized electronic health record (EHR) data into structured ‘concept groups’ that will form the basis for EHRsupport, a computable, EHR-based measure of social support. We report the evaluation of these concept groups against chart review as part of our validation.

Methods: We built a natural language processing (NLP) algorithm on clinical notes from 7,989 women diagnosed with invasive breast cancer from January 2006 to September 2021. We identified and developed 10 concept groups from unstructured data: 1) living situation, 2) marital/partner status, 3) parenthood status, 4) visit support (accompanied patient to ≥1 visit, patient attended alone at ≥1 visit), 5) friends/other support, 6) explicit positive or negative mentions of social support, 7) mention of a deceased person, 8) transportation issues, 9) relationship conflict or stress, and 10) social isolation. We validated the concept groups against the charts of 100 patients randomly drawn from the broader patient population (nonoverlapping with the training data set), also around the time of diagnosis (-1 to +3 months).

Results: Concept group data availability ranged from 1.3% for social isolation to 98.3% for living situation. Specificity and negative predictive values (NPV) were moderate to high for all concept groups. Sensitivity and positive predictive values (PPV) were moderate to high for concept groups with high data availability (Table) and lower for concept groups with low data availability.

Conclusions: Data on social support have been available since the advent of Epic in 2006, and our NLP-based algorithm accurately captured systematically collected data within the EHR, supporting the development of a clinical tool that can be used to identify patients at risk of low social support.

EHRsupport concept groups, data availability (n=7,989), and evaluation vs. chart review.

Concept group                          Availability (%)  Sensitivity  Specificity  PPV  NPV
Living situation (e.g., alone or not)  98.3              100          92           61   100
Partner/spouse                         92.0              85           100          100  81
Parenthood status                      88.8              95           80           93   83
Visit support                          82.9              92           77           96   59
Positive mentions social support*      46.9              75           100          100  70
Negative mentions social support                         33           99           67   96
Friend/other support                   46.8              81           81           79   83
Deceased person                        39.9              86           97           95   90
Transportation issues                  14.6              80           85           22   99
Relationship conflict/stress           7.7               25           95           29   94

*Availability of positive and negative mentions of social support.
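
The four evaluation measures reported in the table (sensitivity, specificity, PPV, NPV) are standard functions of a 2x2 comparison between the algorithm's output and chart review. The following is a minimal sketch of that arithmetic, not the authors' code. The names `ConfusionCounts` and `validation_metrics` are hypothetical, and the example counts are illustrative only: the abstract does not report the underlying counts for any concept group.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 counts for one concept group, with chart review as the reference."""
    tp: int  # algorithm positive, chart review positive
    fp: int  # algorithm positive, chart review negative
    fn: int  # algorithm negative, chart review positive
    tn: int  # algorithm negative, chart review negative


def validation_metrics(c: ConfusionCounts) -> Dict[str, Optional[int]]:
    """Return each measure as a whole-number percentage, or None if undefined."""

    def pct(numerator: int, denominator: int) -> Optional[int]:
        # A measure is undefined when its denominator is zero.
        return round(100 * numerator / denominator) if denominator else None

    return {
        "sensitivity": pct(c.tp, c.tp + c.fn),
        "specificity": pct(c.tn, c.tn + c.fp),
        "ppv": pct(c.tp, c.tp + c.fp),
        "npv": pct(c.tn, c.tn + c.fn),
    }


# Hypothetical counts for 100 reviewed charts (NOT taken from the study):
# a rare concept, with only 5 chart-confirmed positives.
example = ConfusionCounts(tp=4, fp=14, fn=1, tn=81)
metrics = validation_metrics(example)
print(metrics)
```

With these made-up counts, sensitivity and specificity are both reasonably high, yet PPV is low because the few true positives are outnumbered by false positives. This is the general reason PPV tends to fall for concepts that are rarely documented, consistent with the pattern the abstract describes for concept groups with low data availability.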

Disclaimer

This material on this page is ©2024 American Society of Clinical Oncology, all rights reserved. Licensing available upon request. For more information, please contact licensing@asco.org

Abstract Details

Meeting

2024 ASCO Quality Care Symposium

Session Type

Poster Session

Session Title

Poster Session B

Track

Health Care Access, Equity, and Disparities; Technology and Innovation in Quality of Care; Survivorship

Sub Track

Use of IT/Analytics to Improve Quality

Citation

JCO Oncol Pract 20, 2024 (suppl 10; abstr 421)

DOI

10.1200/OP.2024.20.10_suppl.421

Abstract #

421

Poster Bd #

G23
