Validating the use of machine-learning cancer staging algorithms for Medicare cost analyses.

Authors

Rebecca Smith

Milliman, Inc., New York, NY

Rebecca Smith, Lesley-Ann Miller-Wilson, Gebra Cuyun Carter, Ifrah Fayyaz, Andreyah Pope, Bruce Pyenson

Organizations

Milliman, Inc., New York, NY; Exact Sciences Corporation, Madison, WI

Research Funding

Pharmaceutical/Biotech Company

Exact Sciences Corporation

Background: Administrative claims provide valuable real-world insight into the care of cancer patients; however, claims data lack cancer stage information. This limitation constrains research on the value of early diagnosis and treatment, as well as on the costs and savings associated with increased cancer screening. In prior work, our team used SEER-Medicare data to develop machine learning (ML) algorithms that stage non-small cell lung (NSCLC), colon (CC), and rectal (RC) cancer patients using clinical flags derived from claims data. These algorithms were 69% (RC), 78% (NSCLC), and 83% (CC) accurate at matching incident cancer patients with their SEER-recorded AJCC stage (SEER-stage) at diagnosis. This work sought to test whether these ML algorithms are sufficiently accurate for use in claim cost analyses.

Methods: Incident NSCLC, CC, and RC patients were identified using 2016-2017 SEER-Medicare data and assigned a cancer stage (ML-stage) using a claims-based predictive multinomial logistic regression model (R Statistical Software v4.1.2, R Core Team 2021; nnet package, Venables and Ripley 2002). Patients' cumulative medical and pharmacy costs were summarized for 12 months starting with each patient's index month. The Medicare index month was set equal to the month of the patient's first claim with a cancer diagnosis. The SEER index month was set equal to the diagnosis month associated with the patient's incident tumor record in SEER. Patients with each cancer type were then grouped two ways: by ML-stage and by SEER-stage. Median patient costs were compared between stage groups for each cancer type, and differences were tested for statistical significance using the Wilcoxon rank-sum test.

Results: For NSCLC and CC, raw differences in median 12-month claim costs between the ML- and SEER-stage cohorts were small (1%-4%). Cost differences for RC were higher (7%-17%). ML and SEER costs were not significantly different (p > 0.05) between later-stage cohorts (NSCLC stages 3 and 4, CC stages 2C/3 and 4, and RC stage 4); however, early-stage groups were always significantly different (p < 0.05).

Conclusions: Although costs were not statistically equivalent across all stage groups, the similarity of ML and SEER costs across higher-stage cohorts and the small raw differences in median costs for each NSCLC and CC group suggest that ML algorithms with higher accuracy may be used to develop costs from administrative data for stage shift modeling and cost tradeoff analyses.
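
The cost summarization described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record layout, the field names (`ml_stage`, `seer_stage`, `medicare_index`, `seer_index`, `claims`), and the toy amounts are all hypothetical. Pairing the ML-stage grouping with the Medicare index month and the SEER-stage grouping with the SEER index month is also an assumption, since the abstract does not spell out that pairing.

```python
from collections import defaultdict
from statistics import median


def month_number(year, month):
    """Convert a calendar (year, month) into a running month count."""
    return year * 12 + (month - 1)


def cost_in_first_12_months(claims, index_month):
    """Sum claim costs in the 12 calendar months starting at index_month.

    claims: iterable of (year, month, cost) tuples for one patient.
    index_month: (year, month) tuple marking the first month of the window.
    """
    start = month_number(*index_month)
    return sum(
        cost
        for year, month, cost in claims
        if 0 <= month_number(year, month) - start < 12
    )


def median_cost_by_stage(patients, stage_field, index_field):
    """Group patients by a stage field and return each group's median cost."""
    costs = defaultdict(list)
    for patient in patients:
        total = cost_in_first_12_months(patient["claims"], patient[index_field])
        costs[patient[stage_field]].append(total)
    return {stage: median(values) for stage, values in costs.items()}


# Hypothetical toy records; field names and amounts are illustrative only.
patients = [
    {"ml_stage": "3", "seer_stage": "3",
     "medicare_index": (2016, 3), "seer_index": (2016, 2),
     "claims": [(2016, 2, 500.0), (2016, 3, 1200.0),
                (2017, 2, 300.0), (2017, 3, 900.0)]},
    {"ml_stage": "3", "seer_stage": "4",
     "medicare_index": (2016, 5), "seer_index": (2016, 5),
     "claims": [(2016, 5, 2000.0), (2016, 9, 400.0)]},
]

# Assumed pairing: ML-stage with Medicare index, SEER-stage with SEER index.
ml_medians = median_cost_by_stage(patients, "ml_stage", "medicare_index")
seer_medians = median_cost_by_stage(patients, "seer_stage", "seer_index")
```

In this toy data the first patient's two index months differ by one month, so the two 12-month windows capture different claims. This is one way the same patient can contribute different costs to the ML-stage and SEER-stage cohorts even when both stages agree.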

Cancer	Stage	Sample Size	Median Cost (ML)	Median Cost (SEER)	Δ (ML-SEER)	p-value
NSCLC	0/1/2	4,904	$46,188	$48,024	-$1,836	0.005
NSCLC	3	2,700	$70,975	$72,345	-$1,370	0.433
NSCLC	4	5,835	$61,177	$63,392	-$2,215	0.053
CC	0/1/2A/2B	3,638	$37,020	$38,230	-$1,209	0.013
CC	2C/3	2,125	$56,722	$58,838	-$2,116	0.071
CC	4	1,336	$67,624	$70,323	-$2,698	0.237
RC	0/1/2A/2B	650	$43,895	$53,685	-$9,790	<0.001
RC	2C/3	517	$66,976	$78,292	-$11,316	<0.001
RC	4	247	$65,879	$70,723	-$4,845	0.115
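
The p-values in the table come from Wilcoxon rank-sum tests on patient-level costs, which are not included in the abstract. As a rough illustration of how such a test works, the sketch below implements a two-sided rank-sum test in plain Python. It assumes a normal approximation with a tie correction and no continuity correction, which may differ from the settings the authors used; the function names are hypothetical.

```python
import math
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test using a normal approximation.

    Returns (U statistic for x, z score, two-sided p-value).
    """
    pooled = list(x) + list(y)
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    # Tie correction: each group of t tied values reduces the variance.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u1, 0.0, 1.0  # all values identical: no evidence of a shift
    z = (u1 - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u1, z, p
```

A call such as `rank_sum_test(ml_costs, seer_costs)`, where the two arguments are lists of 12-month costs for one cancer type and stage group, would return the test statistic and p-value. For small samples an exact test is usually preferred over the normal approximation.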

Abstract Details

Meeting

2023 ASCO Annual Meeting

Session Type

Publication Only

Session Title

Publication Only: Care Delivery and Regulatory Policy

Track

Care Delivery and Quality Care

Sub Track

Clinical Informatics/Advanced Algorithms/Machine Learning

Citation

J Clin Oncol 41, 2023 (suppl 16; abstr e13548)

DOI

10.1200/JCO.2023.41.16_suppl.e13548

Abstract #

e13548
