Sequential learning for pan-tumor detection of metastatic disease progression.

Authors

Foad H. Green

Syapse, San Francisco, CA

Foad H. Green, Matthew J. Rioth, Joshua Loving

Organizations

Syapse, San Francisco, CA

Research Funding

No funding received

Background: Detecting progression in patients with early-stage cancer is a challenge for real-world data derived from electronic health records (EHR). Temporal patterns common to patients' data can be used to detect patients with metastatic progression. A deep learning approach on longitudinal patient records of cancer lab testing, visit diagnosis codes, and cancer treatments was used to identify metastatic status.

Methods: ICD-10-CM diagnosis codes, clinical lab values specific to cancer testing, and antineoplastic treatments were evaluated from longitudinal data of 128,614 patients who did not progress and 5,884 who developed metastasis, across 47 tumor types in the Syapse Learning Health Network. Patients who were metastatic at diagnosis were excluded. Data were included beginning at the time of primary cancer diagnosis and, for all surviving patients, ending at an established administrative cutoff date. In the metastatic group, visits after the progression event were included but were censored if they contained diagnosis codes specifically for secondary malignant neoplasms. A binary classifier was developed using a multi-headed attention transformer to determine metastatic status. Patient history was sequentially aggregated into visit-level embeddings across each patient's longitudinal record. Categorical layers were used for antineoplastics and diagnosis codes. Linear layers were used to embed lab values, as well as visit intervals from the primary cancer date and the administrative cutoff date. All embeddings were used as feature inputs for the classifier. Before training, random sampling of the non-metastatic population was applied to establish equally weighted labels in an 80:10:10 split for training, validation, and testing sets, to account for the smaller population of patients who progressed to metastatic status.

Results: This classification approach achieved 0.86 PR-AUC, 0.79 ROC-AUC, and 0.75 F1. This performance is comparable to that of published models trained for single-tumor cohort prediction. Under our censoring constraints, these model results are robust in the absence of routine signals of metastasis in the EHR, such as staging reports and diagnosis coding for distant metastases.

Conclusions: This method generalizes accurate classification of metastatic progression from patient visit history across multiple cancer types. This work is immediately useful for real-world evidence analyses, complementing the patients with metastases already captured in hospital registries. Using sequential deep learning with EHR data classes, this approach may be used to forecast metastatic progression for early intervention.
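The censoring and class-balancing steps described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code. The abstract does not list which diagnosis codes were censored, so the ICD-10-CM categories C77, C78, C79, and C7B (secondary malignant neoplasms) are an assumption here, as are the visit schema, the choice to drop a whole visit rather than only the offending code, the 1:1 undersampling, and the function names `censor_visits` and `balanced_split`.

```python
import random

# Assumption: ICD-10-CM categories for secondary malignant neoplasms.
# The abstract does not state the exact codes that were censored.
SECONDARY_PREFIXES = ("C77", "C78", "C79", "C7B")


def censor_visits(visits):
    """Drop visits carrying any secondary-malignancy diagnosis code.

    `visits` is a list of dicts with a "dx_codes" list (hypothetical schema).
    """
    return [
        v for v in visits
        if not any(c.upper().startswith(SECONDARY_PREFIXES) for c in v["dx_codes"])
    ]


def balanced_split(non_met_ids, met_ids, seed=0, ratio=(80, 10, 10)):
    """Undersample the majority class 1:1, then split each class by `ratio`.

    Returns (train, val, test), each a list of (patient_id, label) pairs,
    with label 1 for metastatic progression and 0 otherwise.
    """
    rng = random.Random(seed)
    met = list(met_ids)
    # Randomly sample as many non-metastatic patients as there are progressors.
    non_met = rng.sample(list(non_met_ids), k=len(met))
    total = sum(ratio)
    splits = ([], [], [])
    for label, ids in ((0, non_met), (1, met)):
        rng.shuffle(ids)
        n_train = len(ids) * ratio[0] // total
        n_val = len(ids) * ratio[1] // total
        bounds = (0, n_train, n_train + n_val, len(ids))
        # Splitting per class keeps labels equally weighted in every set.
        for i in range(3):
            splits[i].extend((pid, label) for pid in ids[bounds[i]:bounds[i + 1]])
    for part in splits:
        rng.shuffle(part)
    return splits
```

Under this reading, the 5,884 patients who progressed would be paired with 5,884 sampled non-metastatic patients, giving a balanced cohort of 11,768 before the 80:10:10 split. The abstract does not report the resulting set sizes, so that figure is an inference from the stated counts rather than a reported result.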

Disclaimer

The material on this page is ©2024 American Society of Clinical Oncology, all rights reserved. Licensing available upon request. For more information, please contact licensing@asco.org

Abstract Details

Meeting

2023 ASCO Annual Meeting

Session Type

Publication Only

Session Title

Publication Only: Care Delivery and Regulatory Policy

Track

Care Delivery and Quality Care

Sub Track

Clinical Informatics/Advanced Algorithms/Machine Learning

Citation

J Clin Oncol 41, 2023 (suppl 16; abstr e13591)

DOI

10.1200/JCO.2023.41.16_suppl.e13591

Abstract #

e13591
