Machine learning-based noninvasive diagnostic classifiers for the prediction of cancer tissue of origin using serum microRNAs.

Authors

Andrew Zhang

Yale University, New Haven, CT

Andrew Zhang, Hallgeir Rui, Hai Hu

Organizations

Yale University, New Haven, CT; Thomas Jefferson University, Philadelphia, PA; Chan Soon-Shiong Institute of Molecular Medicine at Windber, Windber, PA

Research Funding

No funding sources reported

Background: Noninvasive multi-cancer early detection (MCED), with or without tissue of origin (TOO) prediction, analyzes circulating cell-free nucleic acids and/or proteins in blood and has the potential to reduce cancer-related mortality. Accurate prediction of TOO following a positive MCED test would guide the selection of confirmatory tests, expediting the definitive diagnosis and the prompt start of treatment tailored to the specific cancer type. Here, we report the development of machine learning-based diagnostic classifiers that predict TOO for 13 cancer types with high accuracy using serum microRNAs (miRNAs).

Methods: Eight serum miRNA microarray datasets from GEO, totaling 6,283 patients across 13 cancer types, were used in this study. The patients were split at an approximate 3:2 ratio into a training set (n=3,844) and a validation set (n=2,439). An ensemble of classifiers was constructed in the training set via the "one vs. rest" approach, yielding one classifier for each cancer type. Random forest models with recursive feature elimination (RFE) selected the optimal set of miRNAs, which was fed into support vector machine models to generate a prediction probability for each cancer type. The type with the highest probability was considered the predicted cancer type. The performance of these classifiers was evaluated in the validation set in two steps: the first used all cancer types, and the second used the top 2 or 3 cancer types from the first step to achieve a refined prediction.

Results: RFE selected 426 miRNAs for building the 13 classification models. In the validation set, comprising 2,439 patients across 12 cancer types, the classifiers correctly predicted the cancer type for 1,922 (79%) samples based on the highest prediction probability. The accuracy increased to 92% and 95% based on the top 2 and top 3 predictions, respectively. In particular, based on the top 3 predictions, the accuracy was >95% for bladder, breast, prostate, gastric, glioma and lung cancers; >85% for ovarian, liver and esophageal cancers; 78% for pancreatic cancer and sarcoma; and 67% for colorectal cancer.

Conclusions: With 95% accuracy in narrowing TOO down to 3 organ sites, the miRNA-based TOO classifiers could be used clinically as a reflex test for the simple and highly accurate MCED screening models previously developed (Cancers 2022,14:1450; ESMO 2024). Together, they support the development of an inexpensive, accurate and noninvasive blood test for MCED with TOO.
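
The abstract ranks cancer types by per-class prediction probability and reports accuracy for the top 1, top 2, and top 3 predictions. The following is a minimal sketch of how such top-k accuracy could be computed, not the authors' implementation. The function names `top_k_labels` and `top_k_accuracy` are hypothetical, and the probabilities are made-up toy values rather than study data.

```python
from typing import Dict, List, Sequence


def top_k_labels(probabilities: Dict[str, float], k: int) -> List[str]:
    """Return the k cancer types with the highest predicted probability."""
    ranked = sorted(probabilities, key=lambda label: probabilities[label], reverse=True)
    return ranked[:k]


def top_k_accuracy(
    samples: Sequence[Dict[str, float]], true_labels: Sequence[str], k: int
) -> float:
    """Fraction of samples whose true label is among the top-k predictions."""
    if not samples or len(samples) != len(true_labels):
        raise ValueError("samples and true_labels must be non-empty and equal in length")
    hits = sum(
        true in top_k_labels(probs, k) for probs, true in zip(samples, true_labels)
    )
    return hits / len(samples)


# Toy per-class probabilities for three hypothetical samples (NOT study data).
toy_samples = [
    {"breast": 0.70, "lung": 0.20, "gastric": 0.10},
    {"breast": 0.45, "lung": 0.40, "gastric": 0.15},
    {"breast": 0.50, "lung": 0.30, "gastric": 0.20},
]
toy_truth = ["breast", "lung", "gastric"]

for k in (1, 2, 3):
    print(f"top-{k} accuracy: {top_k_accuracy(toy_samples, toy_truth, k):.2f}")
```

With k=1 this reduces to the "highest probability" rule described in the Methods. Larger k counts a sample as correct whenever the true cancer type appears anywhere among the k highest-ranked types, which is why accuracy can only stay the same or rise as k grows.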

Disclaimer

This material on this page is ©2024 American Society of Clinical Oncology, all rights reserved. Licensing available upon request. For more information, please contact licensing@asco.org

Abstract Details

Meeting

2024 ASCO Breakthrough

Session Type

Poster Session

Session Title

Poster Session B

Track

Thoracic Cancers, Breast Cancer, Gynecologic Cancer, Head and Neck Cancer, Hematologic Malignancies, Genetics/Genomics/Multiomics, Healthtech Innovations, Models of Care and Care Delivery, Viral-Mediated Malignancies, Other Malignancies or Topics

Sub Track

Early Detection and Surveillance

Citation

J Clin Oncol 42, 2024 (suppl 23; abstr 101)

DOI

10.1200/JCO.2024.42.23_suppl.101

Abstract #

101

Poster Bd #

C1
