Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin's Clinical Research Center for Cancer, Tianjin Key Laboratory of Cancer Prevention and Therapy, Tianjin, Tianjin, China
Lan Lan, Yuman Zhang, Xing Li, Kai Wang
Background: Identifying the origin of a tumor is important for the treatment and prognosis of cancer patients. Despite routine examinations including medical history, laboratory tests, histopathological assessment, and imaging, 3-5% of metastatic cancers are diagnosed as cancers of unknown primary (CUP). In clinical practice, the management of CUP is challenging and empirical chemotherapy remains the main treatment option, leading to poor prognosis. Methods: To predict tumor origin, we used automated machine learning (AutoML) to build a weighted ensemble of 13 models from molecular and clinical datasets comprising 22,018 patients and 28 tumor types. The tumor type of each patient was confirmed by at least 3 pathologists. The datasets contained 7 features, including age, tumor site, gene alteration, tumor mutation burden (TMB), microsatellite instability (MSI), etc. Features selected by AutoML were used to train the diagnostic model with algorithms including CatBoost, ExtraTreesEntr (extremely randomized trees), ExtraTreesGini, KNeighborsDist, KNeighborsUnif, LightGBM, XGBoost, etc. The best model was selected by AutoML. We then optimized the model parameters and developed a No-Bias algorithm to ensure that the trained model outputs "fair and unbiased" predictions even when the number of cases per cancer type in the input dataset is unevenly distributed. Five-fold cross-validation was used to estimate the accuracy of the diagnostic models. Moreover, we validated the accuracy of the diagnostic model on an independent validation set of 1,636 patients covering 28 tumor types. Results: Our model outputs the most confidently predicted tumor types in the form of TOP N (N=1-5). In the training set, the TOP1, TOP3, and TOP5 accuracy of our diagnostic model was 72.8%, 88.4%, and 93.3%, respectively. The model had been successfully applied to support the diagnosis of 10 CUP patients.
In one patient who was predicted to have esophageal squamous cell carcinoma, an immune checkpoint inhibitor combined with chemotherapy was given, achieving a progression-free survival (PFS) of 9 months. Conclusions: We successfully developed a deep learning-based diagnostic model for predicting tumor origin from genomic and clinical data. The model could be applied to improve the clinical diagnosis of CUP patients and to guide treatment decisions according to the predicted tumor origin.
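The TOP N accuracy reported in the Results can be illustrated with a minimal sketch. It is not the authors' code: the function name `top_n_accuracy`, the three toy tumor classes, and the probability values are hypothetical and chosen only to show how a prediction counts as correct when the confirmed tumor type is among the N most confidently predicted types.

```python
from typing import Sequence


def top_n_accuracy(
    probabilities: Sequence[Sequence[float]],
    true_labels: Sequence[str],
    class_names: Sequence[str],
    n: int,
) -> float:
    """Fraction of cases whose true label is among the n highest-scoring classes."""
    if not 1 <= n <= len(class_names):
        raise ValueError("n must be between 1 and the number of classes")
    if len(probabilities) != len(true_labels) or not true_labels:
        raise ValueError("probabilities and true_labels must be non-empty and aligned")

    hits = 0
    for probs, truth in zip(probabilities, true_labels):
        # Rank class indices from most to least confident prediction.
        ranked = sorted(range(len(class_names)), key=lambda i: probs[i], reverse=True)
        top_classes = {class_names[i] for i in ranked[:n]}
        hits += truth in top_classes
    return hits / len(true_labels)


# Hypothetical toy data: 4 cases, 3 tumor types (not taken from the study).
classes = ["colorectal", "lung", "breast"]
predicted = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
]
confirmed = ["colorectal", "lung", "colorectal", "lung"]

top1 = top_n_accuracy(predicted, confirmed, classes, 1)
top2 = top_n_accuracy(predicted, confirmed, classes, 2)
top3 = top_n_accuracy(predicted, confirmed, classes, 3)
```

In this toy example the accuracy can only stay the same or rise as N grows, which matches the pattern of the reported TOP1, TOP3, and TOP5 figures.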
Disclaimer
This material on this page is ©2024 American Society of Clinical Oncology, all rights reserved. Licensing available upon request. For more information, please contact licensing@asco.org