Development and validation of a prediction model for the recurrence of stage 1 EGFR mutation positive NSCLC in patients using machine learning with WES-based gene sets.

Authors

Akiko Tateishi

Department of Thoracic Oncology, National Cancer Center Hospital, Tokyo, Japan

Akiko Tateishi , Hidehito Horinouchi , Ken Takasawa , Nobuji Kouno , Takaaki Mizuno , Yu Okubo , Yukihiro Yoshida , Shun-ichi Watanabe , Mototaka Miyake , Masahiko Kusumoto , Koji Inaba , Hiroshi Igaki , Yasushi Yatabe , Masami Mukai , Katsuya Tanaka , Naoki Mihara , Kouya Shiraishi , Takashi Kohno , Yuichiro Ohe , Ryuji Hamamoto

Organizations

Department of Thoracic Oncology, National Cancer Center Hospital, Tokyo, Japan; Division of Medical AI Research and Development, National Cancer Center Research Institute, Tokyo, Japan; Department of Experimental Therapeutics, National Cancer Center Hospital, Tokyo, Japan; Department of Thoracic Surgery, National Cancer Center Hospital, Tokyo, Japan; Division of Thoracic Surgery, National Cancer Center Hospital, Tokyo, Japan; Department of Diagnostic Radiology, National Cancer Center Hospital, Tokyo, Japan; Department of Radiation Oncology, National Cancer Center Hospital, Tokyo, Japan; Department of Diagnostic Pathology, National Cancer Center Hospital, Tokyo, Japan; Division of Medical Informatics, National Cancer Center Hospital, Tokyo, Japan; Division of Genome Biology, National Cancer Center Research Institute, Tokyo, Japan; National Cancer Center Hospital, Tokyo, Japan

Research Funding

No funding received

Background: Predicting the risk of postoperative recurrence is becoming increasingly important in EGFR mutation-positive (EGFR-m) non-small cell lung cancer (NSCLC). Few studies have performed whole exome sequencing (WES) for genomic profiling in large cohorts of EGFR-m NSCLC.

Methods: We performed WES and analyzed the clinical course of patients (pts) with NSCLC who underwent surgery between 1985 and 2019 in the PRISM project at our institute. We evaluated the characteristics and recurrence-free survival (RFS) of EGFR-m pts and developed a prediction model using a machine learning method, Overlapping Group LASSO, to classify EGFR-m pts as "high-risk" or "low-risk" according to whether they were predicted to recur within 5 years. To develop and validate the model, we used the data from 2006 onward as the training cohort and the data from before 2006 as the validation cohort. After developing the model, we used a stratified Cox proportional hazards model to compare RFS between the predicted groups.

Results: Of the 1351 pts included in the PRISM project, 585 (43.3%) were EGFR-m, and 205 pts had stage I, EGFR-m disease. The median RFS of the stage I, EGFR-m pts was 123.1 months (m). In the training cohort, the model classified 43 pts as high-risk and 88 pts as low-risk. In the validation cohort, the model classified 29 pts as high-risk and 45 pts as low-risk; in this cohort, the median RFS of high-risk vs. low-risk pts was 52.2 m vs. 105 m (HR 2.43, p=0.01), and the 1-year, 2-year, and 5-year RFS rates were 100% vs. 98.5%, 66.7% vs. 89.2%, and 41.7% vs. 66.1%, respectively. Twenty-eight gene set coefficients were non-zero in the prediction model. The gene sets with large positive coefficients, which were considered important for a "high-risk" prediction, were those affected by KRAS gene overexpression and p53 gene knockdown. In contrast, gene sets repressed by mTOR inhibition and genes repressed by TBK1 gene knockdown and KRAS gene overexpression had large negative coefficients.

Conclusions: We developed and validated a model that predicts whether EGFR-m pts will recur within 5 years. In EGFR-m NSCLC, a high risk of recurrence appears to be associated with pathways related to the KRAS and p53 genes, and a low risk with mutations in the gene sets suppressed by the mTOR pathway and the TBK1 gene.
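For readers unfamiliar with the method named in the abstract, one common textbook formulation of an overlapping group LASSO classifier is sketched below. The abstract does not give the authors' objective, so everything here is an assumption made for illustration: the logistic loss, the labels $y_i \in \{-1,+1\}$ (recurrence within 5 years or not), the feature vectors $x_i$, the group weights $d_g$, and the choice of gene sets as the groups $g \in \mathcal{G}$. Because a gene can belong to several gene sets, the groups overlap, which is handled by splitting the coefficient vector $w$ into latent per-group vectors $v_g$.

```latex
\min_{\beta_0,\;\{v_g\}_{g \in \mathcal{G}}}\;
  \frac{1}{n} \sum_{i=1}^{n}
    \log\!\Bigl(1 + \exp\bigl(-y_i\,(\beta_0 + x_i^{\top} w)\bigr)\Bigr)
  \;+\; \lambda \sum_{g \in \mathcal{G}} d_g \,\lVert v_g \rVert_2
\quad \text{subject to} \quad
  w = \sum_{g \in \mathcal{G}} v_g,
  \qquad \operatorname{supp}(v_g) \subseteq g .
```

Under this formulation the penalty drives whole latent vectors $v_g$ to zero, so only some groups keep a non-zero contribution. That is the kind of sparsity a statement such as "twenty-eight gene set coefficients were non-zero" describes, although the exact feature construction used in the study may differ.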

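| | Training cohort (n=131), high risk (n=43) | Training cohort (n=131), low risk (n=88) | Validation cohort (n=74), high risk (n=29) | Validation cohort (n=74), low risk (n=45) |
|---|---|---|---|---|
| Median RFS, months [95% CI] | 32.8 [23.5-44.8] | NR [NR-NR] | 52.2 [23.7-NR] | 105 [73.1-NR] |
| HR, p-value | - | | 2.43, p = 0.011 | |
| 1-year RFS rate, % [95% CI] | 89 [81-98] | 100 [100-100] | 100 [100-100] | 98.5 [95.5-100] |
| 2-year RFS rate, % [95% CI] | 60 [49-75] | 100 [100-100] | 66.7 [44.7-99.5] | 89.2 [82.0-97.1] |
| 5-year RFS rate, % [95% CI] | 0 | 100 [100-100] | 41.7 [21.3-81.4] | 66.1 [55.5-78.7] |

NR, not reached. HR and p-value are given once per cohort.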
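Landmark RFS rates like those in the table are commonly read off a Kaplan-Meier curve. The abstract does not name the estimator, so the following is only a minimal sketch under that assumption. The function names `kaplan_meier` and `rfs_rate` and the toy follow-up data are hypothetical and are not taken from the study.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Kaplan-Meier estimate.

    times  : follow-up in months for each patient
    events : 1 if recurrence/death was observed at that time, 0 if censored
    Returns (time, survival probability) pairs at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Handle all patients sharing the same follow-up time together.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            surv *= 1.0 - n_events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= n_leaving
    return curve


def rfs_rate(curve: List[Tuple[float, float]], horizon: float) -> float:
    """Survival probability at `horizon` months (step function, 1.0 before first event)."""
    rate = 1.0
    for t, s in curve:
        if t > horizon:
            break
        rate = s
    return rate


# Hypothetical toy cohort (months, event flag); NOT data from the abstract.
toy_times = [6, 14, 20, 30, 40, 70, 70, 80]
toy_events = [1, 0, 1, 1, 0, 1, 0, 0]
toy_curve = kaplan_meier(toy_times, toy_events)

for label, months in [("1-year", 12), ("2-year", 24), ("5-year", 60)]:
    print(f"{label} RFS rate: {100 * rfs_rate(toy_curve, months):.1f}%")
```

Censored patients leave the risk set without lowering the curve, which is why a landmark rate can stay at 100% for a group in which nobody has recurred by that time point.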

Abstract Details

Meeting

2023 ASCO Annual Meeting

Session Type

Poster Session

Session Title

Lung Cancer—Non-Small Cell Local-Regional/Small Cell/Other Thoracic Cancers

Track

Lung Cancer

Sub Track

Local-Regional Non–Small Cell Lung Cancer

Citation

J Clin Oncol 41, 2023 (suppl 16; abstr 8563)

DOI

10.1200/JCO.2023.41.16_suppl.8563

Abstract #

8563

Poster Bd #

190
