Pitfalls with analyses of real-world data: A look at ASCO’s CancerLinQ Discovery Multiple Myeloma dataset.

Authors

Karen J Shou

Walter Reed National Military Medical Center, Bethesda, MD

Karen J Shou, Jennifer A Thornton, Kevin Sunderland, Christin DeStefano

Organizations

Walter Reed National Military Medical Center, Bethesda, MD; David Grant USAF Medical Center, Fairfield, CA

Research Funding

No funding received

Background: Real-world data (RWD) are increasingly used in oncology research. A major limitation of RWD, however, is missing data, which can lead to misleading conclusions. Methods for handling missing data include excluding patients with missing variables, using machine-based or statistically imputed values, or using proxies (surrogates). A second limitation is the exclusion of deaths at time zero, which can distort survival estimates for patients with aggressive cancers. This work highlights the impact that data exclusion, variable surrogacy, and deaths at time zero have on survival analysis results.

Methods: ASCO’s CancerLinQ Discovery Multiple Myeloma (MM) dataset was used to assess overall survival (OS) in patients with MM diagnosed from 2009 to 2021. Of the 34,234 patients included in the analyses, 3,582 (10%) were missing a recorded date of MM diagnosis. In these cases, the date of first anti-myeloma therapy was used as a surrogate, since most MMs are treated at diagnosis. OS was compared between MM patients with “known” vs. surrogate (“presumed”) diagnosis dates. To assess how the inclusion of deaths at time zero affects OS, the data were first analyzed after excluding patients who died within 1 month of diagnosis of a second primary malignancy (SPM), including secondary AML (sAML). A second analysis of the same sample added a constant (0.5) to all survival times, allowing for inclusion of patients who died within 1 month of diagnosis. Analyses were conducted with Stata version 17.0 (College Station, TX).

Results: Despite the strong positive correlation between the recorded MM diagnosis date and the date of first anti-myeloma therapy, survival differed significantly between MM patients with a known vs. presumed date of diagnosis (median OS 115 vs. 45 months, HR 2.54, 95% CI 2.41-2.69, p < 0.001). Dropping vs. including deaths within 1 month of diagnosis produced a marked difference (nearly 1 year) in median OS from the date of diagnosis of any SPM (113 vs. 103.5 months) and of sAML (41 vs. 30.5 months).

Conclusions: Although RWD hold promise, oncologists must be aware of common pitfalls in survival analyses: missing data, variable surrogates, and dropped deaths at time zero. Patients with a recorded date of MM diagnosis appear to be fundamentally different from those who lack a diagnosis date but have a recorded date of anti-myeloma therapy. For aggressive malignancies, excluding patients who died at time zero can lead to overestimation of survival. Adding a small constant (0.5) to the time variable allows patients who die soon after their cancer diagnosis to be included. When RWD are used to guide clinical decision making, it is important to recognize these threats to data validity, which can produce misleading results.
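
The two data-handling steps described in the abstract (falling back to the first-therapy date when the diagnosis date is missing, and adding 0.5 to survival times so that time-zero deaths stay in the analysis) can be illustrated with a minimal Python sketch. This is not the authors' Stata code. The helper names `index_date` and `km_median`, the toy cohort, and the assumption that survival time is recorded in whole months (so a death within the first month has time 0) are all hypothetical and chosen only for illustration.

```python
from datetime import date
from fractions import Fraction


def index_date(diagnosis_date, first_therapy_date):
    """Return (date, label): the recorded diagnosis date when present,
    otherwise the first-therapy date as a surrogate (hypothetical helper)."""
    if diagnosis_date is not None:
        return diagnosis_date, "known"
    return first_therapy_date, "presumed"


def km_median(times, events):
    """Kaplan-Meier median: the smallest time at which estimated survival
    is 0.5 or lower. Returns None if the curve never gets that low.
    events: 1 = death, 0 = censored."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = Fraction(1)  # exact arithmetic avoids rounding at the 0.5 cutoff
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= Fraction(at_risk - deaths, at_risk)
            if survival <= Fraction(1, 2):
                return t
        at_risk -= leaving
    return None


# Invented toy cohort (NOT CancerLinQ data): months to death or censoring.
months = [0, 0, 0, 2, 5, 8, 12, 20, 30, 40]
died = [1, 1, 1, 1, 1, 1, 1, 0, 1, 0]

# Analysis 1: drop patients whose survival time is zero.
kept = [(t, e) for t, e in zip(months, died) if t > 0]
median_dropped = km_median([t for t, _ in kept], [e for _, e in kept])

# Analysis 2: add 0.5 to every survival time so nobody is dropped.
median_shifted = km_median([t + 0.5 for t in months], died)

print(median_dropped, median_shifted)
```

In this invented cohort, dropping the three time-zero deaths gives a median of 12 months, while keeping them with the 0.5 offset gives 5.5 months (reported on the shifted time scale). The numbers carry no clinical meaning. They only show the direction of the bias the abstract describes: excluding early deaths inflates the survival estimate.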

Abstract Details

Meeting

2023 ASCO Annual Meeting

Session Type

Publication Only

Session Title

Publication Only: Hematologic Malignancies—Plasma Cell Dyscrasia

Track

Hematologic Malignancies

Sub Track

Multiple Myeloma

Citation

J Clin Oncol 41, 2023 (suppl 16; abstr e20033)

DOI

10.1200/JCO.2023.41.16_suppl.e20033

Abstract #

e20033
