Do publicly available OncoGenomic databases represent the population? A comparative analysis.

Authors

Danielle Brazel

UC Irvine Healthcare, Orange, CA

Danielle Brazel, Priyanka Kumar, David Joseph Benjamin, Justin Tyler Moyers

Organizations

UC Irvine Healthcare, Orange, CA; University of California, Irvine Medical Center, Orange, CA; Division of Hematology/Oncology, University of California, Irvine, Orange, CA; University of California Irvine Health, Orange, CA

Research Funding

No funding received

Background: Large publicly available databases are important repositories for clinicogenomic research and are used to identify clinically relevant biomarkers. Diversity among the individuals in these repositories is key to ensuring that findings apply to patient populations.

Methods: We compared two publicly available pan-cancer databases from academic institutions, The Cancer Genome Atlas (TCGA) and the United States (US) institutions in the American Association for Cancer Research (AACR) Project GENIE version 11.0 (APG), with cancer incidence statistics from the US Cancer Statistics (USCS) for 2018, the most recent data available. We compared demographic data for key individual cancer types (lung, colorectal, prostate, breast, gliomas, and leukemias) by gender, race, and ethnicity. Frequencies are displayed as percentages and were compared using the chi-squared test.

Results: The USCS includes 1,708,921 new cases in 2018, while TCGA includes 12,958 cases and APG includes 109,041 cases. Women account for 49.5% of all cancer diagnoses in the USCS and, similarly, for 50.4% of all cases in APG and 51.2% of TCGA cases. The table summarizes key demographic differences. Across all cancer types, 78% of US cancers occur in White patients, whereas 84% and 83% of patients were White in APG and TCGA, respectively. Sixteen percent of prostate cancer cases occur in Black patients, but only 9% (n = 328/3993) of APG and 11% (n = 51/484) of TCGA cases were in Black patients (p < 0.01). However, while Black patients account for only 2% of all breast cancer diagnoses, they accounted for 9% (n = 841/9871) of APG and 16% (n = 203/1343) of TCGA cases (p < 0.01). Patients of Hispanic ethnicity were underrepresented both across all cancers and within every individual tumor type: Hispanic patients account for 8% of cases in USCS but only 5% in APG and TCGA (p < 0.01). Pancreatic and lung cancers, which have historically short survival, both had a lower median age at sequencing than the median age at diagnosis. Median age at sequencing for pancreatic cancers was 65 years in both TCGA and APG, while median age at diagnosis in the US is 70 years. Median age at diagnosis of lung cancers is 71 years in the US, whereas median age at sequencing was 67 years in both APG and TCGA.

Conclusions: Patients of advanced age and of minority races are underrepresented in publicly available American databases. For analyses of genomic databases to be informative, these databases must reflect the diversity of the population, and efforts must be made to increase the representation of underrepresented groups.


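| Cancer Type | All (USCS) | All (APG) | All (TCGA) | Prostate (USCS) | Prostate (APG) | Prostate (TCGA) | Pancreatic (USCS) | Pancreatic (APG) | Pancreatic (TCGA) |
|---|---|---|---|---|---|---|---|---|---|
| Total | 1,708,921 | 109,041 | 12,958 | 211,893 | 3,993 | 484 | 52,546 | 4,777 | 316 |
| Male | 51% | 44% | 50% | 100% | 99% | 100% | 52% | 53% | 55% |
| White | 78% | 84% | 83% | 77% | 86% | 86% | 82% | 87% | 90% |
| Black | 10% | 7% | 10% | 16% | 9% | 11% | 13% | 5% | 4% |
| Asian | 3% | 6% | 6% | 2% | 3% | 3% | 4% | 5% | 6% |
| Native American | 1% | 0% | 0% | 0% | 0% | 0% | 1% | 0% | 0% |
| Hispanic | 8% | 5% | 5% | 7% | 4% | 2% | 9% | 6% | 4% |
| Median Age (y) | 66 | 62 | 61 | 67 | 67 | 61 | 70 | 65 | 65 |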
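
As a rough illustration of the chi-squared comparison named in the Methods, the sketch below tests whether the share of Black patients among prostate cancer cases in each database differs from the USCS share. It is a minimal example under stated assumptions, not the authors' analysis: the abstract does not say whether a goodness-of-fit test or a test of independence was used, the 16% USCS share is the rounded figure from the abstract, and the function name `chi2_gof_binary` is hypothetical.

```python
import math


def chi2_gof_binary(observed_in_group, total, expected_share):
    """Chi-squared goodness-of-fit test for a two-category split (1 degree of freedom).

    Assumption: the reference share (here, from USCS) is treated as a fixed
    population proportion rather than as a second sample.
    """
    observed = [observed_in_group, total - observed_in_group]
    expected = [total * expected_share, total * (1 - expected_share)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With 1 degree of freedom, the chi-squared upper-tail probability
    # equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Share of Black patients among US prostate cancer cases (rounded, from the abstract).
USCS_BLACK_SHARE = 0.16

# (Black patients, total prostate cases) as reported in the abstract.
cohorts = {"APG": (328, 3993), "TCGA": (51, 484)}

results = {
    name: chi2_gof_binary(n_black, n_total, USCS_BLACK_SHARE)
    for name, (n_black, n_total) in cohorts.items()
}

for name, (chi2, p_value) in results.items():
    print(f"{name}: chi2 = {chi2:.1f}, p = {p_value:.3g}")
```

Because the inputs are rounded percentages and published counts, the resulting statistics are only indicative; a full reanalysis would use the underlying case counts from USCS.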

Abstract Details

Meeting

2022 ASCO Annual Meeting

Session Type

Poster Session

Session Title

Health Services Research and Quality Improvement

Track

Quality Care/Health Services Research

Sub Track

Real-World Data/Outcomes

Citation

J Clin Oncol 40, 2022 (suppl 16; abstr 6588)

DOI

10.1200/JCO.2022.40.16_suppl.6588

Abstract #

6588

Poster Bd #

369
