Abstract:
Objective To systematically identify different subtypes of tandem duplication phenotype (TDP) in breast cancer and analyze their molecular characteristics and prognostic relevance.
Methods A total of 1,098 breast cancer samples from The Cancer Genome Atlas (TCGA) and 60 breast cancer cell lines from the Cancer Cell Line Encyclopedia (CCLE) were collected. Tandem duplication (TD) events were identified from copy number variation (CNV) data, and TDP scores were calculated. Samples were classified into the TDP group (TDP score > −0.710 and number of TD events ≥ 20) and the non-TDP group (TDP score < −0.835 or number of TD events < 20). TDP samples were then classified into six subtypes using a Gaussian mixture model: short-segment unimodal type (group 1), intermediate-segment unimodal type (group 2), long-segment unimodal type (group 3), and three bimodal mixed types composed of two unimodal patterns (group 1/2, group 1/3, and group 2/3). Group 1, group 1/2, and group 1/3 were further categorized as the short-segment tandem duplication group (SSG), whereas group 2, group 3, and group 2/3 were categorized as the large-segment tandem duplication group (LSG). CNV complexity, clinical characteristics, and prognosis were compared across TDP subtypes, and GO and KEGG functional enrichment analyses and drug sensitivity analysis were performed. Survival analysis was performed using the Kaplan–Meier method, and differences between groups were compared using the log-rank test.
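The grouping rule stated in Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the thresholds and the subtype-to-group mapping come from the text above, while the function names, the constant names, and the "unclassified" label for samples that satisfy neither criterion are assumptions introduced here.

```python
# Thresholds as stated in Methods.
TDP_SCORE_MIN = -0.710       # TDP requires a score strictly above this
NON_TDP_SCORE_MAX = -0.835   # non-TDP if the score is strictly below this
MIN_TD_EVENTS = 20           # TDP requires at least this many TD events

# Mapping of the six subtypes to SSG/LSG, as stated in Methods.
SEGMENT_GROUP = {
    "1": "SSG", "1/2": "SSG", "1/3": "SSG",
    "2": "LSG", "3": "LSG", "2/3": "LSG",
}


def classify_sample(tdp_score: float, n_td_events: int) -> str:
    """Return 'TDP', 'non-TDP', or 'unclassified' (hypothetical label)."""
    if tdp_score > TDP_SCORE_MIN and n_td_events >= MIN_TD_EVENTS:
        return "TDP"
    if tdp_score < NON_TDP_SCORE_MAX or n_td_events < MIN_TD_EVENTS:
        return "non-TDP"
    # Assumption: samples meeting neither stated criterion are left ungrouped.
    return "unclassified"


def segment_group(subtype: str) -> str:
    """Map a TDP subtype label such as '1/3' to 'SSG' or 'LSG'."""
    return SEGMENT_GROUP[subtype]
```

Under this sketch, a sample with a TDP score between −0.835 and −0.710 and at least 20 TD events falls into neither group, which is consistent with the two stated criteria not covering every sample.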
Results A total of 147 TDP patients were identified in the TCGA cohort, including 25 cases in the SSG (2, 1, and 22 cases in groups 1, 1/2, and 1/3, respectively) and 122 cases in the LSG (2, 50, and 70 cases in groups 2, 3, and 2/3, respectively). The CNV complexity values in the non-TDP (409 cases), SSG, and LSG groups were 7.55 (7.20, 8.03), 8.47 (8.29, 8.78), and 8.37 (7.98, 8.65), respectively, with a statistically significant difference among the three groups (H=135.12, P<0.001). Functional enrichment analysis showed that the SSG was more likely to involve tumor suppressor genes and pathways such as DNA damage repair, whereas group 2/3 was mainly characterized by oncogene amplification and enrichment in tumor-related signaling pathways. In the CCLE cohort, 47 TDP cell lines were identified, including 40 in the SSG (2, 37, and 1 cell lines in groups 1, 1/2, and 1/3, respectively) and 7 in the LSG (0, 0, and 7 cell lines in groups 2, 3, and 2/3, respectively). The CNV complexity values in the non-TDP (12 cell lines), SSG, and LSG groups were 8.91 (8.76, 9.07), 9.95 (9.78, 10.28), and 9.82 (9.72, 9.91), respectively, with a statistically significant difference among the three groups (H=28.86, P<0.001). Exploratory drug sensitivity analysis showed that the median IC50 values of 17-AAG and paclitaxel were higher in LSG cell lines than in SSG cell lines. Survival analysis showed that the 5-year overall survival (OS) rate was 100.0% in the SSG and 80.5% in the LSG (95%CI: 71.3%–90.9%), with a statistically significant difference between the two groups (χ²=4.90, P=0.027).
Conclusion This study identified TDP in breast cancer based on CNV data and classified it into six subtypes. Different TDP subtypes showed differences in CNV burden, driver gene amplification, functional pathways, drug sensitivity, and prognosis. LSG was characterized by oncogene amplification and poorer prognosis, suggesting that TDP classification may provide a reference for analyzing molecular heterogeneity and prognostic stratification in breast cancer.
Key words:
Breast neoplasms,
Tandem duplication phenotype,
Genomic heterogeneity
Meiheng Wang, Jie Li, Ceshi Chen, Chaohan Xu. Identification of tandem duplication phenotypes in breast cancer and molecular functional features[J]. Chinese Journal of Breast Disease(Electronic Edition), 2026, 20(03): 148-155.