
Chinese Journal of Breast Disease (Electronic Edition), 2026, Vol. 20, Issue (03): 138-147. doi: 10.3877/cma.j.issn.1674-0807.2026.03.002

• Original Article •

Construction and evaluation of a prognostic prediction model for breast cancer based on RNA-binding protein gene expression signatures

Caixia Ding1, Jinghui Qu1, Yinghong Pei1, Jingna Li1, Xiaoyu Zheng2, Lingzhi Xu3, Sisi Li1

  1. Department of Pathology, Harbin Medical University Cancer Hospital, Harbin 150081, China
  2. Department of Anesthesiology, Harbin Medical University Cancer Hospital, Harbin 150081, China
  3. Department of Breast Oncology, Second Affiliated Hospital of Dalian Medical University, Dalian 116023, China
  • Received: 2025-12-14 Online: 2026-06-01 Published: 2026-06-11
  • Contact: Lingzhi Xu, Sisi Li

Abstract:

Objective

To screen for differentially expressed RNA-binding protein genes (RBPs), construct and validate a prognostic prediction model that combines an RBP-based risk score with patients' clinicopathological characteristics, and compare the immunophenoscore (IPS) and drug sensitivity between risk groups.

Methods

Transcriptomic and clinical data from The Cancer Genome Atlas (TCGA) breast cancer cohort (1 106 tumor samples and 137 adjacent normal samples) served as the training set, and the GSE86166 dataset (330 breast cancer samples) served as the validation set. Differentially expressed RBPs between tumor samples and adjacent normal samples were screened in the training set. Univariate Cox proportional hazards regression and least absolute shrinkage and selection operator (LASSO) regression were used to select core RBPs and construct a prognostic risk score model. Patients in the training set were divided into a high-risk group (649 cases) and a low-risk group (457 cases) according to the risk score cut-off value. Kaplan-Meier survival analysis and receiver operating characteristic (ROC) curves were used to evaluate model performance. External validation was performed in the validation set (161 high-risk cases and 169 low-risk cases) using the same risk score formula and cut-off value. In the TCGA training set, univariate and multivariate Cox proportional hazards regression analyses incorporating patients' clinicopathological characteristics were used to evaluate the independent prognostic value of the risk score. A prognostic model was then constructed from the clinicopathological characteristics and the risk score; calibration curves were used to assess its accuracy, and decision curve analysis (DCA) was used to evaluate its clinical utility. IPS was used to characterize the tumor immunophenotype of the high- and low-risk groups. The half-maximal inhibitory concentration (IC50) was used to evaluate the sensitivity of the high- and low-risk groups to 296 commonly used chemotherapeutic and targeted therapeutic drugs.
Using convenience sampling, 10 pairs of breast cancer tissue samples and matched adjacent normal tissue samples collected at Harbin Medical University Cancer Hospital between January 2023 and December 2025 were used to validate the differential expression of the 5 core genes at the protein level, quantified by histochemistry score (H-score).
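The risk score and grouping step described above can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes the common form in which the risk score is a linear combination of LASSO-Cox coefficients and gene expression values, and the coefficient values, the cut-off, and the tie rule (a score equal to the cut-off counts as low-risk) are all hypothetical, since the abstract does not report them.

```python
from typing import Dict

# Hypothetical coefficients for illustration only; the abstract does not
# report the fitted LASSO-Cox coefficients for the 5 core RBPs.
HYPOTHETICAL_COEFFICIENTS: Dict[str, float] = {
    "NUAK2": 0.5,
    "ACSL1": -0.25,
    "MAP1LC3C": -0.5,
    "WT1": 0.25,
    "MYOCD": -0.125,
}


def risk_score(expression: Dict[str, float],
               coefficients: Dict[str, float] = HYPOTHETICAL_COEFFICIENTS) -> float:
    """Assumed form: sum over genes of coefficient * expression value."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def assign_group(score: float, cutoff: float) -> str:
    """Assign a risk group; scores strictly above the cut-off are high-risk (assumption)."""
    return "high" if score > cutoff else "low"


# One hypothetical patient with unit expression for every gene.
patient = {gene: 1.0 for gene in HYPOTHETICAL_COEFFICIENTS}
score = risk_score(patient)
group = assign_group(score, cutoff=0.0)  # the cut-off of 0.0 is also hypothetical
```

Applying the same coefficients and the same cut-off to an external cohort, as done for GSE86166, amounts to calling these two functions unchanged on the validation samples.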

Results

A total of 126 differentially expressed RBPs were identified from 1 106 breast cancer tumor samples and 137 adjacent normal samples. Univariate Cox proportional hazards regression and LASSO regression identified 5 core RBPs (NUAK2, ACSL1, MAP1LC3C, WT1, and MYOCD), on which the prognostic risk score model was built. Kaplan-Meier survival analysis showed that, in the training set, the median overall survival (OS) was 97.5 months (95%CI: 90.2-104.8) in the high-risk group and 216.6 months (95%CI: 198.3-234.9) in the low-risk group, a statistically significant difference (χ2=13.20, P<0.001); in the validation set, the median OS was 76.8 months (95%CI: 70.5-83.1) in the high-risk group and 182.4 months (95%CI: 165.7-199.2) in the low-risk group, also a statistically significant difference (χ2=4.14, P=0.042). ROC curve analysis showed that the areas under the curve for 3-, 5-, and 7-year OS were 0.60 (95%CI: 0.54-0.66), 0.60 (95%CI: 0.53-0.67), and 0.65 (95%CI: 0.59-0.71) in the training set and 0.64 (95%CI: 0.58-0.70), 0.60 (95%CI: 0.54-0.66), and 0.62 (95%CI: 0.56-0.68) in the validation set, indicating that the model has prognostic predictive value in both the training and external validation sets. Multivariate Cox proportional hazards regression analysis showed that the risk score was an independent predictor of OS (HR=6.807, 95%CI: 3.940-11.715, P<0.001). Calibration curves showed that the concordance index (c-index) of the prognostic model for predicting 3-, 5-, and 7-year OS in breast cancer patients was 0.782, 0.765, and 0.748, respectively (χ2=8.62, 9.15, and 7.89; all P>0.05), indicating stable predictive performance.
DCA showed that, within the decision threshold range of 0.153-0.604, the prognostic model provided a greater net clinical benefit than both the treat-all and treat-none strategies. Tumor immunogenicity and immunotherapy response analysis showed that the IPS of the low-risk group was significantly higher than that of the high-risk group (all P<0.05). Drug sensitivity analysis showed that 146 drugs had lower IC50 values in the low-risk group than in the high-risk group (all P<0.05), whereas 20 drugs had lower IC50 values in the high-risk group than in the low-risk group (all P<0.05). The IC50 values of seven classical chemotherapeutic drugs (paclitaxel, doxorubicin, carboplatin, oxaliplatin, cyclophosphamide, docetaxel, and topotecan) were significantly lower in the low-risk group than in the high-risk group (all P<0.001). Protein-level validation showed that the expression levels of NUAK2 (152.00±17.51 vs 16.00±13.08, t=16.60, P<0.001) and WT1 [35.00 (15.00, 72.50) vs 7.50 (1.75, 30.00), Z=−2.80, P=0.005] were higher in tumor tissues than in adjacent normal tissues, whereas the expression levels of MAP1LC3C (49.20±44.90 vs 128.00±37.06, t=−4.61, P=0.001), ACSL1 [145.00 (75.00, 187.50) vs 270.00 (247.50, 273.75), Z=−2.81, P=0.005], and MYOCD [100.00 (47.50, 140.00) vs 160.00 (150.00, 165.00), Z=−2.82, P=0.005] were lower in tumor tissues than in adjacent normal tissues.
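For readers unfamiliar with how DCA compares a model against the treat-all and treat-none strategies, the sketch below computes net benefit at a single threshold using the standard definition (true positives per patient minus false positives per patient weighted by the threshold odds). It assumes a binary outcome at a fixed time point and uses made-up toy data; the study's own DCA for survival outcomes may have been computed differently, and the function names are illustrative.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are down-weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Net benefit when every patient is treated regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Treat-none always has a net benefit of 0 by definition.
NET_BENEFIT_TREAT_NONE = 0.0

# Toy data (hypothetical): 1 = event occurred, 0 = no event.
outcomes = [1, 1, 0, 0]
predicted = [0.9, 0.8, 0.3, 0.1]
model_nb = net_benefit(outcomes, predicted, threshold=0.5)
all_nb = net_benefit_treat_all(outcomes, threshold=0.5)
```

A model is useful at a given threshold when its net benefit exceeds both reference strategies; the reported range of 0.153-0.604 is the set of thresholds over which that held for the prognostic model.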

Conclusion

The prognostic prediction model for breast cancer constructed from 5 core RBPs shows good predictive performance, and the risk groups it defines differ significantly in IPS and drug sensitivity.

Key words: Breast neoplasms, RNA-binding protein genes, Prognostic prediction model, Immunophenoscore, Drug sensitivity

Copyright © Chinese Journal of Breast Disease(Electronic Edition), All Rights Reserved.