Fifteen-MiRNA-Based Signature Is a Reliable Prognosis-Predicting Tool for Prostate Cancer Patients

Recurrence is a major problem for prostate cancer patients, thus, identifying prognosis-related markers to evaluate clinical outcomes is essential. Here, we established a fifteen-miRNA-based recurrence-free survival (RFS) predicting signature based on the miRNA expression profile extracted from The Cancer Genome Atlas (TCGA) database by the LASSO Cox regression analysis. The median risk score generated by the signature in both the TCGA training and the external Memorial Sloan-Kettering Cancer Center (MSKCC) validation cohorts was employed and the patients were subclassified into low- and high-risk subgroups. The Kaplan-Meier plot and log-rank analyses showed significant survival differences between low- and high-risk subgroups of patients (TCGA, log-rank P < 0.001 & MSKCC, log-rank P = 0.045). In addition, the receiver operating characteristic curves of both the training and external validation cohorts indicated the good performance of our model. After predicting the downstream genes of these miRNAs, the miRNA-mRNA network was visualized by Cytoscape software. In addition, pathway analyses found that the differences between two groups were mainly enriched on tumor progression and drug resistance-related pathways. Multivariate analyses revealed that the miRNA signature is an independent indicator of RFS prognosis for prostate cancer patients with or without clinicopathological features. In summary, our novel fifteen-miRNA-based prediction signature is a reliable method to evaluate the prognosis of prostate cancer patients.


Introduction
Prostate cancer (PCa) causes a heavy health burden for men around the world, accounting for more than 350,000 cancer-specific deaths, and it was the fifth leading cause of cancer in 2018 [1]. Approximately 5% of PCa patients have advanced type PCa or metastases at the time of diagnosis [2]. For advanced PCa, initial hormonal therapy can inhibit the progression of PCa in most patients by decreasing the function of the testosterone and androgen receptor (AR) signaling pathway; however, these patients tend to develop castration-resistant prostate cancer (CRPC), and approximately 10-20% of these patients arrive at this stage in the first five years after diagnosis [3,4]. For patients over 75-years-old, the incidence rate of metastasis increases to 48% at the time of diagnosis, and PCa-specific death is also as high as 53% [5].
A class of non-coding RNAs that are 17-25 bp in length that could impact protein-encoded genes through binding to the 3' untranslated region of mRNA are named as microRNAs (miRNAs). Approximately half of the mRNAs' expression could be modified with miRNAs through this mechanism [6][7][8][9][10][11]. Numerous articles have reported that miRNAs could participate in the process of normal cell physiology, including cell differentiation, apoptosis, Ivyspring International Publisher proliferation, and cell cycle arrest [12][13][14]. In addition, miRNAs are also involved in the extensive process of tumorigenesis in various cancers, such as pancreatic cancer, brain glioma, cervical cancer and prostate cancer [15][16][17][18][19][20]. With the application of gene sequencing in tumors, miRNAs have been considered novel biomarkers in the prediction of prognosis and drug resistance [21][22][23][24][25], and integrating multiple miRNAs could be more effective and lead to a better prediction than single miRNAs [16,23,[26][27][28][29].
In the current study, our purpose was to establish a multiple-miRNA-based signature to predict the recurrence risk of PCa patients, to help the clinician develop treatment plan.

Datasets Downloaded and Identification of Candidate miRNAs
The miRNA profile and clinical information of PCa patients from TCGA database was download through UCSC Xena HUGO probeMap (https:// xenabrowser.net/datapages/). The expression level of miRNA is described as log 2 (RPM+1). A total of 491 patients with recurrence-free survival (RFS) information were selected. The MSKCC cohort, also known as GSE21032, was downloaded from the Cbioportal (http://cbio.mskcc.org/), which included miRNA profile (normalized log2 miRNA expression data), and matched clinicopathological features of 105 PCa patients. MSKCC cohort was prepared as the external validation dataset.

Establishment of the Prognosticate Signature
In order to find out the most suitable miRNA candidates that could be used in different cohorts. Univariate Cox regression analysis was performed to present the RFS-related miRNA candidates in the TCGA cohort, with a cut-off of P value less than 0.01. LASSO Cox regression test was performed to select the RFS-related miRNA candidates for the prognostic signature [30]. Then, with a linear combination of the coefficients (β) derived from the LASSO Cox regression model combined with the expression of miRNA candidates, a miRNA-related RFS predicting signature was established. The risk score = (βmiRNA#1 * expression level of miRNA #1 ) + (β miRNA#2 * expression level of miRNA #2 ) + (β miRNA#3 * expression level of miRNA #3 ) + ⋯ + (β miRNA#n * expression level of miRNA #n ). Then, with the median risk score calculated above, the patients were assigned into two subgroups which represented the risk level.

Kaplan-Meier and Receiver Operating Characteristic (ROC) Curve
Kaplan-Meier curves were performed to compare the RFS outcome between the high-and low-risk patients determined by the miRNA-based signature. The ROC curve was plotted with the R package "pROC". The predictive value was assessed by the area under the ROC curve (AUC). In addition, nomogram ROC analysis was used to synthesize the miRNA-based signature with the clinicopathological features, as well as the laboratory test results. In addition, the Kaplan-Meier plot and log-rank analyses were used to determine the survival difference between high-and low-risk subgroups, and the univariate Cox analysis was used to generate the hazed ratio (HR) and 95% confidential interval (95% CI) among high-and low-risk subgroups.

Target Gene Pathway Enrichment Analyses and Network Construction
miRNA target genes were predicted based on the R package "multiMiR", which combines the predict results from miRecords, miRtarbase, and Tarbase databases [31][32][33]. Pathway enrichment analysis was performed using the R package "clusterProfiler". The downstream genes, which were targeted by at least eight miRNAs were enrolled to establish the miRNA-mRNA interaction network, which was subsequently displayed by the Cytoscape software (San Diego, CA, USA) [34].

Characteristics of enrolled patients
The miRNA expression profile derived from the TCGA dataset was set as the training cohort, while the external validation cohort was extracted from the MSKCC database. The detailed clinicopathological features of two cohorts were presented in Table 1. In addition, we also performed hierarchical grouping analyses for the TCGA cohort. According to a five-tiered risk features provided by the Cambridge Prognostic Group (CPG) classification for nonmetastatic PCa patients, the patients were assigned to TCGA-group-1 and TCGA-group-2 subgroups [35] ( Table 1).

Establishment of the Fifteen-MiRNA-Based Prognosticated Classifier by LASSO Cox Regression Analysis
First, using the TCGA miRNA data matrix (491 patients with available RFS information), univariate Cox regression analyses were performed for each miRNA individually to screen out the RFS-related miRNAs. A total of 28 miRNAs that were proved associated with the RFS of PCa patients were obtained (P < 0.05, Table S1). Then, the fifteen-miRNA-based RFS predicting signature was established according to the co-ef of each miRNA candidate by LASSO Cox regression analysis ( Fig. 1A-B, and Fig. S1) [36]. The recurrence associated-risk score hsa-miR-144-3p * 0.100417278760513, according to which, all patients were separated into low-or high-risk groups referring to the median risk score in both the TCGA training and MSKCC validation cohorts. Further analyses would be performed based on these results.

Testing the Prognosticate Value and Accuracy of the Fifteen-miRNA-Based RFS Predicting Signature
To assess whether the signature could effectively predict the RFS of PCa patients, we performed the Kaplan-Meier and log-rank tests on the TCGA training cohort and MSKCC validation cohort. The patients in the TCGA cohort with higher scores showed unfavorable RFS than those with lower scores (HR = 4.62, 95%CI: 2.76-7.734, P < 0.001, Fig. 1C). Besides, the Kaplan-Meier analysis was performed on the external MSKCC validation cohort, and similar results were obtained (HR = 2.35, 95%CI: 1.021-5.409, P = 0.045, Fig. 1E). In addition, the predictive value of the fifteen-miRNA-based signature was determined by ROC analysis, and the AUC values of the training and internal validation cohorts confirmed the high prognosticate value of this signature in predicting RFS of PCa (TCGA training set, AUC = 0.756, 95%CI = 0.702-0.810, Fig. 1D; and MSKCC validation set, AUC = 0.679, 95%CI: 0.559-0.799, Fig. 1F).

Validation in Subgroup of Patients Stratified by Hierarchical Grouping
To further validate the prognostic value of our newly established miRNA-based signature, we stratified the patients from the TCGA cohort by hierarchical grouping analyses. The patients were subclassified into two similar groups based on the clinicopathological features, as well as laboratory data ( Table 1). Subsequently, the Kaplan-Meier and ROC analyses were performed to validate the significance and stability of this signature (Fig. 2). We found that the signature significantly discriminated high-and low-risk PCa patients in the TCGA-Group-1 (HR = 5.16, 95%CI: 2.451-10.858, P < 0.001) and TCGA-Group-2 (HR = 4.55, 95%CI: 2.267-9.117, P < 0.001), and also obtained moderate predicting efficacy in both groups (TCGA-Group-1: AUC = 0.788, 95%CI: 0.713-0.862; TCGA-Group-2: AUC = 0.719, 95%CI: 0.640-0.799). All these results proved the stability of the current signature.

Target Gene Prediction, Network Construction, and Pathway Enrichment
We predicted miRNA target genes using miRecords, miRtarbase, and Tarbase databases, and consequently, filtered out genes that were targeted by at least eight miRNAs and enrolled to construct the miRNA-mRNA network (Fig. 3, and Table S2). These miRNAs potentially interact with their downstream genes to influence the tumorigenesis and progression of PCa. In addition, functional enrichment analyses were performed for these downstream genes. The GO-biological process (BP, CC, and MF) revealed that the targeted genes were enriched in Proteasomal protein catabolic process, Autophagy, Regulation of apoptotic signaling pathway, Histone modification, Cell-substrate junction, Focal adhesion, Chromosomal region, Ubiquitin-like protein transferase activity, Cadherin binding, etc. (Fig. 4, and Table S3). KEGG analysis found that these targeted genes were enriched in Cellular senescence, Proteoglycans in cancer, p53 signaling pathway, etc., while significant enrichment of E2F_Targets, G2M_Checkpoint, mTORC1_Signaling, MYC_Target_V1 pathway was revealed by Hallmark pathway enrichment analysis (Fig. 4, and Table S4). Highlighted, GSEA analysis was employed to compare the differences between high-and low-risk groups, and the differences were majorly enriched in Cell Cycle, Oocyte Meiosis, Homologous Recombination, DNA Replication and P53 Signaling Pathways (Fig. 5, and Table S5), which were proved by plenty of works that significantly associated with tumor proliferation and drug resistance.
We further performed stratified survival analyses to evaluate the prognosticate values of our risk model in different subgroups. According to Fig. 7, in general, the miRNA signature could distinguish the recurrence risk for different age populations (≤ 60 or > 60) and Gleason score populations (≤ 7 or > 7). Although the novel miRNA signature could only indicate the recurrence risk in patients whose PSA was lower than 10 ng/ml (P < 0.001) subgroup, the tendency was satisfied in PSA > 10 ng/ml subgroup, and the failed statistical analyses were potentially due to limited sample size.

Discussion
In the United States, PCa accounts for the majority in newly diagnosed cases and ranks as the third-leading cause of cancer-specific death [37]. At an early stage, most PCa patients could benefit from curative treatment and acquire a relatively good prognosis; however, advanced PCa patients tend to have poor prognoses due to recurrence or distant metastases [38]. Thus, the identification of biomarkers to predict the prognosis of PCa patients is warranted, which could clinically benefit the decision-making process. Nevertheless, few effective biomarkers that could be used for prognosis prediction are currently available. It is clear that miRNAs play pivotal roles among various cellular processes and are also involved in the initiation and progression of various cancers [39]. Here, we established and validated a stable fifteen-miRNA-based signature that could be used to predict the prognosis of PCa patients. The results showed that this classifier effectively assigns PCa patients into high-or low-risk groups, and found that the patients categorized in the high-risk group were more likely to have unfavorable RFS rate. These results were confirmed in the external MSKCC validation cohort. The multivariate analysis found the classifier was an independent risk factor for recurrence of PCa, and even better than Gleason score, PSA level, and pathological T grade. Nomogram analyses found that the classifier adds value to the currently available staging system. In addition, the targeted genes were predicted by three online databases, and the functions of these downstream genes were annotated by GSEA, KEGG, and GO analyses. All these results prove the clinical significance of the miRNA-based signature and predict the underlying mechanisms of how these miRNAs influence the tumor progression.
He et al. [40] found that the increased expression of hsa-miR-21-5p suppresses the proliferation and migration of colon cancer cells in vivo and in vitro, which means the overexpression of hsa-miR-21-5p may promote the prognosis of cancer patients. Elevated expression of hsa-miR-222-3p has been reported in diverse cancers, and it promotes the proliferation, invasion, metastasis, and immune escape of cancer cells [41]. Cheng et al. [42] found that hsa-miR-222-3p serves as an independent risk factor in the recurrence of prostate cancer. Fang et al. [43] found that high expression of hsa-miR-582-3p is associated with unfavorable prognosis of lung cancer patients, and it maintains the stem cell-like traits of lung cancer by activating Wnt/β-catenin signaling. Chen et al. [44] established a miRNA-based prognostic signature and the higher expression of hsa-miR-326 is linked to unfavorable overall survival among nonsmall cell lung cancer (NSCLC) patients. Zou et al. [45] demonstrated that suppressing miR-192-5p expression regulates lung cancer cell proliferation, migration, and invasion by negatively regulating TRIM44 expression. Similar results are obtained by Zheng et al. [46] that miRNA-192 plays an oncogenic role in colon cancer, and simvastatin inhibits cancer cell growth by activating miR-192. Li et al. [47] found that downregulation of miR-181a-5p is observed in aggressive breast and colon cancers, and it inhibits the migration and angiogenesis via downregulation of MMP14; furthermore, the role of miR-181a-5p is consistent in prostate cancer study [48]. For miR-144-3p, researchers found that it promotes the growth and metastasis of papillary thyroid carcinoma by targeting the PAX8 gene [49]. While for other candidates, few studies report their expression and function in cancers, or their prognosis effects are not completely consistent with previous publications.
Highlighted, it is interesting to explore the differences between high-and low-risk PCa populations. GSEA analysis suggested that the differential expressed genes were mostly enriched in Cell Cycle, Oocyte Meiosis, Base Excision Repair, Homologous Recombination, DNA Replication, Spliceosome, Nucleotide Excision Repair, and P53 Signaling Pathways. Cancer is demonstrated as uncontrolled cell proliferation attributing to aberrant activities of various cell cycle-related proteins; thus, regulators of the cell cycle process are considered as therapeutic targets in cancers [50]. For example, PARP inhibitors are highly successful used in treating BRCA1/BRCA2-mutant tumors. DNA replication has been proved to play a critical role in tumor cell proliferation. Mutations in DNA replication genes could cause hereditary forms of colorectal, breast, ovary, and skin cancers [51][52][53][54]. P53 is regarding as an extraordinary multifunctional protein, which participates in the regulation of cell cycle, differentiation, immune response, DNA repair, etc. [55][56][57][58][59] In addition, the localization of p53 at active replication forks and the p53-dependent effects on DNA elongation indicates that the p53 protein involves in the DNA replication process [60,61]. Combined these results, dysregulation of DNA replication and cell cycle processes were potential causes of tumor progression. For other significantly enriched pathways, many reports also indicated their significance in cancer progression. Our results present a novel genetic aspect of prostate cancer recurrence, which will benefit to the personalized treatment for these patients.

Conclusion
Overall, we establish a novel fifteen-miRNAbased RFS prediction signature for PCa patients, which can successfully classify PCa patients into lowand high-risk groups and predict their prognosis. The novel miRNA signature is a reliable tool to assess the prognosis of PCa patients. Currently, our center is ongoing to collect PCa tissues and serum samples and would validate these findings in the near future. manuscript.