Development and validation of a molecular prognostic index of bladder cancer based on immunogenomic landscape analysis

Background Bladder cancer (BCa) is one of the important tumors that have been proven to be treatable with immunotherapy. This study aims to identify and validate a molecular prognostic index of BCa based on immunogenomic landscape analysis. Methods The cancer genome atlas (TCGA) database and immunology database and analysis portal (ImmPort) database were used to identified differentially expressed immune-related genes (IRGs). Prognostic IRGs were screened and protein–protein interaction (PPI) network was constructed. Multivariate Cox analysis was performed to develop a molecular prognostic index of BCa. Internal and external validation were then performed in TCGA cohort and GEO cohort, respectively. Besides, we also explore the relationship between this index and clinical characteristics, immune cell infiltration and tumor microenvironment. Results A total of 61 prognostic IRGs were identified and a molecular prognostic index was developed. The top four hub genes included MMP9, IGF1, CXCL12 and PGF. The difference in overall survival between high-risk group and low-risk group was statistically significant. The area under curve of the receiver operating characteristic (ROC) curve was 0.757, suggesting the potential for this index. Besides, Internal validation using TCGA cohort and external validation using GEO cohort indicated that this index was of great performance in predicting outcome. T cells CD8, T cells CD4 memory activated, T cells follicular helper, macrophages M0, macrophages M2 and neutrophils were significantly associated with prognosis of BCa patients. Female, high grade, stage III&IV, N1-3 and T3-4 were associated significantly with higher risk score compared with male, low grade, stage I&II, N0 and T1-2, respectively. High risk score had a positive association with higher stromal score and ESTIMATE score while high risk score had a negative association with tumor purity. Conclusions This study identified several prognostic immune-related genes of clinical value. Besides, we developed and validated a molecular index based on immunogenomic landscape analysis, which performed well in predicting prognosis of BCa. Further researches are needed to verify the effectiveness of this index and these vital genes.

However, most patients require postoperative adjuvant therapy according to latest guideline [4]. The application of adjuvant chemotherapy significantly improved prognosis in patients with BCa [4,5]. Nowadays, immunotherapy is considered as a nonnegligible treatment for solid malignancies by strengthening the immune system against tumors [6,7]. BCa is one of the important tumors that have been proven to be treatable with immunotherapy [6]. Since 1976, intravesical instillation of Bacillus Calmette-Guérin (BCG) has been widely used in the treatment of BCa [8]. Recently, with the introduction of checkpoint inhibitors into clinical practice, immunotherapy plays a more important role in anti-tumor therapy in BCa patients, particularly those who were refractory to conventional treatment [9]. Therefore, it is of great importance to explore the immune components and relevant mechanisms.
This study aimed to identified the immune-related genes (IRGs), especially prognostic IRGs, in BCa microenvironment using bioinformatics methods. We also explored the underlying clinical application of IRGs on prognostic stratification. Importantly, we constructed a molecular prognostic index based on these IRGs and explored the relationship between the prognostic index and immune cell infiltration, clinical characteristics and tumor microenvironment.

Data acquisition
We downloaded transcriptome data and clinical data of 412 BCa samples and 19 normal samples from the Cancer Genome Atlas (TCGA) database (https ://tcga-data. nci.nih.gov/tcga/). Besides, external validation data were extracted from GSE19423 and GSE32894 dataset in Gene Expression Omnibus (GEO) database (https ://www.ncbi. nlm.nih.gov/geo/). The immune-related genes (IRGs), which have been confirmed to play a vital role in immune activity, were identified from Immunology Database and Analysis Portal (ImmPort) database (https ://www.immpo rt.org/).

Identification of differentially expressed IRGs
The transcriptome data from TCGA database and GEO database was analyzed using R x64 3.6.1 software (https ://www.r-proje ct.org/). The R package limma and Wilcox test were applied to filtrate the differentially expressed IRGs for further analysis. The cut-off value was false discovery rate (FDR) < 0.01 and log2|fold change (FC)| > 1. Importantly, univariate Cox regression analysis was used to extract prognosis-associated differentially expressed IRGs of BCa, and P < 0.05 was considered statistically significant.

Functional analysis of differentially expressed IRGs
Gene ontology analysis (GO) is applied to annotate differentially expressed IRGs. The results of GO analysis were presented by three parts including biological processes (BP), molecular functions (MF), and cellular component (CC). Besides, the Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis was used to perform the pathway enrichment analysis. Both GO analysis and KEGG analysis were conducted using R x64 3.6.1 software.

Construction of protein-protein interaction (PPI) network and hub genes selection
In this step, we constructed PPI network of prognosisassociated differentially expressed IRGs that have been identified in previous analysis. Search Tool for the Retrieval of Interacting Genes (STRING) database (version 11.0; https ://strin g-db.org/cgi/input .pl) was used to evaluate the PPI information. Cytoscape software (version 3.6.1) was used to visualize the PPI networks and select hub genes for further discussion.

Development of the IRGs-based prognostic index
By using multivariate Cox regression analysis, we established a prognostic index based on these differentially expressed IRGs, which has significant association with the survival of BCa patients. Finally, patients were divided into two groups, high-risk group and low-risk group, according to median value of the risk score. Survival analysis and the receiver operating characteristic (ROC) curve was performed to validate the performance of the index. Besides, independent prognostic analysis was used to evaluated whether this index is an independent prognostic factor of overall survival (OS).

Internal and external validation of the IRGs-based prognostic index
We validated this prognostic index in external independent database. We also selected survival-related IRGs from GEO database. The index was then validated externally in GEO cohort. The patients were also divided in high risk group and low risk group. And the survival analysis was performed to conform the usability of this prognostic index. Furthermore, in order to perform internal validation of this index, the patients from TCGA cohort were randomly divided into train group and test group. We then performed survival analysis in train group and test group, respectively.

Evaluation of relationship between this prognostic index and immune cell infiltration, clinical characteristics and tumor microenvironment
The Tumor Immune Estimation Resource (TIMER) database version 2.0 (https ://cistr ome.shiny apps.io/ timer /) was used to estimate the relationship between this index and 22 subtypes of tumor-infiltrating immune cells. Besides, we also explore the relationship between this index and clinical characteristics obtained from TCGA databases including age, gender, grader, stage, T, N and M stage. Tumor microenvironment has been regarded as an important factor which plays a vital role in carcinogenesis. ESTIMATE was an algorithm for estimating immune score, stromal score and tumor purity in tumor microenvironment [10]. In this study, we calculate immune score, stromal score and tumor purity score using ESTIMATE algorithm to explore the relationship between this index and tumor microenvironment. P value < 0.05 was considered statistically significant. Statistical analyses were performed using R software.

Identification of differentially expressed IRGs
A total of 412 BCa samples and 19 normal samples from TCGA were included in this study and 4876 differentially expressed genes (DEGs) between BCa tissue and normal tissue were identified. The clinicopathological characteristic of 412 patients with BCa were showed in Table 1. Then, 2498 IRGs were extracted from ImmPort database, among which 260 differentially expressed IRGs were filtrated for further analysis. The flow diagram of this study was showed in Fig. 1.

Functional analysis of differentially expressed IRGs
GO analysis related to BP revealed that these differentially expressed IRGs were mainly involved in positive regulation of response to external stimulus. GO analysis related to CC showed that these differentially expressed IRGs were mainly enriched in extracellular matrix and receptor complex. GO analysis related to MF demonstrated that these differentially expressed IRGs were involved in receptor ligand activity (Fig. 2a). KEGG analysis showed that cytokine-cytokine receptor interaction was the most important pathway (Fig. 2b).

Identification of prognostic IRGs
A total of 61 prognostic differentially expressed IRGs were identified using univariate Cox model (P < 0.05). There is a significant correlation between these 61 genes and OS; therefore, they were extracted for further study. The most significant genes were presented in Fig. 3.
According to the forest plot of hazard ratios, most of these genes were risk factors for poor prognosis in bladder cancer patients. That is to say, the higher the expression of these genes which were presented by red node, the higher the probability of poor prognosis.

Construction of PPI network based on prognostic IRGs
Proteins related to prognostic differentially expressed IRGs were selected based on STRING and visualized by Cytoscape version 3.6.1 (Fig. 4a). Furthermore, in this network, the top four genes with highest degree scores were selected as hub IRGs including matrix metallopeptidase 9 (MMP9), insulin like growth factor 1 (IGF1),  . 1 The flow diagram of this study C-X-C motif chemokine ligand 12 (CXCL12) and placental growth factor (PGF) (Fig. 4b).

Development of a molecular prognostic index
Then, we developed a molecular prognostic index based on these differentially expressed prognostic IRGs (Table 2). The difference in overall survival between high-risk group and low-risk group was statistically significant (P < 0.05) (Fig. 5a). The area under curve of ROC was 0.757, suggesting the potential for the prognostic index (Fig. 5b).

Internal and external validation of the IRGs-based prognostic index
GEO database was used for an external validation. The difference in OS between high-risk group and low-risk group in GEO cohort was also statistically significant (P < 0.05, Fig. 6a). Internal validation was performed in train group and test group, respectively. The difference in OS between high-risk group and low-risk group was statistically significant both in train group (P < 0.05, Fig. 6b) and test group (P < 0.05, Fig. 6c).

Prognostic evaluation and independent prognostic analysis
The distribution of risk score and survival time were demonstrated in Fig. 7a, b. The expression heatmap of this index was demonstrated in Fig. 7c. Patients with higher risk score were associated significantly with poor prognosis. Univariate and multivariate independent prognostic analysis showed that the risk score was the only independent predictor for bladder cancer, indicating the great performance of this index (P < 0.05, Fig. 8 and Table 3). Female, high grade, stage III and IV, N1-3 and T3-4 were associated significantly with higher risk score compared with male, low grade, stage I and II, N0 and T1-2, respectively (P < 0.05, Fig. 9 and Table 4).

Relationship of the prognostic index with immune cell infiltration and tumor microenvironment
Among the 22 subtypes of tumor-infiltrating immune cells in TIMER version 2.0 database, higher infiltrating percentage of macrophages M0, macrophages M2 and neutrophils were significantly associated with the poor prognosis of BCa, while lower infiltrating percentage of T cells CD8, T cells CD4 memory activated or T cells follicular helper were significantly associated with the poor prognosis of BCa according to the risk score derived from the molecular prognostic index (P < 0.05, Fig. 10a).
Using ESTIMATE algorithm, we found that high risk score had a positive association with higher stromal score and ESTIMATE score while high risk score had a negative association with tumor purity. However, The correlation between immune score and this prognostic index was not significant (P < 0.05, Fig. 10b-e).

Discussion
As one of the important tumors that could benefit from immunotherapy, it is of great importance to explore the specific mechanism involved in immunotherapy of BCa [11,12]. Although there have been several prognostic models predicting the survival outcome of BCa demonstrated by previous study, few studies focus on the immune-related genes related to survival [13,14]. This study comprehensively utilized multiple online databases to identify prognostic differentially expressed IRGs that played a vital role in survival outcome in BCa patients and explored the associated mechanism.
According to the result of univariate Cox analysis, we identified a total of 61 prognostic IRGs, among which the top 4 IRGs (including MMP9, IGF1, CXCL12 and PGF) with greatest degree score in PPI network might be of great significance for patients with BCa. MMP9, a member of the matrix metalloproteinase gene family, could be expressed to dissolve the extracellular matrix components. Fouad et al. [15] revealed that MMP-9 was significantly up-regulated in both blood and urine of BCa patients and was of great discriminatory ability in the diagnosis of BCa. Wong et al. [16] also demonstrated that MMP9 was a potential therapeutic target and prognostic biomarker that contribute to the progression of BCa. There have been several studies confirming the role of IGF-1, an anti-apoptotic peptide, in the progression of BCa. Hursting et al. [17] showed that IGF-1 pathway plays a vital role in bladder carcinogenesis in transgenic mice. Long et al. [18] reported that IGF-1/ERβ signalling plays an important role in promoting cisplatin resistance in BCa cells. However, the role of plasma IGF1 in assessing bladder cancer risk remains controversial. Zhao et al. [19] found that the plasma concentrations of IGF1 was significantly higher in BCa patients and associated with an increased risk of BCa. Nevertheless, recently, a prospective study indicated there was no association between the risk of BCa and IGF1 level in plasma [20]. This study identified IGF1 as an IRG for the first time, which might play an important role in immune response in BCa progression. The role of CXCL12 in BCa has been extensively described. It has been reported that CXCL12/CXCR4 axis plays an important role in tumor angiogenesis, and protein and mRNA levels of CXCL12 are associated with human BCa progression [21]. Batsi et al. [22] revealed that CXCL12 expression has positive association with tumor grade, irrespective of primary BCa or recurrent BCa. Yang et al. [23] demonstrated in their study that the expression of CXCR4 and its ligand CXCL12 might be in connection with depth of invasion and differentiation degree in BCa. Previous studies have showed that PGF contribute greatly to tumor growth and metastasis. Soukup et al. [24] indicated that the urine and plasma concentration of PGF were significantly increased in patients with BCa compared with those without BCa. Loredana et al. [25] summarized that PGF played a vital role in regulating tumor immune microenvironment and promoting tumor immune escape. In this study, from the perspective of tumor immunity, we identified MMP9, IGF1, CXCL12 and PGF as prognostic differentially expressed IRGs for the first time. Therefore, further study is required to explore the role of MMP9, IGF1, CXCL12 and PGF in the immune-related mechanisms and immunotherapy of BCa.
Most importantly, we developed a prognostic index based on immune-related genes, which might contribute to further understanding the specific mechanism of the effectiveness of immunotherapy and predicting   [27] identified an index integrating clinical information, mRNA and miRNA for bladder urothelial carcinoma, whose AUC was distinctly increased compared with that of the RNA-alone index or clinical-alone index. Dyrskjøt et al. [28] prospectively validated a 12-gene   progression score for non-muscle invasive bladder cancer (NMIBC) and found that the prognostic power of this score was superior to histopathological parameters or clinical data. Ingelmo-Torres et al. [29] constructed a predicting model based on two urinary cell microRNAs, miR-140-5p and miR-92a-3p. In this study, we developed a new prognostic index. The overall survival of patients with low risk was significantly increased compared with those with high risk. The area under curve of ROC was 0.757, suggesting the potential for this prognostic index. Further, we performed internal validation using train group and test group in TCGA cohort and external validation in GEO cohort. Both internal validation and external validation suggested the predictive power of this index. Univariate and multivariate independent prognostic analysis demonstrated that the risk score was the only independent predictor for bladder cancer, indicating the great performance of this index. Besides, we investigated whether this index is related to immune cell infiltration, and found that higher infiltrating percentage of macrophages M0, macrophages M2 and neutrophils were significantly associated with the poor prognosis of BCa, while lower infiltrating percentage of T cells CD8, T cells CD4 memory activated or T cells follicular helper were significantly associated with the poor prognosis of BCa. It is reported that M2 macrophages play a vital role in immune responses induced by BCG against BCa [30]. Qiu et al. [31] demonstrated the regulation role of tumor-associated macrophages in BCa cell growth. Xue et al. [32] also indicated that the infiltration of M2 macrophage might be an underlying target of immunotherapy for BCa patients. Previous studies have demonstrated that tumor-infiltrating T cell landscape in bladder cancer would contribute to management decisions making, particularly immunotherapy [33]. Hou et al. [34] also revealed that the expression of PD-1 in T cell subsets provided important prognostic information in patients with BCa. Furthermore, relationship between this index and clinical characteristics were also evaluated. We found that female was associated significantly with higher risk score compared with male. It is reported that women are usually diagnosed with more advanced BCa and experience higher cancer-specific mortality [35], which is consistent with this study. One of the rational explanations is that the liver metabolizes carcinogens differently between male and female [35]. Besides, this study also revealed that patients with higher grade, higher tumor stage, higher N stage and higher T stage experienced significantly lower OS and poor prognosis. The tumor microenvironment consists of stromal cells, immune cells and tumor cells. The higher the composition of immune cells and stromal cells, the lower the proportion of tumor cells. In this study, we revealed that high risk score had a positive association with higher stromal score and ESTIMATE score but a negative association with tumor purity. However, the correlation between immune score and this prognostic index was not significant. Therefore, patients with higher risk score has higher proportion of stromal cells, and lower proportion of tumor cells. These results revealed that this index could serve as an immune status indicator for BCa patients and might contribute to understanding the mechanism of immunotherapy.

Conclusions
Together, this study identified four prognostic hub immune-related genes, including MMP9, IGF1, CXCL12 and PGF, which might play a vital role in bladder cancer development. Besides, we developed a molecular prognostic index based on immunogenomic landscape analysis, which performed well in predicting prognosis of bladder cancer. Further researches are needed to verify the effectiveness of this index and these vital genes.