Identification of DEGs between normal and HCC tissues
The flowchart of this study was illustrated in Fig. 1. The expression data of 59 CMGs, including 13 well-defined B7-CD28 family costimulatory molecules and 46 TNF family costimulatory molecules genes, were extracted from The Cancer Genome Atlas (TCGA) database after excluding TNFRSF6B for its low expression. The 59 costimulatory molecule-related genes expression levels were compared between HCC tumor and normal tissues, we identified 40 differentially expressed genes (DEGs) (P < 0.05). Among these DEGs, 11 genes (NGFR, TNFSF11, PDCD1LG2, CD274, TNFRSF1A, TNFRSF11B, TMIGD2, FAS, TNFRSF10D, TNFSF13 and CD86) were down-regulated while 29 genes (TNFRSF17, TNFRSF13B, CD276, TNFRSF12A, LTBR, TNFSF18, EDAR, TNFRSF14, ICOSLG, RELT, CD28, ICOS, LTA, TNFRSF21, TNFRSF10C, VTCN1, TNFRSF11A, LTB, EDA2RC, TLA4, TNFSF9, TNFRSF25, PDCD1, CD70, TNFSF4, TNFRSF9, TNFRSF18, TNFSF15 and TNFRSF4) were up-regulated in tumor tissues (Fig. 2A and Additional file 1: Table S1). The correlation among CMGs were analyzed. The relationships between each two of them were almost positively correlated, TNFRSF13C and TNFRSF13B were most correlated (Cor = 0.93) (Fig. 2B). A protein–protein interaction (PPI) network was performed to further explore the interactions among these CMGs (Fig. 2C). The minimum required interaction score was set at highest confidence 0.9. The result showed that TNF, CD28, CD40, CD80, CTLA4, LTA, TNFRSF10A and TNFSF13B were hub genes.
Consensus clustering of prognosis-related CMGs in HCC
A univariate Cox regression analysis was performed to primary selecting of the survival-related genes from 59 CMGs. A total of 11 CMGs were significantly linked to the prognosis of HCC patients (P < 0.05). Two genes were protective genes with hazard ratio (HR) < 1, while 9 genes were risk factors with HR > 1 among them (Fig. 3A). To explore the associations between the expression of 11 prognosis-related CMGs and HCC subtypes, we performed a consensus clustering analysis to classify all HCC patients based on the expression patterns of 11 prognosis-related CMGs. The empirical cumulative density function (CDF) plot is aimed to determine the optimum cluster number (k) from 2 to 9 for the sample distribution to reach an approximate maximum, which means the maximum stability. When k = 2, the consensus matrix showed that HCC patients could be divided into two non-overlapping subgroups with the highest consensus and the clearest cluster partition (Cluster 1: n = 197, Cluster 2: n = 173) (Fig. 3B and Additional file 3: Figure S1A). The PCA were analyzed to verify the reliability between different subgroups, and Cluster 1 and Cluster 2 could gather together and non-overlapped with each other (Fig. 3C). We compared the OS between two subtypes to better understand the relationships between clustering results and survival outcomes, the Kaplan–Meier curves indicated that Cluster 1 had a better prognosis than Cluster 2 (P = 0.002, Fig. 3D). The clinical features and two clusters were compared with a heatmap. The majority of 11 prognosis-related CMGs had higher expression in Cluster 2. These two clusters were different in grade (P < 0.01), but not with tumor stage, age and gender (Fig. 3E). Furthermore, GSEA analysis showed that oncogenic pathways (apoptosis, G2M-checkpoint, IL2-STAT5-signaling, IL6-JAK-STAT3-signaling, inflammatory response, PI3K-AKT-MTOR-signaling, TNFA-signaling via NF-κB, unfolded protein response) were significantly enriched in Cluster 2 (Additional file 3: Figure S1B).
Construction and verification of costimulatory molecule-related risk signature
To narrow down candidate genes and construct the risk signature, the least absolute shrinkage and selection operator (LASSO) Cox regression analysis was performed in the training set, and 6 of 11 prognosis-related CMGs were identified (Additional file 4: Figure S2). The formula to calculate the risk score as follows: risk score = (0.25584 * TNFSF4) + (−0.29002 * TMIGD2) + (0.13379 * TNFRSF4) + (0.22009 * TNFRSF11B) + (0.40207 * TNFRSF11A) + (−0.78099 * CD40LG). We calculated the risk scores for every HCC patient in the training set according to the above formular. Patients in the training set were divided into high- and low-risk groups based on their median risk sore. A significant difference of OS was observed in different subgroups. High-risk patients had a poorer OS than low-risk groups (P < 0.001) (Fig. 4D). Time-dependent receiver operating characteristic (ROC) analysis was used to evaluate the sensitivity and specificity of the risk signature. The areas under the curve (AUC) were 0.756 at 1-year survival, 0.791 at 3-year survival and 0.729 at 5-year survival (Fig. 4E). We ranked the risk scores of patients and analyzed their distribution in the training set (Fig. 4A). The survival status of HCC patients in the training set was showed on the dot plot (Fig. 4B). The heatmap displayed the expressions of 6 prognosis-related CMGs between two risk groups (Fig. 4C).
To determine the stability of the risk signature, we further verified the predictive capability in the test set and total set. The risk score was calculated for each patient in the test set and total set by the same formular obtained from the training set and the patients were classified into high- and low-risk groups. Similarly, Kaplan–Meier survival curve showed significantly difference in two risk groups among the test set. The OS of the high-risk groups was poorer than that of the low-risk groups (P = 0.019) (Fig. 4I). The 1-year AUC was 0.728, the 3-year AUC was 0.644 and the 5-year AUC was 0.654 (Fig. 4J). The survival status, the distribution of the risk score and the expression heatmap of 6 prognosis-related CMGs in the test set were presented in Fig. 4F–H.
The results in the total set were similar to the training set and test set. Patients in the high-risk group had a significantly shorter prognosis than patients in the low-risk group (P < 0.001) (Fig. 4N). In the total set, the AUC was 0.739 at 1 year, 0.708 at 3 years and 0.662 at 5 years (Fig. 4O). The distribution of the risk score, survival status and the expression patterns of 6 prognosis-related CMGs were showed in Fig. 4K–M.
Independent prognostic value of the risk signature
We performed the univariate and multivariate Cox regression analyses to examine whether the risk score could act as an independent prognosis variable of HCC. Univariate Cox regression analysis showed that pathological tumor stage and risk score were significantly associated with the prognosis (Fig. 5A, C, E). Multivariate Cox regression analyses further identified that the risk score was an independent prognostic factor for OS in the training set, test set and total set (Fig. 5B, D, F).
Correlations between the risk signature and clinicopathological factors
The association between the risk model and clinical characteristics were analyzed to further verify the prognostic value of the risk signature in HCC. The heatmap displayed the expressions of 6 prognosis-related CMGs and the distribution of clinicopathological characteristics in high- and low-risk groups. The risk score was significantly correlated with histological grade, pathological T stage and clinical stage (Fig. 6A). Differences in clinicopathological factors between high- and low-risk groups were showed in Fig. 6B. The risk score of patients with stage III-IV was higher than that of patients with stage I-II (P = 0.0014). Patients with G3–4 showed a remarkably higher risk score than those with G1–2 (P < 0.001). The risk score in T3–4 patients was significantly higher than that observed in T1–2 patients (P = 0.0015). Nonetheless, there were no significant differences among age, gender, N stage and M stage. Furthermore, patients were divided into different subgroups according to the following clinical variables, including age (≤ 65 and > 65), gender (female and male), clinical stage (stage I-II and stage III-IV), grade (G1–2 and G3–4), T stage (T1–2 and T3–4), N stage (N0 and N1–3) and M stage (M0 and M1). The correlation between the risk score and clinicopathological features on OS was explored. Survival analysis manifested that higher risk score were correlated with poor prognosis in age (age ≤ 65 with P = 0.005 and age > 65 with P < 0.001), gender (P < 0.001 in male), stage (stage I-II with P = 0.002 and stage III-IV with P = 0.026), grade (P < 0.001 in G1–2 and P = 0.029 in G3–4), T stage (P < 0.001 in T1–2 and P = 0.01 in T3–4), stage N0 (P < 0.001), and stage M0 (P < 0.001) (Fig. 6C).
Construction of a novel nomogram
We constructed a nomogram to predict the survival rates for HCC patients based on age, gender, histological grade, TNM stage, tumor stage and risk score (Fig. 7A). The calibration curves of the nomogram indicated good consistency between the predicted survival rate and actual 1-, 3- and 5-year survival rate (Fig. 7B). The AUCs of risk score and tumor stage were 0.739 and 0.671 in 1-year, 0.698 and 0.680 in 3-year, 0.638 and 0.663 in 5-year, respectively (Fig. 7C). These findings suggested that the risk signature might be reliable to predict the OS for HCC patients.
Functional enrichment analyses based on the risk signature
To explore the potential biological processes for the prognostic risk signature, a total of 474 DEGs were obtained in high- and low-risk groups with the criteria FDR < 0.05 and |log2FC |≥ 1. Among them, 63 genes were downregulated in high-risk group, while 411 genes were upregulated (Additional file 2: Table S2). Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways analyses were carried out based on the DEGs. The results of GO analysis indicated that the DEGs were mainly related to nuclear division. KEGG analysis showed that DEGs were mostly enriched in cell cycle (Additional file 5: Figure S3).
Association between risk signature and tumor immune microenvironment
The differences of tumor-infiltrating immune cells between high- and low-risk groups were analyzed to explore the correlations between the prognostic risk signature and tumor immune microenvironment (TIME). Additional file 6: Figure S4 displayed the abundance of 22 immune cells between high- and low-risk subgroups. Among 22 immune cell types, memory B cells and macrophage M0 were positively correlated with the risk score, while the abundance of naïve B cells, plasma cells and regulatory T cells (Tregs) were significantly enriched in low-risk group (Fig. 8A). Furthermore, the relative proportion of naïve B cells, resting memory CD4 T cells, activated memory CD4 T cells, regulatory T cells, gamma delta T cells, macrophage M1 and resting mast cells were significantly associated with OS (Fig. 8B).
To further explore the relationship between the risk signature and immune status, we performed the expression profiles of 29 immune signature gens sets (16 types of immune cells and 13 immune-related pathways) in high- and low-risk groups using the single-sample gene set enrichment analysis (ssGSEA). The heatmap dis1layed the significant differences in immune status between high- and low-risk samples (Fig. 9A). The low-risk subgroup showed higher levels of infiltration of immune cells and higher activity of immune-related pathways (Fig. 9B). We found that the immune score and stromal score were higher in low-risk groups, while the tumor purity was significantly lower in low-risk subgroup (Fig. 9C).
Differences in molecular characteristics between high- and low-risk groups
We evaluated the relationship between mutation characteristics and the risk signature in TCGA HCC patients with available somatic mutation data. TMB was higher in high-risk patients in spite of no significant difference (Fig. 10B). We also identified the top 20 genes with the highest mutation rates in high- and low-risk subgroups (Fig. 10A). Additionally, we explored the association between immunophenoscore (IPS) and risk signature to predict the potential clinical efficacy and the response to ICI therapy in HCC patient. The IPS, IPS-CTLA4, IPS-PD1-PD-L1- PD-L2, and IPS-PD1-PD-L1-PD-L2-CTLA4 blocker were significantly higher in low-risk group, implying that HCC patients with low-risk score could benefit more from ICI therapy than high-risk patients (Fig. 10C).
Verification of prognostic CMGs
We verified the expression of the CMGs (TNFSF4, TNFRSF4, TMIGD2, TNFRSF11A, TNFRSF11B, CD40LG) in 43 pairs of tumorous and non-tumorous tissue specimens from patients with HCC using qRT-PCR analysis (Fig. 11). The results of qRT-PCR showed that the expression of TNFSF4, TNFRSF4, TNFRSF11A and CD40LG was higher in HCC tissues compared to normal tissues. However, the mRNA expression of TNFRSF11B and TMIGD2 was higher in normal tissues, which was consistent with the results of bioinformatic analysis. The protein expression of the 6 prognostic gene signatures in HCC tissues and normal liver tissues was verified in HPA online database (Fig. 12). The results showed that the expression of TNFRSF11A was increased and the expression of TNFRSF11B was decreased in HCC tissues. TNFRSF4 and TMIGD2 were not detected in hepatocytes. In addition, the staining results of TNFSF4 and CD40LG did not reach significant difference according to HPA database.