Expression profile of RNA binding protein in cervical cancer using bioinformatics approach

Background It has been demonstrated by studies globally that RNA binding proteins (RBPs) took part in the development of cervical cancer (CC). Few studies concentrated on the correlation between RBPs and overall survival of CC patients. We retrieved significant DEGs (differently expressed genes, RNA binding proteins) correlated to the process of cervical cancer development. Methods Expressions level of genes in cervical cancer and normal tissue samples were obtained from GTEx and TCGA database. Differently expressed RNA binding proteins (DEGs) were retrieved by Wilcoxon sum-rank test. ClusterProfiler package worked in R software was used to perform GO and KEGG enrichment analyses. Univariate proportional hazard cox regression and multivariate proportional hazard cox regressions were applied to identify DEGs equipped with prognostic value and other clinical independent risk factors. ROC curve was drawn for comparing the survival predict feasibility of risk score with other risk factors in CC patients. Nomogram was drawn to exhibit the prediction model and validated by C-index and calibration curve. Correlations between differentially expressed RNA binding proteins (DEGs) and other clinical features were investigated by t test or Cruskal Wallis analysis. Correlation between Immune and DEGs in cervical cancer was investigated by ssGSEA. Results 347 differentially expressed RBPs (DEGs) were retrieved from cervical cancer tissue and normal tissue samples. GO enrichment analysis showed that these DEGs involved in RNA splicing, catabolic process and metabolism. Cox regression model showed that there were ten DEGs significantly associated with overall survival of cervical cancer patients. WDR43 (HR = 0.423, P = 0.008), RBM38 (HR = 0.533, P < 0.001), RNASEH2A (HR = 0.474, P = 0.002) and HENMT1 (HR = 0.720, P = 0.071) played protective roles in survival among these ten genes. Stage (Stage IV vs Stage I HR = 3.434, P < 0.001) and risk score (HR = 1.214, P < 0.001) were sorted as independent prognostic risk factors based on multivariate cox regression. ROC curve validated that risk score was preferable to predict survival of CC patients than other risk factors. Additionally, we found some of these ten predictor DEGs were correlated significantly in statistic with tumor grade or stage, clinical T stage, clinical N stage, pathology or risk score (all P < 0.05). Part of immune cells and immune functions showed a lower activity in high risk group than low risk group which is stratified by median risk score. Conclusion Our discovery showed that many RNA binding proteins involved in the progress of cervical cancer, which could probably serve as prognostic biomarkers and accelerate the discovery of treatment targets for CC patients. Supplementary Information The online version contains supplementary material available at 10.1186/s12935-021-02319-7.


Introduction
One of the most challenging malignancies is cervical cancer (CC) observing among females worldwide [1]. It is showed that CC led to more than about 311 thousand people death all over the world in statistically in 2018 [2]. One of major reasons for cervical cancer is infection of high-risk human papillomavirus (HPV), although the occurrence of cervical cancer cannot be fully elucidated by HPV infection [3]. Radical hysterectomy, radical radiotherapy and chemotherapy based on cisplatin are major treatment methods for CC patients until now [4]. It is reported that patients with locally advanced cervical cancer had their 5-year overall survival (OS) increased into about 70% after chemotherapy [5]. Nonetheless, the recurrence of cervical cancer after surgery or radiotherapy remains a problem difficult to solve [6]. The circumstance of limited treatment and a poor prognosis is the reality that CC patients with relapse have to face [7,8].
RNA binding proteins (RBPs) belong to one of the crucial series cellular proteins. Their interaction with RNA by means of recognizing special RNA binding domains plays a significant role in varies kinds of post-transcriptional regulation. For example, RNA transport, translation control, intracellular localization, shearing, sequence editing are both under the influence of RNA binding proteins [9]. Former studies have discovered more than 1500 proteins who involved in RNA binding in Homo sapiens genome [10]. There is a significant district in RBPs, which contains 60-100 residues. This district often adopts an αβ topology which assists them to bind the RNA according to concrete nucleic acid sequence [11]. The origination and development of many diseases have been discovered to be correlated with RBPs. For example, spinal muscular atrophy and myotonic dystrophy are two kinds of typical disease [12]. Undoubtedly, the origination and development of cancer has been reported to have something to do with RBPs. For example, HuR, which is a RBP is able to accelerate the proliferation and promote metastasis of gastric cancer [13]. Zhang et al. reported that AGO2 increases oncogenic miR-19b biogenesis by Acetylation which leads to the facilitation of cancer progression [14]. The proliferation of lung cancer cells can be regulated by cancer-associated alternative splicing. This process is inhibited by QKI-5 [15]. ESRP1 accelerates the EMT of ovarian carcinoma cells [16]. All these researches revealed RBPs as important adjustment molecules in the process of cancer development.
Nowadays, FIGO stage serves as a majority tool for doctor to predict the survival of cervical cancer patients in clinical [17]. There is deficiency in FIGO stage system that patients may have different individual survival time even if they are attributed to same FIGO stage [18]. In order to provide doctors with a better prognostic prediction tool for CC patients, more clinical factors should be taken into consideration. Recently, the prognosis model involved with the expression level of RBPs has become popular and been constructed in colorectal cancer [19], hepatocellular cancer [20] and breast cancer [21], etc. So, the prognostic prediction role of RNA binding proteins in CC trigged our interest. To begin with, the differently expressed RNA binding proteins (DEGs, differently expressed genes) were retrieved from gene expression profile of tumor tissues and normal tissues. They were uploaded to STRING database for constructing protein-protein interaction network. A cox regression model for predicting the survival of cervical cancer patients was constructed with DEGs involved in the PPI network prognosis signature. The predict factors involved in this cox regression model had been validated by the Kaplan-Meier analysis. The survival status discrimination efficacy of risk score was compared with other clinical factors by means of ROC curve and quantified by area under the curve (AUC). Moreover, GO and KEGG enrichment analysis was applied to explore the functional pathways that screened DEGs in PPI network and their subnetworks involved in. Finally, we also explored the relationships between the risk scores which was counted by DEGs signature and immune cells or functions.

Acquisition data from GTEx and TCGA dataset
The expression level of genes in normal cervix was downloaded from GTEx database (https:// www. genome. gov/ Funded-Progr ams-Proje cts/ Genot ype-Tissue-Expre ssion-Proje ct), which is a website containing great quantity of gene expression data resourced from healthy people. The expression level of genes in cervical cancer and the corresponding clinical data were downloaded from TCGA database (https:// portal. gdc. cancer. gov/), which is a landmark cancer genomics program. Clinical pathological data of patients from TCGA is available in Additional file 1: Table S1. Gene array data from GTEx and TCGA was normalized by means of limma package from R bioconductor software. Totally 1542 RBPs (RNA binding proteins) [10] were obtained to screen the gene expression profile.

Identification of DEGs (different expression genes, different expression RBPs)
Wilcoxson signed-rank test was applied to identify differentially expressed RNA binding protein genes (DEGs). The cut-off values were set based on left parameters, false discovery rate (FDR) < 0.05 and |log2 fold change (logFC)|> 0.5.

Functional enrichment analysis
Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis were performed by clusterProfiler package [22] in R software. Results was filtered by FDR (false discovery rate) < 0.05, top result were presented and recognized as significant items.

Construct protein-protein interaction network and the subnetwork
The sorted DEGs were applied to construct proteinprotein interaction network by STRING (The Search Tool for the Retrieval of Interacting Genes). The PPI network was visualized by Cytoscape software. The MCODE which is a plug-in for Cytoscape was applied to obtain the first three relevant sub-network modules. The genes in them were applied to perform GO and KEGG enrichment analysis respectively.

Construct cox regression model
The correlation between the overall survival (OS) and differently expressed RNA binding proteins (DEGs) was firstly investigated by univariate cox regression. Variables were screened by p-value (< 0.05) presented from Wald X 2 test. All the significant variables were applied to construct multivariate cox regression model and sorted by AIC value. Then, the expression level of significant DEGs were included into multivariate cox regression with other clinical factors to construct prognostic predict model.

ROC curve analysis
The prediction value of independent risk factors was investigated by Receiver Operating Characteristic (ROC) curve with area under the curve (AUC). ROC curve was draw by survival ROC package in R software, which is designed for survival data. AUC was applied to measure the sensitivity and specificity of prediction variable, which ranges from 0.5 to 1. The larger the AUC is, the better the variable predict the prognosis.

Development of the nomogram
Nomogram was drawn to exhibit the prediction model constructed based on independent prognostic factors sorted by univariate and multivariate cox regression. In order to evaluate the calibration and discrimination of nomogram, calibration curves were plotted and Harrell C-index was calculated. Bootstrapping validation with 100 bootstraps resample was applied to calculate C-index for this nomogram.

Immune and RNA binding protein in cervical cancer
The infiltrating scores of 16 immune cells was measured by the single-sample gene set enrichment analysis (ssGSEA) in the "gsva" R package. The activities of 13 immune-related pathways was evaluated in the same way [23]. The annotated gene set file applied in ssGSEA analysis is provided in Additional file 2: Table S2 [24]. TIMER-2 database (http:// timer. cistr ome. org/) was applied to explore the correlation between prognostic significant RBP genes and six kinds of immune cells (T cell CD8+ , T cell CD4+ , B cell, Neutrophil, Macrophage, Myeloid dendritic).

Experimental validation by RT-qPCR
RT-qPCR experiment was conducted to validate the expression level of five differentially expressed RNA binding proteins retrieved by bioinformatics analysis. CC specimens and correspondent normal cervical tissues come from CC patients who received operation from 2020 to 2021 January in Shanghai east hospital, school of medicine, Tongji University. The clinical pathological data of patients is available in Additional file 3: Table S3. Internal review board of Shanghai East Hospital, school of medicine, Tongji University have approved this study.
Total RNA was retrieved by TRIzol (invitrogen, USA) from CC tissues and correspondent normal tissues. The purified RNA was transcribed into cDNA (Complementary DNA) by PrimeScript ® RT reagent Kit with gDNA (genomic DNA) Eraser (Takara). SYBR master mix kit (vazyme, China) was used to detect the expression level of these DEGs (Differently expressed genes, RNA binding proteins) on the QuantStudio RT-qPCR System (Q6, Applied Biosystems, USA). Endogenous GAPDH (gyceral-dehyde-3-phosphate dehydrogenase) or ACTB (beta actin) was used to normalize the expression level of each genes by 2 −△△Ct . In order to display whether the expression level of gene is up-regulated or down-regulated in tumor tissues comparing with corresponding normal tissues, the relative expression level of each gene was log2 transformed for barplot. The primers were synthesized by Sangon company and Genscript company, China.

Kaplan-Meier analysis of prognostic significant RBPs
The patients suffered from cervical cancer were divided into high expression group and low expression group by means of expression level of each prognostic significant RBPs. The survival curves of high expression group and low expression group were drawn by survival package in R software. Their overall survivals were compared by logrank test.

OncoPrint evaluation of sorted RBPs in cervical cancer by cBioPortal
In order to explore the gene expression variation of prognosis significant RNA binding proteins across cervical cancer tissues from patients respectively. cBioPortal OncoPrint (http:// cbiop ortal. org, accessed on 10 OCT 2021) was applied to draw a graphical summary, in which 278 cervical cancer samples from patients in TCGA database (TCGA, PanCancer Atlas) involved. In integrated graph, cervical cancer patients were represented as columns, prognosis significant RNA binding proteins were represented as rows. Colour codes and glyphs stood for genomic alterations such as missense mutation, truncating mutation, amplification or deep deletion.

Sort DEGs from RNA binding protein gene expression profile
An expression profile dataset was combined with data from GTEx and TCGA database, which included 13 normal cervix samples and 306 cervical cancer samples. The clinical data of the patients was downloaded from TCGA database and integrated into expression matrix by Perl software. We obtained 347 differently expressed RNA bind proteins (DEGs) by comparison of the expression level of RNA bind proteins (DEGs) between cervix tissues and cervical cancer tissues. There were 177 genes down-regulated in tumor samples and 170 genes up-regulated in tumor samples compared with normal samples (Additional file 4: Table S4). The detail of DEGs' expression matrix was presented by heat map and volcano plot ( Fig. 1a, b).

Bio-functional enrichment analysis of DEGs
We conducted GO and KEGG enrichment analysis by clusterprofiler package in R software to evaluate the biological function of our retrieved DEGs. These differently expressed RBPs were divided into up-regulated group and down-regulated group for enrichment analysis individually.
In GO analysis, for BP (biological process) category, downregulated DEGs mainly enriched in RNA splicing, RNA catabolic process and mRNA catabolic process. For CC (cellular components) category, downregulated DEGs mainly enriched in cytoplasmic ribonucleoprotein granule, ribonucleoprotein granule and cytoplasmic stress granule. For MF (molecular function) category, downregulated DEGs mainly enriched in catalytic activity, acting on RNA, single-stranded RNA binding and mRNA 3'-UTR binding. In KEGG pathway analysis, downregulated DEGs mainly enriched in mRNA surveillance pathway, RNA transport and Ribosome ( Fig. 2a-d) (Additional file 5: Table S5, Additional file 6: Table S6).
In GO analysis for up regulated group, biological process (BP) category mainly includes ncRNA processing, RNA phosphodiester bond hydrolysis, and tRNA metabolic process. They have something to do with the process of cervical cancer. Items in cellular component (CC) mainly include ribonucleoprotein granule and preribosome. They involved in RNA binding and protein expression level adjustment, which may help cervical cancer cells survive. Molecular function (MF) category showed that DEGs were mainly enriched in catalytic activity, acting on RNA, double-stranded RNA binding and ribonuclease activity, respectively. They revealed the function of RBP. In result of KEGG analysis for upregulated DEGs, DEGs were enriched in pathways such as Ribosome biogenesis in eukaryotes, mRNA surveillance pathway and RNA transport. These pathways help cancer cells live a better life (Fig. 3a-d) (Additional file 7: Table S7, Additional file 8: Table S8).

Construct Protein-protein interaction (PPI) network
A PPI network was constructed by Cytoscape software. The information of nodes and network was obtained from STRING database according to uploaded DEGs. The PPI network incorporated 2545 edges and 320 nodes (Fig. 4a). The co-expression network was treated by MCODE plug-in for Cytoscape to identify the most correlated three subnetworks (Fig. 4b). Acquired first important crucial module consist of 27 nodes and 335 edges (Fig. 4c). The GO enrichment analysis result shows that the RBPs in the key module 1 were mainly enriched in ribosome biogenesis, preribosome and RNA helicase activity. Moreover, in KEGG analysis they were enriched in Ribosome biogenesis in eukaryotes pathway. The GO and KEGG analysis results of both three subnetworks were displayed in Tables 1 and 2.

Retrieve DEGs related to prognosis
To begin with, we retrieved prognosis related DEGs by Kaplan-Meier analysis and univariate cox regression with Wald X 2 test. In Kaplan-Meier analysis, patients were divided into two groups according to the median of expression level of DEGs. Their survivals were compared by log-rank test. The DEGs were considered significant when p-value of log-rank test is less than 0.05. Those DEGs was verified by univariate cox regression. Variables with Wald X 2 test p-value less than 0.05 was selected. It is validated that 18 DEGs (EIF3C, WDR43, BICC1, HEATR1, PRPF40B, RBM4, RBM38, CLK3, EEF1D, SAMD4A, CTU1, RNASEH2A, HENMT1, ENOX1, FBXO17, SMG8, ZC3HAV1L, NUFIP1) were significantly in statistic with the overall survival of CC patients (shown in Fig. 5a). Moreover, these genes were applied to construct multivariate cox regression model, AIC value was used to sort the variable. Ten DEGs (EIF3C, WDR43, PRPF40B, RBM38, EEF1D, CTU1, RNASEH2A, HENMT1, ZC3HAV1L, NUFIP1) were preserved at last and regarded as prognosis related DEGs ( Fig. 5b and Table 3).
Four genes (WDR43, RBM38, RNASEH2A, HENMT1) played protective roles (HR < 1) among these ten DEGs. The other six genes (EIF3C, PRPF40B, EEF1D, CTU1, ZC3HAV1L, NUFIP1) were presented Finally, the risk score was calculated according to the expression level of these ten genes ( riskscore = h 0 (t)exp( ∑ n j=1 Coef j × X j ) , n = 10, Coef j is the coefficient of each DEG, X j is the relative expression levels of each DEG, h 0 (t) is baseline risk function).

Draw prognostic hazard curves
Prognostic hazard curves was drawn to evaluate the survival time for the patients. It is observed that the survival time diminished with the increasing of risk score for the dead patients (Fig. 6a, b). Furthermore, the quantity of patients alive decreased with the ascend of risk score for patients too. RNASEH2A was shown to be down-regulated in group with high-risk according to the risk heat map. Whereas, CTU1 was regarded as a tumor accelerating gene because it was up-regulated in high-risk group (Fig. 6c).

Prognostic factors and prediction model for OS
The risk score and other clinical factors were combined to construct cox regression model. It is showed in univariate cox regression model that clinical stage and risk score were correlated with overall survival (OS) of CC patients (P < 0.001, P < 0.001, Fig. 7a). Multivariate cox regression validated that clinical stage (Stage IV vs Stage I HR = 3.434, P < 0.001) and risk score (HR = 1.214, P < 0.001) were independent risk factors for survival (Fig. 7b). For the sake of evaluating the discrimination of each predicting factors, ROC curves were constructed in 0.5-year, 1-year, 3-year and 5-year with the prediction factors (age, stage and risk score). Moreover, we assess the feasibility of discrimination of survival or dead patients using the area under curve (AUC) values. ROC curve reveals that the risk score showed a better ability to predict the survival of CC patients (AUC = 0.932, 0.843, 0.805, 0.832 for 0.5-year, 1-year, 3-year and 5-year) than other prediction factors (Fig. 8a-d).

Analyze relationship between clinical features and DEGs predictor
The correlations between the ten prognostic DEGs and clinical features was evaluated by t-test or Kruskal-Wallis test depend on the quantity of categories of clinical features. It showed that the expression level of CTU1 and ZC3HAV1L were significantly different expressed in statistic with each clinical stage (P-values = 0.013 and 0.040 respectively) (Fig. 9a, b). Furthermore, the expression level of CTU1 and ZC3HAV1L were higher in the advanced T stage patients (P-values = 0.009 and < 0.001 respectively), implying their dangerous roles with the development of cervical cancer (Fig. 9c, d). The expression level of CTU1 was significantly associated with N stages which implying that its expression levels increased with progression of lymph node metastasis (Fig. 9e). The expression level of EEF1D increased with advanced M stage, which implied that it's expression level may be correlated with the organ metastasis ability of cervical cancer (Fig. 9f ). The expression level of CTU1, RBM38, WDR43 varies with different pathology of cervical cancer (Fig. 9g-i). In addition, the expression level of EEF1D, RBM38 and WDR43 ascended with higher tumor pathology grade (p-value = 0.019, 0.020, 0.034) (Fig. 9j-l).

Establish and validate the nomogram
Three prognostic indicators including age, clinical stage and ten prognostic prediction RBPs were selected to establish the nomogram (Fig. 10a). The discrimination and calibration of nomogram was validated based on C-index and calibration curve. Analysis result revealed that the C-index of the constructed nomogram is 0.808 and the 1-year, 3-year and 5-year calibration curve in Fig. 10b-d demonstrated that the nomogram can partially predict the prognosis of CC patients.

Enrichment analysis of immune cell and function
SsGSEA R package was used to investigate the enrichment scores of 16 immune cell subpopulations and their 13 correlated immune functions. It is revealed that 5 kinds of immune cells (such as B cells, iDCs, mast cells, NK cells, pDCs) caught a lower score in high risk group than low risk group (Fig. 11a). What is more? The scores of the 2 types immune functions, such as HLA, Inflammation-promoting were significantly higher in low-risk group. Their enrichment scores suggested the immunological functions of high risk group should be injured more than low risk group classified by expression level of prognostic DEGs (Fig. 11b).

Kaplan-Meier analysis of prognosis significant RBPs
It is displayed in kaplan-Meier curves that higher expression level of EEF1D, CTU1, EIF3C, WDR43, NUFIP1, ZC3HAV1L and PRPF40B indicated a lower overall survival of cervical cancer patients. Nonetheless, higher expression level of HENMT1, RBM38 and RNASEH2A showed a better overall survival of cervical cancer patients. All of the overall survival differences between higher expression level group and lower expression level group of ten prognostic RBPs is significant in statistic which was verified by log-rank test (p < 0.05). (Fig. 13a-j).

OncoPrint analysis in cBioPortal
Gene expression variation of prognosis significant RNA binding proteins was explored by cBioportal tool with data of 178 CC tumors from patients in TCGA database. The clinical features of these patients were listed in Additional file 10: Table S10. OncoPrint analysis revealed that missense mutation was identified in WDR43, RBM38, HENMT1, EIF3C, PRPF40B, CTU1 and NUFIP1. Truncating mutation was discovered in WDR43 and PRPF40B. Amplification was seen in WDR43, RBM38, RNASEH2A, HENMT1, PRPF40B, EEF1D and CTU1. Deep deletion appears in RNASEH2A, HENMT1 and NUFIP1 (Fig. 14).

Discussion
Nowadays, malignant tumor has become one of the greatest intimidation to human health, which has exceed the cardiovascular disease [25]. Cervical cancer has become the second most common malignancies among females all over the world [26]. Especially, in developing countries, where it is not popular for females to take part in cervical screening, cervical cancer posed a greater threat to woman than developed country [27]. Cervical intraepithelial neoplasia (CIN) was recognized as the precursor lesions for cervical cancer. Persistent infection of human papillomaviruses (HPVs) is one of majority  [29]. Secondly, RNA binding proteins involve in the process, editing, stability maintenance, transportation and translation of message RNA [10,30]. Recently, microarray and RNA sequencing technologies have emerged as favorable tools for scientists to investigate the modification of cell's gene or gene transcript in the development of cancer [31].
There are few prognosis predictive tools for clinical doctor to evaluate the survival of patients suffer from cervical cancer. The most widely used prognosis predictive tools is the international federation of gynecology and obstetrics (FIGO) staging system [17]. Nonetheless, its degree of accuracy remained to be improved. So, more prognosis markers are required for constructing a better prognosis model. It is popular for scholars to excavate potential cervical cancer prognosis related factors. Yang et al. discovered nine prognosis related genes which play significant role in cervical cancer immune environment [32]. Chen uncovered six immune related long noncoding RNA and constructed a prognostic predictive model for CC patient [33]. Qin recognized DSG2 as a biomarker which could predict the prognosis of early-stage cervical cancer [34]. Gao et al. reported a sample of CC prognosis biomarkers with four microRNA and seven hub genes [35]. The research of role that microRNAs play in the development of cervical cancer has been extended to exosomes [36]. They acted as signal transmission molecular performing genetic exchange between cells or took part in the development of chronic inflammation after HPV infection, which indicated their potential role as prognosis biomarker in cervical cancer [37]. In general, microRNA is a member of non-coding RNA family, whose role has been explained in many aspect of CC development such as lymphatic invasion, distant metastasis and angiogenesis [38][39][40]. Both of these study provided us with innovative vision of prognosis prediction of cervical cancer patients. Follow the research idea of former study, we innovative proposed a sample of RNA binding proteins involved in the progress of cervical cancer which might be valuable for CC diagnose or treatment.
In this study, we applied gene expression level data resourced from RNA sequencing technology and the clinical data of CC patients to explore the cervical cancer correlated RNA binding proteins. The RNA sequencing data of 306 cervical cancer tissues and 13 normal cervix tissues form GTEx and TCGA databases was integrated to analyze the expression profile of differently expressed RNA binding proteins (also called DEGs in this article, Differently expressed genes) in cervical cancer. 347 DEGs was retrieved by Wilcoxon sum-rank test, of which 177 DEGs were down regulated in tumor samples and 170 DEGs were up regulated in tumor samples. The functional enrichment analysis of GO and KEGG were performed for the downregulated and upregulated DEGs respectively. The PPI network was constructed for sorting the candidate genes of prognostic prediction model by STRING database. Moreover, this DEGs was screened by cox regression with Wald X 2 test and Kaplan-Meier analysis with log-rank test. Among these DEGs, WDR43, RBM38, RNASEH2A and HENMT1 with HR < 1 played a protective role in survival. Other six genes (EIF3C, PRPF40B, EEF1D, CTU1, ZC3HAV1L, NUFIP1) were considered as risk factors with HR > 1. The nomogram was drawn to present the prognostic prediction model with FIGO stage and RBPs predictor. It was validated by   Fig. 14 The variation of prognosis related RNA binding proteins' expression in cancer tissues from CC patients were revealed by oncoprint analysis. Rows stand for genes and columns stand for CC patients. Missense mutation, truncating mutation, amplification and deep deletion are represented by glyphs and color codes C-index and calibration curve subsequently. In addition, the enrichment analysis of immune cell and function was performed by ssGSEA package in R software. We investigated the biological functions of These DEGs by GO analysis. To begin with, the enrichment of cell components was located in the ribosome, cytoplasmic ribonucleoprotein granule and the ribonuclease. They play crucial roles in transmission of genetic information from DNA to protein. Protein was synthesized in ribosome by translating the coding information from RNA. The mutation of ribosomal protein may exert an influence on degradation of p53 protein which involved in the process of many kinds of cancer, such as endometrial cancer, T-cell acute lymphoblastic leukemia, chronic lymphocytic leukemia and colorectal cancer [41]. Many kinds of disease have been reported having something to do with RNA processing or RNA metabolism, which exerted influence on RNA translation [42][43][44]. The forming of ribonucleoprotein complexes has been recognized as the result of interaction of RNA and RBPs. They sustain the stability of target message RNAs, after which the efficiency of mRNA translation is promoted. For example, oncogenic RNA binding protein SRSF1 is reported to accelerate the proliferation of lung cancer cells by strengthening the message RNA stability of DNA ligase 1 [45]. What is more, ribonucleoprotein granule was discovered as a crucial region for protein synthesis. The development of cancer is affected by the modification of ribonucleoprotein, because of its significant role in RNA translation [41]. Moreover, the category of molecular function in GO analysis revealed the interactions of RNA and proteins such as RNA methyltransferase activity. RBPs have been discovered to bind with many kinds of RNA such as pre-mRNA, snRNA, tRNA and mRNA. The regulation of various enzyme was also displayed in GO analysis such as endoribonuclease, ribonuclease and nuclease. They are correlated to synthesis or repair of DNA and metabolism of RNA. For example, in the field of correlation between cervical cancer and RNA methylation. Pan et al. developed a prognostic prediction model for cervical cancer patients based on m6A RNA methylation regulator [46]. While, most of research concentrate on the methylation of protein or DNA such as the promoters of genes instead of RNA. The underlying mechanism of RNA methylation and CC remains to be revealed. Finally, in term of biological process category of GO analysis, differently expressed RBPs have something to do with the processing of both coding RNA and non-coding RNA such as rRNA and tRNA. Both of RNA splicing and metabolism were adjusted by these differently expressed RBPs. Our result was consistent with the consensus reached before. It is reported that RNA binding protein (RBP) quaking (QKI) was able to interact with the QKI response elements (QREs) in SLC26A4 gene introns, which lies in the 3'UTR (3' untranslation region) of mRNA after transcription, thereby promoting circSLC26A4 biogenesis. CircSLC26A4 promotes the proliferation of cervical cancer cells in both vivo and vitro [47]. The RBP HuR was discovered to promote the growth of cervical cancer cells by interaction with the 3'UTR of RBP nucleolin (NCL) mRNA, which specifically promoted the translation of NCL without the alteration of NCL mRNA levels [48]. In other kind of cancer, RBP Musashi1 (Msi1) promoted the proliferation of colon cancer cells by target the 3'UTR of p21(cip1) [49]. Then, the items in KEGG pathway analysis suggested that the origination and development of cervical cancer is regulated by RBPs through mRNA surveillance pathway, RNA transport and RNA degradation. The underlying correlation between RBPs and signal pathways should be under research further.
The relationships between many RBPs and cancer has been reported by former studies which were consist with our study. We discovered that CTU1 is a risk factor for CC patients. It has been reported that CTU1/2, which is partner enzymes in U34 mcm 5 s 2 -tRNA modification, sustains metastasis and invasion of breast cancer [50]. Rapino et al. reported that the inhibition of CTU1 and proteins synergizing with it could kill melanoma cells [51]. Zhang et al. identified the copy number amplifications of CTU1 in 25% of myxopapillary ependymomas by means of whole exome sequencing [52]. CTU1 has been identified as one of prognostic predictors for prostate cancer and bladder cancer [53,54]. We also identified NUFIP1 as a risk factor of CC patients. The forming of ETV6-NUFIP1 fusion gene has been reported as a potential cause of acute lymphoblastic leukemia in Mexico [55]. Deshpande et al. reported that NUFIP1 had something to do with genome stability maintenance [56] which may help cancer cells survive the pressure from environment. Mutated genes NUFIP1 had a higher level of expression in metastasis tumor than primary tumor in neuroblastoma indicating its oncogenic driver role [57]. However, the potential role of NUFIP1 in the process of cervical cancer development remains to be revealed.
The risk score calculated by expression level of DEGs was demonstrated to be a risk indicator. Patients in high risk group shows a significant lower survival than low risk group. ROC curve for risk factors suggested that risk score predicted the prognosis better than other factors which may be valuable in clinical application. It suggested that more precise therapeutic strategy should be applied to CC patients with higher risk score. At last the expression of each DEGs were analyzed with patients' clinical features. CTU1 and ZC3HAV1L were significantly associated with clinic FIGO stage and T stage.