Comprehensive analysis of LDHAP5 pseudogene expression and potential pathogenesis in ovarian serous cystadenocarcinoma

Background: We aimed to identify differentially expressed pseudogenes and explore their potential functions in four common types of gynecological malignancy (cervical squamous cell carcinoma, ovarian serous cystadenocarcinoma, uterine corpus endometrial carcinoma, and uterine carcinosarcoma) using bioinformatics technology.
Materials and methods: We identified up-regulated and down-regulated pseudogenes and built a pseudogene-miRNA-mRNA regulatory network from public datasets to explore their potential functions in carcinogenesis and cancer prognosis.
Results: Among the 63 up-regulated pseudogenes identified, LDHAP5 demonstrated the greatest potential as a candidate pseudogene due to its significant association with poor overall survival in ovarian serous cystadenocarcinoma. KEGG pathway analysis revealed that the target mRNAs of LDHAP5 were significantly enriched in MicroRNAs in cancer, Pathways in cancer, and the PI3K-AKT signaling pathway. Further analysis revealed that EGFR was the potential target mRNA of LDHAP5, which may play an important role in ovarian serous cystadenocarcinoma.
Conclusions: LDHAP5 was associated with the occurrence and prognosis of ovarian serous cystadenocarcinoma, and thus shows potential as a novel therapeutic target against this cancer.


Background
Gynecological malignancies account for a large proportion of tumors in women and seriously endanger female health. It is estimated that approximately 13,800 new cases of uterine cervical cancer, 65,620 cases of uterine corpus cancer, and 21,750 cases of ovarian cancer will occur in the United States in 2020, with 4,290, 12,590, and 13,940 estimated deaths, respectively [1]. Advanced gynecological malignancies usually have a poor prognosis because effective treatments for controlling distant metastasis are lacking [2]. Moreover, most current clinical drugs are non-specific, and their therapeutic effects are limited [3]. Therefore, the identification of novel biomarkers of gynecological tumors to improve drug efficacy and prolong survival remains urgent.
The term pseudogene was first coined by Jacq et al. [4]. Pseudogenes usually originate from paralogous functional genes ("parent genes"), but have lost the capacity to encode functional proteins due to the accumulation of mutations (e.g., frameshift mutations, premature or delayed stop codons) [5]. Pseudogenes initially received little attention until PTEN pseudogene 1 (PTENP1) was found to share the same microRNA response elements (MREs) as its homologous functional parent gene, PTEN [6].

Open Access. Cancer Cell International. *Correspondence: pengwu8626@tjh.tjmu.edu.cn. 1 Cancer Biology Research Center (Key Laboratory of the Ministry of Education), Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China. Full list of author information is available at the end of the article.

With the advancement of next-generation sequencing (NGS), approximately 20,000 pseudogenes have been discovered in the human genome, and the role of pseudogenes as long non-coding RNAs (lncRNAs) in the development of disease has been revealed [7][8][9]. Current research suggests that pseudogenes mainly regulate gene expression at the post-transcriptional level through two pathways [10]. Firstly, pseudogenes can act as competitive endogenous RNAs (ceRNAs) that compete with the coding gene for miRNA binding, thereby positively regulating gene expression [11][12][13]. For example, PTENP1 can competitively bind miRNA-17, miRNA-21, miRNA-19, and other miRNAs through the ceRNA mechanism, thereby increasing expression of its parent gene (PTEN) by preventing miRNA-induced degradation [6]. Secondly, pseudogenes can play a negative regulatory role, whereby they compete with their parent genes for RNA-binding proteins (RBPs), destabilizing the parent transcripts and decreasing parent gene expression [14].
In the current study, we identified differentially expressed pseudogenes in four gynecological malignancies using the pseudogene database dreamBase, and then constructed a pseudogene-miRNA-mRNA regulatory network to further explore their potential functions and mechanisms in gynecological malignancies.

Screening for dysregulated pseudogenes in four gynecological malignancies
We obtained RNA-seq data of pseudogenes in 32 types of human cancer from the online database dreamBase (http://rna.sysu.edu.cn/dreamBase/pancancer.php?SClade=mammal&SOrganism=hg38) [15]. |Log2FC| > 2.0 was set as the cutoff to identify differentially expressed pseudogenes. R v3.5.1 and Excel 2016 were used to further analyze their expression landscape.
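The |Log2FC| > 2.0 cutoff described above can be sketched as follows. This is a minimal illustration, not the dreamBase pipeline: the pseudogene names, the expression values, and the pseudocount of 1.0 (added to avoid division by zero) are all assumptions made for the example.

```python
import math

# Hypothetical mean expression per pseudogene: (tumor, normal).
# Names and values are placeholders, not data from the study.
expression = {
    "PSEUDO_A": (40.0, 5.0),
    "PSEUDO_B": (3.0, 3.5),
    "PSEUDO_C": (0.5, 6.0),
}

PSEUDOCOUNT = 1.0  # assumption: not stated in the paper
CUTOFF = 2.0       # |log2FC| > 2.0, as stated in the text


def log2_fold_change(tumor, normal, pseudocount=PSEUDOCOUNT):
    """log2 ratio of tumor to normal expression, with a pseudocount."""
    return math.log2((tumor + pseudocount) / (normal + pseudocount))


def classify(expr, cutoff=CUTOFF):
    """Split pseudogenes into up- and down-regulated lists by |log2FC|."""
    up, down = [], []
    for name, (tumor, normal) in expr.items():
        lfc = log2_fold_change(tumor, normal)
        if lfc > cutoff:
            up.append(name)
        elif lfc < -cutoff:
            down.append(name)
    return up, down


up_regulated, down_regulated = classify(expression)
print("up:", up_regulated, "down:", down_regulated)
```

A pseudogene counts as dysregulated across all four malignancies only if it passes this filter in each cancer type separately, which amounts to intersecting the four resulting lists.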

Prognostic analysis of up-regulated expressed pseudogenes
Gene Expression Profiling Interactive Analysis (GEPIA) (http://gepia.cancer-pku.cn/) was used to evaluate the prognostic value (overall survival) of up-regulated pseudogenes in 32 kinds of common human cancer [16]. The group thresholds were as follows: the group cut-off was 'Median', the 'cutoff-high' and 'cutoff-low' were 50%, axis units were 'Months', and P value < 0.05 was considered statistically significant.
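To make the 'Median' group cut-off concrete, the sketch below splits patients into high- and low-expression groups at the median and computes a Kaplan-Meier survival estimate for each group. It does not reproduce GEPIA itself: the patient records are invented, the handling of values equal to the median is an assumption, and the log-rank test that yields the P value is omitted.

```python
from statistics import median

# Hypothetical records: (expression, follow-up in months, event),
# where event is 1 for death and 0 for censoring.
patients = [
    (8.1, 12, 1), (7.4, 20, 1), (6.9, 35, 0),
    (2.2, 40, 1), (1.9, 55, 0), (1.1, 60, 0),
]


def split_by_median(records):
    """Split records into high and low groups at the median expression.

    Assumption: values equal to the median go to the low group.
    """
    cutoff = median(r[0] for r in records)
    high = [r for r in records if r[0] > cutoff]
    low = [r for r in records if r[0] <= cutoff]
    return high, low


def kaplan_meier(records):
    """Return [(time, survival probability)] at each observed death time."""
    curve = []
    survival = 1.0
    event_times = sorted({t for _, t, e in records if e == 1})
    for t in event_times:
        at_risk = sum(1 for _, u, _ in records if u >= t)
        deaths = sum(1 for _, u, e in records if u == t and e == 1)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


high_group, low_group = split_by_median(patients)
km_high = kaplan_meier(high_group)
km_low = kaplan_meier(low_group)
print("high:", km_high)
print("low:", km_low)
```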

Screening for pseudogene-regulated miRNAs and miRNA-target mRNAs
The public online databases starBase v-2.0 and miRTarBase were used to identify pseudogene-binding miRNAs and miRNA-target mRNAs, respectively [17,18]. The pseudogene-miRNA-mRNA network was constructed using Cytoscape v-3.7.2 [19].
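Conceptually, the network is a join of two interaction tables on the shared miRNA. The sketch below illustrates that join under stated assumptions: the miRNA and gene names are placeholders rather than results from starBase or miRTarBase, and the edge list is a plain tuple format chosen here for illustration, not a Cytoscape file format.

```python
from collections import defaultdict

# Hypothetical interaction pairs standing in for database exports.
pseudogene_mirna = [("LDHAP5", "miR-A"), ("LDHAP5", "miR-B")]
mirna_mrna = [
    ("miR-A", "GENE1"), ("miR-A", "GENE2"),
    ("miR-B", "GENE2"), ("miR-C", "GENE3"),
]


def build_network(pg_mir, mir_mrna):
    """Return sorted (source, target, edge type) tuples.

    Only miRNAs bound by a pseudogene are followed to their mRNA targets.
    """
    targets = defaultdict(set)
    for mirna, mrna in mir_mrna:
        targets[mirna].add(mrna)

    edges = set()
    for pseudogene, mirna in pg_mir:
        edges.add((pseudogene, mirna, "pseudogene-miRNA"))
        for mrna in targets.get(mirna, ()):
            edges.add((mirna, mrna, "miRNA-mRNA"))
    return sorted(edges)


network = build_network(pseudogene_mirna, mirna_mrna)
for source, target, kind in network:
    print(source, target, kind, sep="\t")
```

In this example miR-C is dropped because no pseudogene binds it, so GENE3 never enters the network.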

KEGG pathway and gene ontology (GO) enrichment analysis of target mRNAs
The list of miRNA-target genes was imported into STRING v-11.0, and the top five significantly enriched GO terms and KEGG pathways were selected according to their false discovery rate (FDR) values and visualized using GraphPad Prism version 6.02 [20].
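Selecting the top five terms by FDR is a sort followed by a slice. The sketch below assumes the enrichment results are already available as (term, FDR) pairs. The term names and FDR values are invented, and the 0.05 significance threshold is an assumption, since the text does not state one for this step.

```python
# Hypothetical enrichment results: (term, false discovery rate).
enrichment = [
    ("Term A", 3.2e-9),
    ("Term B", 4.1e-4),
    ("Term C", 0.21),
    ("Term D", 7.5e-12),
    ("Term E", 1.0e-3),
    ("Term F", 2.6e-6),
    ("Term G", 0.03),
]


def top_terms(results, n=5, fdr_threshold=0.05):
    """Return the n significant terms with the smallest FDR.

    fdr_threshold is an assumed cutoff, not one taken from the paper.
    """
    significant = [r for r in results if r[1] < fdr_threshold]
    return sorted(significant, key=lambda r: r[1])[:n]


top_five = top_terms(enrichment)
for term, fdr in top_five:
    print(f"{term}\t{fdr:.2e}")
```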

Construction of protein-protein interaction network and identification of hub genes
STRING v-11.0 was used to construct the protein-protein interaction network, which was then visualized with the Centiscape plugin of Cytoscape v-3.7.2 [19][20][21]. The top 10 hub genes were identified according to their Degree unDir values.
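Ranking by undirected degree means counting, for each protein, how many distinct interaction partners it has. The sketch below shows that calculation on an invented edge list. It is an illustration of the degree measure only, not of the Centiscape plugin, and breaking ties alphabetically is an assumption.

```python
from collections import defaultdict

# Hypothetical undirected protein-protein interactions.
interactions = [
    ("A", "B"), ("A", "C"), ("A", "D"),
    ("B", "C"), ("D", "E"), ("B", "A"),  # ("B", "A") duplicates ("A", "B")
]


def undirected_degree(edges):
    """Map each node to its number of distinct neighbors, ignoring self-loops."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue
        neighbors[u].add(v)
        neighbors[v].add(u)
    return {node: len(adjacent) for node, adjacent in neighbors.items()}


def top_hubs(edges, n=10):
    """Return the n highest-degree nodes; ties are broken alphabetically."""
    degree = undirected_degree(edges)
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


hubs = top_hubs(interactions, n=2)
print(hubs)
```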

Hub gene expression and mutation analysis
Hub gene expression and mutations in ovarian serous cystadenocarcinoma were analyzed using the online cBioPortal database [22]. A total of 489 patients (TCGA, Nature 2011) with ovarian serous cystadenocarcinoma were selected for further analysis. The selected genomic profiles were as follows: 'Mutations'; 'Putative copy-number alterations (GISTIC)'; and 'mRNA/miRNA expression Z-scores (all genes)', with a Z-score threshold of ± 2. Finally, the OncoPrint was generated following the online instructions of cBioPortal.
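The ± 2 Z-score threshold flags samples whose expression of a gene lies more than two standard deviations from the mean. The sketch below illustrates that rule with invented values. It standardizes against all samples using the population standard deviation, which is an assumption: cBioPortal may use a different reference population, so this should not be read as its exact calculation.

```python
from statistics import mean, pstdev

# Hypothetical expression values of one gene across ten samples.
expression = {f"S{i}": 1.0 for i in range(1, 10)}
expression["S10"] = 10.0

THRESHOLD = 2.0  # Z-score threshold of +/- 2, as stated in the text


def z_scores(values):
    """Standardize values against the mean and population SD of all samples."""
    mu = mean(values.values())
    sigma = pstdev(values.values())
    return {sample: (x - mu) / sigma for sample, x in values.items()}


def call_alterations(values, threshold=THRESHOLD):
    """Label samples as 'high' or 'low' when |Z| exceeds the threshold."""
    calls = {}
    for sample, z in z_scores(values).items():
        if z > threshold:
            calls[sample] = "high"
        elif z < -threshold:
            calls[sample] = "low"
    return calls


alterations = call_alterations(expression)
print(alterations)
```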

Identification of potential target gene of LDHAP5
Pearson correlation analysis between the expression of LDHAP5 and that of the top 10 hub genes in ovarian serous cystadenocarcinoma was performed using GEPIA [16]. Kaplan-Meier overall survival for the target genes was analyzed using Kaplan-Meier Plotter [23]. The mRNA expression levels of the 10 hub genes in TCGA patients were further examined using the Oncomine database [24].
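For reference, the Pearson coefficient used in the correlation step can be computed directly from its definition. The sketch below uses invented paired expression values and plain Python, so it shows the statistic itself rather than how GEPIA implements it.

```python
import math

# Hypothetical paired expression values for a pseudogene and one hub gene.
pseudogene_expr = [1.0, 2.0, 3.0, 4.0, 5.0]
hub_gene_expr = [2.0, 4.0, 5.0, 4.0, 5.0]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(var_x * var_y)


r = pearson_r(pseudogene_expr, hub_gene_expr)
print(f"r = {r:.3f}")
```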

Identification of dysregulated pseudogenes in four common gynecological malignancies
According to epidemiological statistics, cervical squamous cell carcinoma, ovarian serous cystadenocarcinoma, uterine corpus endometrial carcinoma, and uterine carcinosarcoma remain lethal diseases in women [1]. To explore the potential role of pseudogenes in carcinogenesis and cancer prognosis in these four gynecological malignancies, we used the public dreamBase database to identify differentially expressed pseudogenes. As shown in Fig. 1a and Table 1, preliminary screening identified 63 pseudogenes that were up-regulated in all four gynecological malignancies, and none that were down-regulated in all four. We then measured the expression levels of the 63 up-regulated pseudogenes in 32 types of human cancer (Fig. 1b). After removal of pseudogenes that were not highly expressed in the 32 types of human cancer, 40 pseudogenes were identified as playing potential roles in gynecological malignancies.

Prognostic analysis of up-regulated pseudogenes in 32 types of human cancer
We next explored the prognostic value of the 40 up-regulated pseudogenes in the 32 kinds of human cancer using GEPIA. As shown in Fig. 2, KRT8P3, KRT8P45, and LDHAP5 predicted poor overall survival in ovarian serous cystadenocarcinoma (HR = 1.3, P = 0.046; HR = 1.3, P = 0.019; HR = 1.3, P = 0.03, respectively), and FTLP14 predicted unfavorable prognosis in uterine corpus endometrial carcinoma (HR = 2.6, P = 0.018). No other pseudogenes were significantly correlated with poor prognosis in the four types of gynecological malignancies.

KEGG pathway and gene ontology (GO) enrichment analysis of miRNA target mRNAs
The 148 miRNA target genes were imported into STRING v-11.0, with GO (Fig. 3c). These findings confirmed that the LDHAP5 pseudogene may mediate the occurrence and progression of ovarian serous cystadenocarcinoma.

Discussion
With deepening research, we continue to gain a better understanding of pseudogenes. Currently, there are two major pseudogene classifications. Firstly, pseudogenes can be divided into three categories based on differences in structure and origin: duplicated, unitary, and processed pseudogenes. Duplicated pseudogenes arise from mutations in the coding or regulatory region of a gene copy generated by tandem duplication of genomic DNA or unequal chromosomal crossing-over [25]. Unitary pseudogenes cannot be transcribed or translated because of spontaneous mutations in the coding or regulatory regions of a single-copy gene that previously had coding function [26]. Duplicated and unitary pseudogenes are collectively called unprocessed pseudogenes. Processed pseudogenes are formed when mRNA transcripts are reverse-transcribed into cDNA and randomly integrated into the genome, and they lose normal function due to improper insertion sites or sequence mutations [27,28]. Secondly, pseudogenes can be classified by function into pseudogenes that can be transcribed, pseudogenes that cannot be transcribed, and pseudogenes that can encode short-chain peptides or truncated proteins. These pseudogenes play important roles in carcinogenesis and cancer prognosis [29][30][31]. Centered on the ceRNA hypothesis, our research focused on pseudogenes that can be transcribed into RNA. We used the pseudogene-miRNA-mRNA regulatory network to identify pseudogenes that may play potential roles in common gynecological malignancies and to explore their related mechanisms.
The initial goal of our study was to discover pseudogenes that were differentially expressed in four common gynecological malignancies. However, after Kaplan-Meier survival analysis, we found only three significantly up-regulated pseudogenes that predicted poor prognosis in ovarian serous cystadenocarcinoma and one in uterine corpus endometrial carcinoma. We selected LDHAP5 as the candidate pseudogene because it had corresponding miRNAs. Two reasons may account for the small number of pseudogenes identified. Firstly, many pseudogenes remain unidentified. Initially, pseudogenes were considered "junk" or "fossil" DNA, and many methods were developed to avoid their detection [32][33][34][35][36]. Secondly, the current ceRNA hypothesis is still incomplete, and further analysis is needed to build a more comprehensive regulatory network [37].
In our study, 148 potential target mRNAs were identified. Functional enrichment analysis showed the top five significantly enriched gene sets were MicroRNAs in cancer (hsa05206), Pathways in cancer (hsa05200), PI3K-AKT    [38,39]. More significantly, studies have shown that EGFR is dysregulated in many solid tumors, and that PI3K-AKT signaling can act as a downstream regulatory pathway of EGFR to mediate the occurrence and progression of disease, as confirmed in many cancers [40,41]. Our research has several limitations. Specifically, our conclusions are primarily based on the analysis of existing databases. To further confirm the role of the LDHAP5 pseudogene in vivo and in vitro, we need to construct ovarian cancer cell lines that differentially express LDHAP5 and to verify our findings in clinical pathological specimens from ovarian cancer patients. EGFR antagonists (e.g., gefitinib, lapatinib, erlotinib) have been used in a variety of cancers, including pancreatic, small cell lung, and colorectal cancer [42][43][44]. If our findings are validated, such agents may also prove applicable to ovarian cancer in the future. With continuing research, more pseudogene functions and corresponding mechanisms will be revealed, which could help in the identification of novel biomarkers, the design of specific drugs, and the adoption of personalized treatment in the future.

Conclusions
This study is the first to report on the high expression of the LDHAP5 pseudogene in ovarian serous cystadenocarcinoma, which may lead to poor prognosis via its targeting of EGFR. Thus, LDHAP5 may serve as a new therapeutic target, and improve the prognosis of patients with ovarian cancer in the future.