Identification of aberrantly methylated differentially expressed genes in prostate carcinoma using integrated bioinformatics

Background: Methylation plays a key role in the aetiology and pathogenesis of prostate cancer (PCa). This study aimed to identify aberrantly methylated differentially expressed genes (DEGs) and pathways in PCa and explore the underlying mechanisms of tumourigenesis.
Methods: Expression profile (GSE29079) and methylation profile (GSE76938) datasets were obtained from the Gene Expression Omnibus (GEO). We used R 3.4.4 software to assess aberrantly methylated DEGs. The Cancer Genome Atlas (TCGA) RNA sequencing and Illumina HumanMethylation450 DNA methylation data were used to validate the screened genes. Functional enrichment analysis of the screened genes was performed, and a protein–protein interaction (PPI) network was constructed using the Search Tool for the Retrieval of Interacting Genes (STRING). The results were visualized in Cytoscape. After confirmation using TCGA, cBioPortal was used to examine alterations in genes of interest. Then, protein localization in PCa cells was observed using immunohistochemistry.
Results: Overall, 536 hypomethylated upregulated genes were identified that were enriched in biological processes such as negative regulation of transcription, osteoblast differentiation, intracellular signal transduction, and the Wnt signalling pathway. Pathway enrichment showed significant changes in factors involved in AMPK signalling, cancer, and adherens junction pathways. The hub oncogenes were AKT1, PRDM10, and FASN. Additionally, 322 hypermethylated downregulated genes were identified that demonstrated enrichment in biological processes including positive regulation of the MAPK cascade, muscle contraction, ageing, and signal transduction. Pathway analysis indicated enrichment in arrhythmogenic right ventricular cardiomyopathy (ARVC), focal adhesion, dilated cardiomyopathy, and PI3K-AKT signalling. The hub tumour suppressor gene was FLNA. Immunohistochemistry showed that AKT1, FASN, and FLNA were mainly expressed in PCa cell cytoplasm, while PRDM10 was mainly expressed in nuclei.
Conclusions: Our results identify numerous novel genetic and epigenetic regulatory networks and offer molecular evidence crucial to understanding the pathogenesis of PCa. Aberrantly methylated hub genes, including AKT1, PRDM10, FASN, and FLNA, can be used as biomarkers for accurate PCa diagnosis and treatment. Our study suggests that AKT1, PRDM10, and FASN may be tumour promoters and that FLNA may be a tumour suppressor in PCa. We hope these findings will draw more attention to these hub genes in future cancer studies.


Background
Prostate cancer (PCa) is the second most common malignant tumour in males, and it is the fifth leading cause of cancer mortality [1]. Due to prostate-specific antigen (PSA) screening and advanced biopsy techniques, early-stage PCa patients often have a good prognosis after comprehensive treatment. However, PCa is a latent disease and can occur as an asymptomatic tumour in 20- to 30-year-old men [2]. The disease becomes symptomatic in the advanced stage, at which point fewer effective treatment options are available than at earlier stages [1]. Consequently, the overall survival (OS) of patients with advanced PCa is significantly diminished [3,4]. Therefore, new specific biomarkers for early PCa detection are urgently needed.
In recent years, tumour epigenetic modifications, acknowledged as inherited modifications in gene expression, including DNA methylation, histone acetylation, and noncoding RNA-related modifications, have garnered significant research interest [5]. As the main epigenetic modification, DNA methylation has been extensively studied with respect to angiogenesis, apoptosis, cell cycle regulation, and DNA damage repair [6,7]. Abnormal methylation, including hypomethylation of oncogenes and hypermethylation of tumour suppressor genes, is intimately involved in tumour pathogenesis and is significantly correlated with patient survival in many cancers [8,9].
Genetic testing based on microarray and sequencing platforms has emerged as a promising and effective tool to screen significant genetic or epigenetic changes in carcinogenesis and to identify biomarkers useful in diagnosis and prognosis determination [10]. A number of differentially expressed genes (DEGs) and differentially methylated genes (DMGs) have been identified in PCa through microarray analysis [11,12]. Although some studies have focused on specific genes with aberrant DNA hypermethylation or hypomethylation in PCa, an integrated analysis of gene expression, methylation, and signalling pathway interactions has not been performed.
In the present study, we assessed the interaction network of DEGs and DMGs along with interrelated signalling pathways in PCa by analysing gene expression microarray data (GSE29079), gene methylation microarray data (GSE76938), oncogenes, and tumour suppressor genes (TSGs) using bioinformatic tools; such an analysis has not been reported in previous research. We then used a dataset from The Cancer Genome Atlas (TCGA) as a validation cohort for our findings. We aimed to provide novel insights into the biological characteristics and pathways of DEGs/DMGs in PCa and to identify putative biomarkers important for the development and progression of PCa. Target genes such as AKT1, PRDM10, FASN, and FLNA may play essential roles in the diagnosis and treatment of PCa.

Microarray data
In the current study, a gene expression dataset (GSE29079) and a gene methylation dataset (GSE76938) were obtained from the Gene Expression Omnibus (GEO).

Data processing
The expression and methylation data were analysed with R 3.4.4 software (https://www.r-project.org/). To identify the DEGs, we used P < 0.05 and |t| > 2 as cut-off criteria. For the DMGs, we used FDR < 0.05 and β > 0.2 as cut-off values. We obtained an oncogene list from the ONGene database (http://ongene.bioinfo-minzhao.org/) and a tumour suppressor gene (TSG) list from the TSGene database (https://bioinfo.uth.edu/TSGene/index.html). Hypomethylated highly expressed genes were then identified as the overlap between hypomethylated and upregulated genes. Similarly, hypermethylated genes with low expression were identified as the overlap between hypermethylated and downregulated genes. We also used the VennDiagram package in R to identify overlapping DEGs, DMGs, oncogenes, and TSGs.
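The filtering and overlap steps described above can be sketched as follows. This is a minimal illustration in Python rather than the R workflow the study used; the gene names and statistics are invented, the sign convention (positive t meaning higher expression in tumour) is an assumption, and the β cut-off is interpreted here as a tumour-minus-normal difference in methylation β value, which the text does not state explicitly.

```python
# Hypothetical per-gene statistics; all values are made up for illustration.
deg_stats = {
    "GENE_A": {"p": 0.001, "t": 4.2},
    "GENE_B": {"p": 0.030, "t": -3.1},
    "GENE_C": {"p": 0.200, "t": 2.5},   # fails the P cut-off
    "GENE_D": {"p": 0.010, "t": 1.5},   # fails the |t| cut-off
}
dmg_stats = {
    "GENE_A": {"fdr": 0.01, "delta_beta": -0.35},
    "GENE_B": {"fdr": 0.02, "delta_beta": 0.40},
    "GENE_C": {"fdr": 0.01, "delta_beta": -0.30},
    "GENE_D": {"fdr": 0.04, "delta_beta": 0.10},  # fails the beta cut-off
}
oncogenes = {"GENE_A"}  # stand-in for the ONGene list
tsgs = {"GENE_B"}       # stand-in for the TSGene list


def split_degs(stats, p_cut=0.05, t_cut=2.0):
    """Return (upregulated, downregulated) gene sets using P < p_cut and |t| > t_cut."""
    up = {g for g, s in stats.items() if s["p"] < p_cut and s["t"] > t_cut}
    down = {g for g, s in stats.items() if s["p"] < p_cut and s["t"] < -t_cut}
    return up, down


def split_dmgs(stats, fdr_cut=0.05, beta_cut=0.2):
    """Return (hypermethylated, hypomethylated) gene sets.

    Assumption: delta_beta is tumour minus normal, and the cut-off
    applies to its absolute value.
    """
    hyper = {g for g, s in stats.items()
             if s["fdr"] < fdr_cut and s["delta_beta"] > beta_cut}
    hypo = {g for g, s in stats.items()
            if s["fdr"] < fdr_cut and s["delta_beta"] < -beta_cut}
    return hyper, hypo


up, down = split_degs(deg_stats)
hyper, hypo = split_dmgs(dmg_stats)

# Overlaps corresponding to the four-way comparison in the text.
hypo_up = hypo & up
hyper_down = hyper & down
hypo_up_oncogenes = hypo_up & oncogenes
hyper_down_tsgs = hyper_down & tsgs
```

Plain set intersection is enough here because each gene is assigned to at most one direction per data type; a Venn diagram is only a visual summary of the same intersections.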

Functional and pathway enrichment analysis
The Database for Annotation, Visualization, and Integrated Discovery (DAVID, https://david.ncifcrf.gov/) was used to perform Gene Ontology (GO) enrichment analysis. DAVID is an online tool for systematic and integrative annotation and enrichment analysis that can be used to reveal biological meaning related to large gene lists [13]. GO analysis for the cellular component (CC), biological process (BP), and molecular function (MF) categories [14] and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis [15] were performed for the selected genes (hypomethylated genes with high expression and hypermethylated genes with low expression) using DAVID. A p value < 0.01 was considered statistically significant.
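To make the idea of term enrichment concrete, the sketch below computes a plain one-sided hypergeometric p value for a single annotation term. It is not a reimplementation of DAVID, which reports its own modified Fisher exact statistic; the function name and the example counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(list_hits, list_size, background_hits, background_size):
    """One-sided p value P(X >= list_hits) for over-representation.

    list_size genes are drawn from background_size genes, of which
    background_hits carry the annotation term of interest.
    """
    max_hits = min(list_size, background_hits)
    total = comb(background_size, list_size)
    tail = sum(
        comb(background_hits, k)
        * comb(background_size - background_hits, list_size - k)
        for k in range(list_hits, max_hits + 1)
    )
    return tail / total


# Hypothetical counts: 2 of 2 listed genes carry a term that annotates
# 3 of 10 background genes.
p_value = hypergeom_enrichment_p(2, 2, 3, 10)
is_significant = p_value < 0.01  # cut-off taken from the text
```

In practice the test is repeated for every term, so the resulting p values are usually interpreted alongside a multiple-testing correction.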

Protein-protein interaction (PPI) network generation and module analysis
The Search Tool for the Retrieval of Interacting Genes (STRING) is an online database used to predict PPIs, which are essential for interpreting the molecular mechanisms of key cellular activities in carcinogenesis. In this study, we used the STRING database to build a PPI network of hypomethylated genes with high expression and hypermethylated genes with low expression. The cut-off standard was defined as an interaction score of 0.4. The target hub genes used for further analysis had to meet the following 2 criteria: (i) they were oncogenes/TSGs; and (ii) they were in the top 30 genes according to 5 cytoHubba ranking methods in Cytoscape. Subsequently, the PPI network was visualized in Cytoscape, and the Molecular Complex Detection (MCODE) algorithm in Cytoscape was used to screen modules. An MCODE score > 3 and a node number > 5 were taken as the criteria to define a module.
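The hub-selection logic can be illustrated with a small sketch. It assumes that edges at or above the 0.4 score are kept and that a hub gene must appear in the top list of every ranking method; the text does not spell out either detail. Only node degree is computed here as one example ranking, and the edge list is invented; this is not the cytoHubba or MCODE implementation.

```python
from collections import defaultdict

# Hypothetical edges: (protein_a, protein_b, interaction score on a 0-1 scale).
edges = [
    ("P1", "P2", 0.90),
    ("P1", "P3", 0.55),
    ("P1", "P4", 0.41),
    ("P2", "P3", 0.39),  # below the cut-off, dropped
    ("P4", "P5", 0.70),
]


def build_network(edge_list, min_score=0.4):
    """Build an undirected adjacency map from edges with score >= min_score."""
    adjacency = defaultdict(set)
    for a, b, score in edge_list:
        if score >= min_score:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def top_by_degree(adjacency, n):
    """Rank nodes by degree (ties broken alphabetically) and return the top n."""
    ranked = sorted(adjacency, key=lambda node: (-len(adjacency[node]), node))
    return ranked[:n]


def select_hubs(rankings, cancer_genes):
    """Keep genes found in every ranking's top list and in the oncogene/TSG set."""
    shared = set.intersection(*(set(r) for r in rankings))
    return shared & cancer_genes


network = build_network(edges)
degree_top2 = top_by_degree(network, 2)

# A second, hypothetical ranking stands in for the other ranking methods.
other_ranking = ["P4", "P1", "P2"]
hubs = select_hubs([degree_top2, other_ranking], cancer_genes={"P1", "P9"})
```

Requiring agreement across several rankings makes the hub list less sensitive to the quirks of any single centrality measure, at the cost of returning fewer genes.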

Validation of the target genes in TCGA
To confirm the results, a dataset was downloaded from TCGA to validate the methylation and expression levels of the genes of interest. TCGA includes comprehensive, multi-dimensional maps of key genomic changes in various types of cancers. Additionally, the translational levels of the hub genes were validated using the Human Protein Atlas (HPA) database. The cBio Cancer Genomics Portal, an open platform for analysing large-scale cancer genomics datasets for various cancers, was used to explore the genetic alterations connected to the hub genes and to

Identification of aberrantly methylated DEGs in PCa
Data from each microarray were separately analysed in R to obtain the DEGs or DMGs. A total of 6208 DEGs were obtained from the microarray data, including 3503 upregulated genes and 2705 downregulated genes. We identified 2382 hypermethylated and 4120 hypomethylated genes in a comparative analysis of normal tissue and tumour samples in GSE76938. We then overlapped the aberrantly methylated genes and DEGs and identified a common list of 536 hypomethylated genes expressed at high levels and 322 hypermethylated genes expressed at low levels. To further explore aberrantly methylated DEGs, we overlapped the hypomethylated highly expressed genes with oncogenes and identified 33 hypomethylated highly expressed oncogenes. These findings suggest that aberrant methylation contributes to the high expression of oncogenes in PCa and subsequently promotes prostate tumourigenesis. Forty-two hypermethylated TSGs with low expression were identified by overlapping hypermethylated genes with low expression and TSGs, suggesting that hypermethylation results in the inhibition of TSGs in PCa, thereby promoting prostate tumourigenesis (Fig. 1). A representative heat map of GSE29079 (with the top 50 DEGs) is shown in Fig. 2. The top 50 DMGs between PCa tissue and normal tissue are represented in the heat map shown in Fig. 3.

GO functional enrichment analysis
GO enrichment analysis was conducted using DAVID, and the results are shown in Table 1. For hypomethylated highly expressed genes, the terms enriched in the BP category included negative regulation of transcription from the RNA polymerase II promoter, osteoblast differentiation, intracellular signal transduction, the Wnt signalling pathway, and actin cytoskeleton organization. The GO cellular component category revealed enrichment in the nucleoplasm, cytosol, nuclear body, cytoplasm, and Golgi cisterna membrane. In addition, the molecular function category showed enrichment for factors involved in transcriptional regulation of DNA binding, ligase activity, and GTPase activator and transcription coactivator activity. Hypermethylated genes with low expression showed enrichment in the BP category in processes such as positive regulation of MAPK cascade and phosphatidylinositol 3-kinase signalling, muscle contraction, ageing, and signal transduction. The enriched terms in the cellular component category mainly included focal adhesion, cell surface, sarcolemma, and extracellular exosome and space. Additionally, the enriched molecular functions were focused on actin, protein, PDZ domain binding, glycosaminoglycan binding, and transmembrane receptor protein tyrosine kinase activity.

KEGG pathway analysis
The results of the KEGG pathway enrichment analysis implied that hypomethylated highly expressed genes were significantly enriched in AMPK signalling, cancer, and adherens junction pathways. Hypermethylated genes expressed at low levels demonstrated enrichment in arrhythmogenic right ventricular cardiomyopathy (ARVC), focal adhesion, hypertrophic cardiomyopathy (HCM), dilated cardiomyopathy, and PI3K-Akt signalling pathways (Table 2).

PPI network construction and cytoHubba analysis
STRING was used to construct the PPI networks (Table 3).

Module analysis
Overall, 7 modules in the network of hypomethylated genes with high expression and 3 modules in the network of hypermethylated genes with low expression were statistically significant (Figs. 8 and 9). The GO terms and KEGG pathways were then analysed (Table 4). The results of the pathway enrichment analysis implied that hypomethylated highly expressed genes were significantly enriched in pathways associated with ubiquitin-mediated proteolysis, GABAergic synapses, the cell cycle, endocytosis, purine metabolism, focal adhesion, and biosynthesis of amino acids. Hypermethylated genes with low expression demonstrated enrichment in cancer signalling and Rap1 signalling pathways.

Identification and validation of the six selected genes
We next used TCGA to validate our results. The outcome is summarized in Table 5. We found that methylation and expression statuses were also significantly altered in the TCGA data, consistent with our findings. However, the expression of CCND1 was downregulated in tumour samples compared to normal samples. This finding needs to be confirmed by further experiments. With regard to methylation status, PRKCB was hypomethylated.

Genetic alteration related to the hub genes
We used cBioPortal to explore genetic alterations related to the hub genes. We found that as a group, the hub genes were closely related to OS (Fig. 10a). We also observed a similar trend between hub genes and prognosis, although the relationship was not statistically significant. Figure 10b illustrates a network constructed with our 4 hub genes, their 50 most frequently altered neighbouring genes, and drugs targeting the hub genes (only 3 of the 4 had nodes or were targeted by drugs; the remaining gene, PRDM10, is not shown). Information on the alteration of the hub genes is shown in Fig. 10c, d. We found that these 4 hub genes were altered in 134 (27%) of the 498 sequenced cases/patients (499 total) and that AKT1 and FASN most frequently exhibited alterations (9% each), including amplification and deep deletion. Figure 10e shows the correlations between mRNA expression and DNA methylation for the hub genes in the TCGA Prostate Adenocarcinoma (PRAD) patient dataset; the correlations were negative (except for FLNA, for which there were insufficient data), which is consistent with methylation regulating the mRNA expression of these genes. This finding suggests that methylation plays an important role in the expression of these genes. The results of the validation of the hub genes on a translational level through the HPA database are shown in Fig. 11. Immunohistochemistry showed that AKT1, FASN, and FLNA proteins were mainly expressed in the cytoplasm of PCa cells, while PRDM10 protein was mainly expressed in the nucleus (Fig. 12).
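As an illustration of what a negative methylation-expression correlation means numerically, the sketch below computes a Pearson coefficient on invented values and recomputes the alteration frequency quoted above. The choice of Pearson is an assumption, since the text does not name the coefficient, and the sample values are hypothetical rather than TCGA data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-sample values for one gene: higher methylation (beta)
# paired with lower expression.
methylation_beta = [0.10, 0.20, 0.40, 0.60, 0.80]
expression = [9.0, 8.5, 7.0, 6.2, 5.1]
r = pearson_r(methylation_beta, expression)

# Alteration frequency reported in the text: 134 of 498 sequenced cases.
altered_percent = 100 * 134 / 498
```

A negative coefficient shows that the two measurements move in opposite directions across samples; on its own it does not establish that methylation causes the change in expression.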

Discussion
The initiation and development of PCa is a complex and multistage process regulated by genetic and epigenetic changes in pro-tumourigenic oncogenes and anti-tumourigenic TSGs. As with many other tumours, aberrant changes in epigenetic modifications, such as acetylation, phosphorylation, and in particular, DNA methylation, have been detected in PCa [16,17]. Identifying novel biomarkers in PCa will contribute to improving the diagnosis, treatment, and prognostic assessment of PCa patients. The GEO database is a free repository of microarray and next-generation sequencing analyses that was used to obtain expression profile (GSE29079) and methylation profile (GSE76938) datasets. R software can be used effectively for the analysis of genes in different groups of samples that are differentially expressed under various experimental conditions. In our study, 3 upregulated hypomethylated oncogenes and 1 downregulated hypermethylated TSG were identified. Functional enrichment analysis revealed that aberrant methylation affected certain pathways and hub genes. These results may offer novel insights into PCa pathogenesis. DAVID analysis of upregulated hypomethylated genes demonstrated enrichment of the genes in biological processes such as regulation of RNA polymerase II-driven transcription, osteoblast differentiation, intracellular signal transduction, the Wnt signalling pathway, and actin cytoskeleton organization. This finding is consistent with previous studies that have shown that PCa cells stimulate the differentiation of pre-osteoplastic cells through regulators of bone metabolism, thereby facilitating prostate cancer metastasis to bones [18]. Additionally, the Wnt cascade can act as a master regulator by integrating signals from the PI3K/mTOR, MAPK, and AR pathways [19]. The MF category in GO analysis largely showed enrichments in protein and transcription regulatory region DNA binding, ligase activity, and GTPase activator and transcription coactivator activity. 
Previous studies have reported that GTPase activators regulate intercellular junctions and are disrupted during tumourigenesis [20]. KEGG pathway enrichment analysis suggested significant enrichment in AMPK signalling, cancer, and adherens junction pathways, consistent with the fact that the activated AMPK pathway is involved in the growth and survival of human PCa [21].
Downregulated hypermethylated genes in PCa were enriched for positive regulation of MAPK cascade and phosphatidylinositol 3-kinase signalling, muscle contraction, ageing, and signal transduction in the BP category. The MF category in GO analysis indicated involvement in actin binding, protein binding, PDZ domain binding, glycosaminoglycan binding, and transmembrane receptor protein tyrosine kinase activity. It has been shown that actin-binding proteins can serve as key suppressors of cell migration and micrometastatic dissemination in PCa [22]. Additionally, tyrosine receptor kinase is an essential regulator of PCa proliferation and tumour growth [23]. KEGG pathway analysis revealed enrichment in ARVC, focal adhesion, HCM, dilated cardiomyopathy, and PI3K-AKT signalling pathways. The role of focal adhesion kinase (FAK) signalling in tumourigenesis and tumour progression has been extensively researched and has led to the development of FAK tyrosine kinase inhibitors as potential anticancer drugs [20]. Activation of the PI3K/AKT pathway has also been shown to play a major role in the aggressive nature of many prostate cancers [24]. Understanding the biological processes and signalling pathways in which aberrantly methylated DEGs are involved can help illuminate the pathogenesis of PCa and identify new therapeutic targets.
In the PPI network generated with Cytoscape, significantly more interactions than expected were observed for the aberrantly methylated DEGs. Importantly, a number of upregulated hypomethylated genes appeared to be involved in the tumourigenesis and tumour progression of PCa. We visualized the networks in Cytoscape, identified hub genes using cytoHubba, and validated the identified oncogenes AKT1, PRDM10, and FASN using  the TCGA PRAD patient dataset. The serine/threonine kinase Akt, with 3 isoforms (Akt1, Akt2, and Akt3), plays a critical role in regulating diverse cellular functions, including cell growth, proliferation, survival, transcription, and protein synthesis [25,26]. Akt1 expression is frequently elevated in breast and prostate cancers [27,28]. Additionally, Akt1 appears to be robustly involved in the tumourigenesis and invasion of cancer cells [29]. Studies have shown that Akt1 is almost completely hypomethylated in bladder cancer tissues [30]. However, the methylation pattern for Akt1 in PCa tissue has not been defined. The rate of Akt1 mutation in PCa was 9%; we hypothesize that the mutations cause the aberrant methylation and/or upregulation of Akt1. Akt1 is the target of the antitumour drug arsenic trioxide, which is currently used for patients with acute promyelocytic leukaemia [31]. Therefore, it is possible that Akt1 may serve as a potential drug target in other tumour types, including PCa. PRDM10 is a poorly studied member of the PRDM family. It lacks enzymatic activity and is believed to function as a transcriptional cofactor by recruiting histone-modifying enzymes to target promoters [32]. PRDM10 may serve as a transcriptional regulator for normal tissue differentiation and play important roles in promoting tumour development [33,34]. In PCa, PRDM10 has been found to be altered in approximately 7% of cases, indicating a similar mechanism may occur in PCa tumourigenesis. 
It has been reported that breast cancer cells show upregulated expression of PRDM10 associated with hypomethylation of the PRDM10 gene, suggesting the involvement of this gene in the proliferation and invasion of breast cancer cells [35]. Similarly, we found that hypomethylation of PRDM10 in PCa led to high expression in PCa samples, which may affect antitumour activity during tumour development. The Fatty Acid Synthase (FASN) protein-coding gene exhibits high expression levels in tumours, including PCa [36]. This gene has been shown to play critical roles in cancer progression and aggressiveness and to be highly associated with poor prognosis, high risk of disease recurrence, and drug resistance [37,38]. In our study, we found that hypomethylation of FASN led to high expression of FASN in PCa, indicating that FASN may function as a promoter of PCa tumourigenesis. FASN was altered in approximately 9% of PCa patients; we hypothesize that the mutations may drive the high expression of FASN in PCa. In addition, the FASN-targeting drug cerulenin can dose-dependently decrease HER2/neu protein levels in breast cancer cells (from a 14% decrease at 1.25 mg/L to a 78% decrease at 10 mg/L), and it has been suggested as a possible anticancer treatment [39]. In this study, we identified 3 oncogenes that were highly expressed in PCa samples compared to normal tissues, which suggested that these genes may play vital roles in PCa occurrence. Aberrant methylation of these 3 oncogenes may lead to their upregulated expression in PCa.
We identified FLNA as a TSG that was hypermethylated and downregulated in PCa. It has been reported that FLNA is required for the regulation of cell migration and invasion [40]. However, emerging evidence suggests that it may also be involved in different tumourigenic processes, such as DNA damage and angiogenesis [41,42]. Downregulation of this gene has been observed in a wide spectrum of human malignancies, including gastric cancers and renal cancers. FLNA has also been shown to be significantly correlated with lymph node metastasis, disease stage, histological grade, and poor OS through promotion of the degradation of MMP-9 [43,44]. The methylation patterns of FLNA in PCa have not been previously described. In our study, we found that FLNA was hypermethylated and downregulated in PCa, suggesting that aberrant methylation of FLNA in PCa may lead to the deregulation of this TSG, thereby impacting tumour development.
Core module analysis of the PPI network for the upregulated hypomethylated genes suggested that [45][46][47]. In addition, in the GABAergic synapse pathway, GABA induces GRP secretion via GABBR1 in neuroendocrine-like cells, which is involved in PCa progression [48]. The cell cycle is a vital cellular process involving DNA replication and translation, and it tends to be deregulated in cancer [49]. However, the roles of purine metabolism and biosynthesis of amino acids in PCa, as well as the impact of aberrant methylation, are unknown. In our study, the identified hypermethylated downregulated genes were enriched in cancer signalling and Rap1 signalling pathways. Cancer signalling pathways are fundamental for cancer development and sustained carcinogenesis. In addition, aberrant Rap1 activation leads to tumour progression, and it may be induced by cytokines such as galanin [50]. The specific manner in which aberrant methylation affects the functional roles of these pathways in PCa development and progression needs to be investigated in the future.
There were several limitations of the present study. First, the study focused on upregulated hypomethylated and downregulated hypermethylated genes. However, contra-regulated genes were not included; these need to be considered in the future. Second, validation of the aberrantly methylated genes was carried out with TCGA data using in silico approaches. Biological experiments will be necessary to validate these findings in the future. Third, our study was limited to only 2 datasets. Therefore, larger sample sizes are needed to validate the findings of this study. Lastly, more experiments, such as qRT-PCR experiments comparing expression in PCa tissues and normal tissues, should be conducted to confirm the target genes. We have collected PCa tissues and normal tissues, and the results of further analyses will be presented in the future.
Fig. 10 Genetic alterations connected to the hub genes. a Relationship between hub genes and OS. b Network constructed with our 4 hub genes, their 50 most frequently altered neighbouring genes, and drugs targeting the hub genes. c, d Alterations of the hub genes. e Correlations between mRNA expression and DNA methylation for the hub genes in the TCGA dataset
Fig. 11 Validation of the hub genes using the Human Protein Atlas (HPA) database

Conclusion
In summary, our results identified a series of aberrantly methylated differentially expressed oncogenes and TSGs and their associated pathways in PCa using integrated bioinformatic analysis of gene expression and gene methylation microarray datasets. These results may contribute to a more comprehensive understanding of the molecular mechanisms underlying the occurrence and development of PCa. The 4 hub genes found, namely, AKT1, PRDM10, FASN, and FLNA, were validated using the TCGA PRAD patient dataset. These genes, when aberrantly methylated, may serve as putative biomarkers for the precise diagnosis and treatment of PCa in the future. In comparison to other studies that have focused on individual datasets, our study analysed multiple datasets to produce more robust results regarding gene expression changes and gene modifications that are important in the development and progression of PCa. Future studies will be aimed at validating the functional significance of the identified hub genes in PCa.