Identification of DNA methylation-driven genes in esophageal squamous cell carcinoma: a study based on The Cancer Genome Atlas
Cancer Cell International volume 19, Article number: 52 (2019)
Aberrant DNA methylation is significantly associated with esophageal squamous cell carcinoma (ESCC). In this study, we aimed to investigate the DNA methylation-driven genes in ESCC by integrative bioinformatics analysis.
DNA methylation and transcriptome profiling data were downloaded from the TCGA database. DNA methylation-driven genes were identified with the MethylMix R package. The DAVID database and ConsensusPathDB were used to perform gene ontology (GO) analysis and pathway analysis, respectively. The survival R package was used to analyze the overall survival associated with methylation-driven genes.
A total of 26 DNA methylation-driven genes were identified by MethylMix; they were enriched in the molecular functions of DNA binding and transcription factor activity. Of these 26 genes, ABCD1, SLC5A10, SPIN3, ZNF69, and ZNF608 were recognized as significant independent prognostic biomarkers. Additionally, a further integrative survival analysis, which combined methylation and gene expression data, identified ABCD1, CCDC8, and FBXO17 as significantly associated with patients’ survival. Multiple aberrant methylation sites were also found to be correlated with gene expression.
In summary, we studied the DNA methylation-driven genes in ESCC by bioinformatics analysis, offering a better understanding of the molecular mechanisms of ESCC and providing potential biomarkers for precision treatment and prognosis detection.
Esophageal carcinoma (EC) is one of the most common malignant tumors of the digestive system. It arises mostly in the esophageal epithelium, and patients show no typical clinical symptoms in the early stage. As a result, more than 80% of EC patients have already progressed to an advanced stage at diagnosis, which worsens their prognosis. Esophageal cancer has two major histological subtypes, esophageal adenocarcinoma (EAC) and esophageal squamous cell carcinoma (ESCC). Of these, ESCC is the predominant subtype and accounts for 80% of all patients. At present, the mechanism of ESCC is still not fully characterized, and the early symptoms are atypical, which makes clinical diagnosis and therapy very difficult. As with other malignancies, the progression of ESCC is a complex process involving multiple factors and multiple gene mutations. Studies have shown that molecular changes in ESCC tissues occur earlier than the clinical features. Therefore, early diagnosis and intervention are important for reducing the incidence of ESCC.
Epigenetic changes have been identified as significant contributors to cancer progression. Abnormal DNA methylation is one of the most important and common epigenetic modifications and plays key roles in regulating genome function. Selective hypermethylation or hypomethylation of genes, which regulates gene expression and shapes specific tissue types during development, is considered a hallmark of many carcinomas. In recent years, studies on methylation and tumors have drawn increasing attention. For instance, Roy et al. analyzed lymph node metastasis in esophageal squamous cell carcinoma and built a comprehensive methylation signature for predicting the prognosis of patients. Genes including NNK, MSH3 and P16 have been reported to be methylated and associated with tumor progression [9,10,11]. Identifying abnormally methylated genes can help to explore the redundancy and instability of the esophageal carcinoma genome and provide a basis for risk prediction and targeted therapy.
Genome-wide DNA methylation arrays and the advent of deep RNA-Seq have contributed significantly to the study of the interaction between methylation and gene expression during tissue carcinogenesis and development. An integrative analysis of mRNA expression and DNA methylation by Kim et al. described the effect of epigenetic changes on a malignant mesothelioma cell line. Furthermore, to identify the mechanisms contributing to oncogenesis, Olivier Gevaert et al. developed a novel computational algorithm called MethylMix to study abnormally methylated genes and predict transcription. As a well-known cancer genome database, The Cancer Genome Atlas (TCGA) provides a large amount of genomic data together with patient information, which allows molecular information to be translated into potential clinical information. In this study, ESCC-related differentially expressed and abnormally methylated genes were identified from the TCGA database, and the expression of the abnormally methylated genes in ESCC patients was clarified. We analyzed RNA-Seq transcriptomes and DNA methylation data of ESCC samples from 99 cases in TCGA. Five candidate genes (ABCD1, SLC5A10, SPIN3, ZNF69, ZNF608) were identified from 26 methylation-driven genes (p < 0.05) and could serve as independent prognostic biomarkers. Additionally, ABCD1, CCDC8 and FBXO17 were found to be significantly correlated with prognosis in a further integrative survival analysis. We also found significant correlations between methylation sites and gene expression.
Data acquisition and preprocessing
In this study, all data were obtained from the TCGA data portal (https://portal.gdc.cancer.gov/), accessed on 8 November 2018. The DNA methylation data were generated on the Illumina Infinium HumanMethylation450 platform, and beta values, ranging from 0 to 1, were used to quantify the level of DNA methylation. The DNA methylation data included 3 normal samples and 96 ESCC samples. To analyze gene expression in ESCC, we used the transcriptome profiling data, excluding isoform expression and miRNA expression quantification. R software and packages were then used to normalize and analyze the downloaded data to obtain differentially expressed genes (DEGs) and differentially methylated genes (DMGs). Furthermore, a total of 96 ESCC patients had recorded clinical data and were included in the subsequent survival analysis (Additional file 1: Table S1). The TCGA data are open and publicly available.
The DEGs and DMGs were integrated for analysis with the R package MethylMix. MethylMix is a program that automatically analyzes the correlation between methylation events and gene expression. Three datasets are required as input: normal DNA methylation data, cancer DNA methylation data and matched gene expression data. MethylMix then identifies cancer-specific hypermethylated and hypomethylated genes whose methylation is predictive of transcription (termed transcriptionally predictive genes) and computes the correlation between gene methylation and the expression of the related genes. The algorithm uses a Wilcoxon rank sum test. The final output of MethylMix is the set of genes that are both transcriptionally predictive and differentially methylated. Additionally, the differential methylation (DM) value, where a negative value signifies hypomethylation and a positive value signifies hypermethylation, can be used in subsequent analysis.
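The sign convention of the DM value can be illustrated with a minimal sketch. It assumes a single methylation state per gene, so the DM value reduces to the mean cancer beta value minus the mean normal beta value; MethylMix itself fits a mixture model and reports a DM value per mixture component. The function names and sample values are hypothetical, and the rank-sum helper returns only the U statistic, not a p value.

```python
from statistics import mean


def dm_value(cancer_betas, normal_betas):
    """Simplified DM value: mean cancer beta minus mean normal beta.

    Assumption: one methylation state per gene (no mixture model).
    """
    return mean(cancer_betas) - mean(normal_betas)


def methylation_state(dm):
    """Positive DM means hypermethylation, negative DM means hypomethylation."""
    if dm > 0:
        return "hypermethylated"
    if dm < 0:
        return "hypomethylated"
    return "unchanged"


def rank_sum_u(x, y):
    """Mann-Whitney U statistic (equivalent to the Wilcoxon rank-sum statistic).

    Counts pairs (a, b) with a > b; ties count as one half.
    """
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


# Hypothetical beta values for one gene.
cancer = [0.8, 0.7, 0.9]
normal = [0.2, 0.3, 0.1]
dm = dm_value(cancer, normal)
print(methylation_state(dm), rank_sum_u(cancer, normal))
```

With only three normal samples, as in this dataset, a rank-based comparison has very limited resolution, which is one reason the correlation with expression is used as an additional filter.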
Methylation-driven genes functional enrichment and pathway analysis
Gene ontology (GO) analysis was conducted on the methylation-driven genes identified from the methylation and expression data using the DAVID database. DAVID provides integrative and systematic annotation tools for unraveling the biological meaning of genes. GO analysis covers molecular function, biological process and cellular component. We used GOplot to visualize the results.
Pathway analysis of the methylation-driven genes was conducted with ConsensusPathDB, a functional molecular interaction database that integrates information on genetic interactions, signaling, protein interactions, drug-target interactions, metabolism and gene regulation in humans. Over-representation analysis was based on neighbourhood entity sets or biochemical pathways, and the pathway analysis was performed on the input gene list. The lists of hypomethylated and hypermethylated genes were analyzed together. We used a p value cutoff of 0.05 and the default minimum overlap setting.
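Over-representation analysis is commonly computed as the upper tail of a hypergeometric distribution. The sketch below illustrates that calculation under this assumption; it is not taken from ConsensusPathDB, and the function name and the example counts (background size, pathway size, overlap) are hypothetical.

```python
from math import comb


def ora_p_value(n_universe, n_pathway, n_list, n_overlap):
    """Probability of seeing at least n_overlap pathway genes in the input list.

    n_universe: number of background genes
    n_pathway:  number of background genes annotated to the pathway
    n_list:     number of genes in the input list
    n_overlap:  number of input genes annotated to the pathway
    """
    upper = min(n_pathway, n_list)
    total = comb(n_universe, n_list)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_list - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 26 input genes, 3 of them in a 100-gene pathway,
# against a background of 20000 genes.
p = ora_p_value(20000, 100, 26, 3)
print(p < 0.05)
```

A pathway would then be reported when this p value falls below the 0.05 cutoff and the overlap meets the minimum overlap setting.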
Kaplan–Meier curves were used to assess the relationship between methylation-driven genes and survival in ESCC. The independent prognostic value of the methylation-driven genes was screened with the survival R package. The p value was obtained using the log-rank test, and p < 0.05 was considered statistically significant.
To further identify the key genes among the methylation-driven genes, we combined the abnormal methylation data with the corresponding gene expression data and performed a joint survival analysis with the survival R package. In addition, for the key genes obtained in this way, we merged the relevant methylation sites with the corresponding gene expression data to identify the correlation between gene expression and the methylation sites of the key genes.
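The two survival quantities used here can be sketched in plain Python. The example below is an illustration of the Kaplan–Meier estimator and the two-group log-rank test, not the code of the survival R package; the function names and the follow-up times are hypothetical, and an event flag of 1 denotes death while 0 denotes censoring.

```python
from math import erfc, sqrt


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square, p value) with 1 df."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    obs_minus_exp = 0.0
    var = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            var += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / var if var > 0 else 0.0
    # Upper tail of a chi-square distribution with one degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))


# Hypothetical follow-up times (months) for high- and low-methylation groups.
high = ([5, 8, 12, 20], [1, 1, 1, 0])
low = ([15, 25, 30, 40], [1, 0, 1, 0])
print(kaplan_meier(*high))
print(logrank_test(high[0], high[1], low[0], low[1]))
```

In practice, patients would be split into two groups per gene (for example by methylation level), and the log-rank p value would be compared against the 0.05 threshold.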
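The site-level analysis amounts to correlating, across samples, the beta value of one methylation site with the expression of its gene. The sketch below assumes a Pearson correlation coefficient, which the text does not specify; the function name and the sample values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical per-sample values: beta value of one CpG site and the
# expression level of the corresponding gene.
site_beta = [0.15, 0.30, 0.45, 0.60, 0.80]
expression = [9.1, 8.4, 6.9, 5.2, 4.0]
r = pearson_r(site_beta, expression)
print(r < 0)  # a negative r means higher methylation goes with lower expression
```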
Identification of methylation-driven genes in ESCC
To study methylation-driven genes, a total of 3 normal samples and 96 ESCC samples with methylation data from TCGA were included in our study. First, we used the limma software package to filter DMGs (p < 0.05, |logFC| ≥ 1), and 447 hypermethylated genes and 520 hypomethylated genes were identified (Fig. 1). Second, the edgeR R package was used to identify the DEGs in ESCC, and the DEGs and DMGs were merged. Third, as required by the MethylMix R package, we recombined the DEGs and DMGs and divided them into a cancer methylation set, a normal methylation set and a cancer gene expression set. P < 0.05 and cor < −0.3 were adopted as the criteria for screening methylation-driven genes. Finally, 26 genes were selected, and we used R software to visualize the mixture model and the correlation between gene expression and degree of methylation (Table 1). Among them, 4 genes are shown in Figs. 2 and 3, while the rest are shown in Additional file 2: Figure S1 and Additional file 3: Figure S2. Furthermore, there were no significant differences between these four genes (p > 0.05).
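The two screening steps use simple threshold rules, sketched below. This is an illustration of the stated cutoffs only, not the limma or MethylMix code; the function names, gene labels and values are hypothetical.

```python
def classify_dmgs(records, p_cut=0.05, lfc_cut=1.0):
    """Split (gene, logFC, p) records into hyper- and hypomethylated genes.

    A gene passes when p < p_cut and |logFC| >= lfc_cut.
    """
    hyper, hypo = [], []
    for gene, log_fc, p in records:
        if p < p_cut and abs(log_fc) >= lfc_cut:
            (hyper if log_fc > 0 else hypo).append(gene)
    return hyper, hypo


def select_drivers(records, p_cut=0.05, cor_cut=-0.3):
    """Keep genes from (gene, cor, p) records with p < p_cut and cor < cor_cut."""
    return [gene for gene, cor, p in records if p < p_cut and cor < cor_cut]


# Hypothetical differential methylation results: (gene, logFC, p value).
dm_results = [
    ("GENE_A", 1.8, 0.001),
    ("GENE_B", -1.2, 0.020),
    ("GENE_C", 0.4, 0.001),   # fails the |logFC| cutoff
    ("GENE_D", -2.0, 0.300),  # fails the p value cutoff
]
# Hypothetical methylation-expression correlations: (gene, cor, p value).
cor_results = [
    ("GENE_A", -0.55, 0.001),
    ("GENE_B", -0.10, 0.010),  # correlation too weak
]
print(classify_dmgs(dm_results))
print(select_drivers(cor_results))
```

The negative correlation cutoff reflects the expectation that, for a methylation-driven gene, higher methylation is accompanied by lower expression.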
Functional enrichment and pathway analysis of methylation-driven genes
To further investigate the function of methylation-driven genes in ESCC, we used GO enrichment analysis in DAVID. Methylation-driven genes were enriched in molecular function (MF) of DNA binding and transcription factor activity. As for cell component (CC), these genes showed enrichment in nonmotile primary cilium. Besides, biological process (BP) indicated enrichment predominantly at regulation of RNA metabolic process (Fig. 4a).
Pathway enrichment analysis revealed that methylation-driven genes were significantly linked to vitamin D receptor pathway, development and heterogeneity of the ILC family, adipogenesis and G alpha (i) signaling events (Fig. 4b).
Prognostic assessment of methylation-driven genes in ESCC
The prognostic value of the 26 methylation-driven genes was assessed with the survival R package, and five genes (ABCD1, SLC5A10, SPIN3, ZNF69, and ZNF608) were found to be independent prognostic indicators for ESCC (Fig. 5). To further investigate the relationship between gene methylation and expression, we combined these data to study their joint influence on patients’ survival. Using p < 0.05 as the significance threshold for the integrative survival analysis, the gene expression and methylation levels of ABCD1, CCDC8 and FBXO17 were significantly correlated with prognosis (Fig. 6). In addition, the methylation sites of the prognosis-related genes were identified from the corresponding TCGA data, and the correlation between gene expression and these sites was analyzed (Table 2). The expression of ABCD1, CCDC8 and FBXO17 was correlated with the methylation level of multiple sites, and all of the correlations were negative (Figs. 7, 8).
Esophageal carcinoma is one of the most common malignant tumors of the digestive system, with high mortality and poor prognosis. Esophageal adenocarcinoma (EAC) and esophageal squamous cell carcinoma (ESCC) are the major histological subtypes of esophageal cancer. Alcohol consumption and tobacco smoking are the two main risk factors for ESCC [19, 20], while obesity, diet and gastroesophageal reflux disease (GERD) are considered risk indicators for EAC [21, 22]. Despite chemoradiotherapy or surgery, the overall survival of patients with esophageal carcinoma remains poor. The mechanism of ESCC is still unclear. Therefore, further study of ESCC and subsequent therapeutic advances are urgently needed. Both epigenetic and genetic aberrations have been implicated in the generation and progression of ESCC. With the rapid development of gene analysis technology, the molecular characteristics of ESCC can be studied further, providing valuable evidence for prognosis and therapeutic molecular targets.
The relationship between epigenetics and tumorigenesis has recently become one of the most active topics in molecular biology. Epigenetic changes differ from genetic changes in that the nucleotide sequence is unaltered; they act through DNA methylation, chromatin remodeling and histone deacetylation. Many studies have shown that DNA methylation is correlated with human ESCC. Aberrant DNA methylation of genes can serve as a noninvasive biomarker for the diagnosis and detection of cancer [24, 25]. Therefore, investigating the epigenetic changes and molecular mechanisms of ESCC progression is important for identifying promising biomarkers and for the early diagnosis and treatment of ESCC. The stability and independence of aberrant DNA methylation analysis make it a feasible approach for identifying prognostic biomarkers. Several reports have shown that aberrant DNA methylation affects genes involved in DNA damage, the cell cycle, and the Wnt and NF-κB signaling pathways, including MGMT, P16, DACH1 and ZNF382. Other studies have shown that methylated FHIT is correlated with poor prognosis in early ESCC. Therefore, bioinformatics analysis of the functional enrichment and prognostic value of aberrantly methylated DNA can offer clinicians promising tools to predict prognosis and treat patients.
In our study, we compared aberrantly methylated genes between normal samples and ESCC patients to identify prognostic biomarkers among methylation-driven genes. A model-based tool (MethylMix) was used to identify genes whose abnormal methylation correlates with gene expression, and 26 methylation-driven genes were found. To study the functional roles of these ESCC methylation-driven genes, gene ontology (GO) and pathway analyses were performed. According to the DAVID database, methylation-driven genes in ESCC were enriched in the molecular functions (MF) of DNA binding and transcription factor activity. For cell component (CC), these genes showed enrichment in the nonmotile primary cilium. For biological process (BP), enrichment was found predominantly in the regulation of RNA metabolic process. These functional terms not only show the interaction of the genes at the functional level but also suggest that the aberrant gene function may result from abnormally methylated DNA in different samples.
To further study the relationship between methylation-driven genes and patient outcome, the survival R package was used to analyze the correlation between abnormal DNA methylation and patient survival. Five candidate genes (ABCD1, SLC5A10, SPIN3, ZNF69, ZNF608) were identified from the 26 methylation-driven genes (p < 0.05), and they might serve as independent prognostic factors for ESCC. However, analyzing aberrant methylation data against patients’ survival alone is not comprehensive. We therefore combined the abnormal methylation data and the corresponding gene expression data with patients’ survival in an integrative survival analysis. As a result, ABCD1, CCDC8 and FBXO17 were found to be significantly correlated with prognosis. Previous studies have suggested that CCDC8 (coiled-coil domain containing 8) is frequently epigenetically dysregulated in renal cell carcinoma and in breast carcinomas that metastasize to the brain [31, 32]. FBXO17 (F-box protein 17) has also been found to be hypermethylated in salivary gland adenoid cystic carcinoma. For these specific genes, we further studied the correlation between the expression level and the methylation level of individual sites, and found that multiple sites were negatively correlated with the gene expression level. This result may be due to aberrant methylation of these sites leading to dysregulated expression, which affects the generation and progression of cancer and the prognosis of patients.
Growing evidence from bioinformatics analyses has demonstrated that aberrant DNA methylation is associated with tumor generation and progression. For instance, Gao et al. built a prognostic risk model for evaluating the prognosis of LUSC patients and studied the abnormally methylated sites of key genes associated with poor prognosis. Fan et al. used the GEO database to study aberrantly methylated genes as biomarkers for hepatocellular cancer. At present, the abnormally methylated genes in ESCC have not yet been systematically studied. Compared with previous studies, we used MethylMix, which provided a more comprehensive analysis for screening methylation-driven genes in ESCC. To move the results toward practical application, we studied the methylation-driven genes that were significantly associated with patients’ survival. Furthermore, the correlation between abnormally methylated sites and gene expression was analyzed to provide more precise targets for further experimental validation. Although we have performed a comprehensive study of epigenetic changes in ESCC, experimental validation is still needed to verify the specificity and sensitivity of these findings.
In summary, we identified DNA methylation-driven genes involved in ESCC generation and progression using MethylMix. On this basis, we further studied the methylation-driven genes related to patients’ survival. As a result, ABCD1, SLC5A10, SPIN3, ZNF69 and ZNF608 were identified and can serve as independent prognostic factors for ESCC. ABCD1, CCDC8 and FBXO17 were selected by the integrative survival analysis, and multiple methylated sites were correlated with gene expression. These aberrantly methylated genes may help to reveal the mechanisms of ESCC generation and progression and can serve as promising biomarkers for diagnosis, treatment and prognosis. Further characterization of the DNA methylation changes can help to clarify these mechanisms and improve existing treatments.
TCGA: The Cancer Genome Atlas
ESCC: esophageal squamous cell carcinoma
DEG: differentially expressed gene
DMG: differentially methylated gene
GEO: gene expression omnibus
Testa U, Castelli G, Pelosi E. Esophageal cancer: genomic and molecular characterization, stem cell compartment and clonal evolution. Medicines. 2017;4(3):67.
Ohashi S, Miyamoto S, Kikuchi O, Goto T, Amanuma Y, Muto M. Recent advances from basic and clinical studies of esophageal squamous cell carcinoma. Gastroenterology. 2015;149(7):1700–15.
Huang FL, Yu SJ. Esophageal cancer: risk factors, genetic association, and treatment. Asian J Surg. 2016;41(3):210–5.
Kuwano H, Nishimura Y, Oyama T, Kato H, Kitagawa Y, Kusano M, Shimada H, Takiuchi H, Toh Y, Doki Y. Guidelines for diagnosis and treatment of carcinoma of the esophagus April 2012 edited by the Japan Esophageal Society. Esophagus. 2015;12(1):1–30.
Jones PA, Baylin SB. The epigenomics of cancer. Cell. 2007;128(4):683–92.
Shi B, Thomas AJ, Benninghoff AD, Sessions BR, Meng Q, Parasar P, Rutigliano HM, White KL, Davies CJ. Genetic and epigenetic regulation of major histocompatibility complex class I gene expression in bovine trophoblast cells. Am J Reprod Immunol. 2018;79(1):e12799.
Sandoval J, Esteller M. Cancer epigenomics: beyond genomics. Curr Opin Genet Dev. 2012;22(1):50–5.
Roy R, Kandimalla R, Sonohara F, Koike M, Kodera Y, Takahashi N, Yamada Y, Goel A. A comprehensive methylation signature identifies lymph node metastasis in esophageal squamous cell carcinoma. Int J Cancer. 2018;144:1160–9.
Vuillemenot BR, Hutt JA, Belinsky SA. Gene promoter hypermethylation in mouse lung tumors. Mol Cancer Res (MCR). 2006;4(4):267.
Vogelsang M, Paccez JD, Schäfer G, Dzobo K, Zerbini LF, Parker MI. Aberrant methylation of the MSH3 promoter and distal enhancer in esophageal cancer patients exposed to first-hand tobacco smoke. J Cancer Res Clin Oncol. 2014;140(11):1825.
Mohammad GS. Associations of risk factors obesity and occupational airborne exposures with CDKN2A/p16 aberrant DNA methylation in esophageal cancer patients. Dis Esophagus. 2010;23(7):597–602.
Kim MC, Kim NY, Seo YR, Kim Y. An integrated analysis of the genome-wide profiles of dna methylation and mrna expression defining the side population of a human malignant mesothelioma cell line. J Cancer. 2016;7(12):1668–79.
Gevaert O, Tibshirani R, Plevritis SK. Pancancer analysis of DNA methylation-driven genes using MethylMix. Genome Biol. 2015;16(1):17.
Katarzyna T, Patrycja C, Maciej W. The Cancer Genome Atlas (TCGA): an immeasurable source of knowledge. Contemp Oncol. 2015;19(1A):68–77.
Olivier G. MethylMix: an R package for identifying DNA methylation-driven genes. Bioinformatics. 2015;31(11):1839–41.
Consortium GO. The Gene Ontology (GO) project in 2006. Nucleic Acids Res. 2006;34(Database issue):322–6.
Kamburov A, Stelzl U, Lehrach H, Herwig R. The ConsensusPathDB interaction database: 2013 update. Nucleic Acids Res. 2013;41(Database issue):793.
Short MW, Burgers KG, Fry VT. Esophageal cancer. Am Fam Physician. 2017;95(1):22–8.
Chien-Hung L, Deng-Chyang W, Jang-Ming L, I-Chen W, Yih-Gang G, Ein-Long K, Hsiao-Ling H, Te-Fu C, Shah-Hwa C, Yi-Pin C. Carcinogenetic impact of alcohol intake on squamous cell carcinoma risk of the oesophagus in relation to tobacco smoking. Eur J Cancer. 2007;43(7):1188–99.
Vaughan TL, Davis S, Kristal A, Thomas DB. Obesity, alcohol, and tobacco as risk factors for cancers of the esophagus and gastric cardia: adenocarcinoma versus squamous cell carcinoma. Cancer Epidemiol Biomarkers Prev. 1995;4(2):85–92.
Veugelers PJ, Porter GA, Guernsey DL, Casson AG. Obesity and lifestyle risk factors for gastroesophageal reflux disease, Barrett esophagus and esophageal adenocarcinoma. Dis Esophagus. 2010;19(5):321–8.
Lagergren J, Bergström R, Lindgren A, Nyrén O. Symptomatic gastroesophageal reflux as a risk factor for esophageal adenocarcinoma. Dig Dis Sci. 2000;45(12):2367–8.
Enzinger PC, Mayer RJ. Esophageal cancer. N Engl J Med. 2003;349(23):2241.
Gloss BS, Goli S. Epigenetic biomarkers in epithelial ovarian cancer. Cancer Lett. 2014;342(2):257–63.
Chen ZY, Zhang JL, Yao HX, Wang PY, Zhu J, Wang W, Wang X, Wan YL, Chen SW, Chen GW. Aberrant methylation of the SPARC gene promoter and its clinical implication in gastric cancer. Sci Rep. 2014;4(4):7035.
Dinardo CD, Luskin MR, Carroll M, Smith C, Harrison J, Pierce S, Kornblau S, Konopleva M, Kadia T, Kantarjian H. Validation of a clinical assay of multi-locus DNA methylation for prognosis of newly diagnosed AML. Am J Hematol. 2016;92(2):E14.
Jia-Jun Z, Hong-Yu L, Di W, Hui Y, Da-Wei S. Abnormal MGMT promoter methylation may contribute to the risk of esophageal cancer: a meta-analysis of cohort studies. Tumor Biol. 2014;35(10):10085–93.
Wu L, Herman JG, Brock MV, Wu K, Mao G, Yan W, Nie Y, Liang H, Zhan Q, Li W. Silencing DACH1 promotes esophageal cancer growth by inhibiting TGF-β signaling. PLoS ONE. 2014;9(4):e95509.
Yingduan C, Hua G, Suk Hang C, Pei L, Yan B, Jisheng L, Gopesh S, Ng MHL, Tatsuo F, Xiushan W. KRAB zinc finger protein ZNF382 is a proapoptotic tumor suppressor that represses multiple oncogenes and is commonly silenced in multiple carcinomas. Cancer Res. 2010;70(16):6516–26.
Eun JuL, Bin LB, Wook KJ, Young Mog S, Hoseok I, Joungho H, Eun Yoon C, Joobae P, Duk-Hwan K. Aberrant methylation of Fragile Histidine Triad gene is associated with poor prognosis in early stage esophageal squamous cell carcinoma. Eur J Cancer. 2006;42(7):972–80.
Pangeni RP, Huen DS, Eagles LW, Johal BK, Pasha D, Hadjistephanou N, Nevell O, Davies CL, Adewumi AI, Khanom H, et al. The GALNT9, BNC1 and CCDC8 genes are frequently epigenetically dysregulated in breast tumours that metastasise to the brain. Clin Epigenet. 2015;7(1):57.
Morris MR, Ricketts CJ, Gentle D, Mcronald F, Carli N, Khalili H, Brown M, Kishida T, Yao M, Banks RE. Genome-wide methylation analysis identifies epigenetically inactivated candidate tumour suppressor genes in renal cell carcinoma. Oncogene. 2011;30(12):1390.
Achim B, Diana B, Weber RS, El-Naggar AK. CpG island methylation profiling in human salivary gland adenoid cystic carcinoma. Cancer. 2011;117(13):2898–909.
Gao C, Zhuang J, Zhou C, Ma K, Zhao M, Liu C, Liu L, Li H, Feng F, Sun C. Prognostic value of aberrantly expressed methylation gene profiles in lung squamous cell carcinoma: a study based on The Cancer Genome Atlas. J Cell Physiol. 2019;234(5):6519–28.
Fan G, Tu Y, Chen C, Sun H, Wan C, Cai X. DNA methylation biomarkers for hepatocellular carcinoma. Cancer Cell Int. 2018;18(1):140.
TL, SCM and DC collected the data; YYW, SCL and YW performed the statistical analysis; YTD, XLL and WXD prepared the figures and tables; TL, XS and WJJ conceived the study. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Availability of data and materials
The authors declare that the data supporting the findings of this study are available in the TCGA database. (https://portal.gdc.cancer.gov/).
Consent for publication
Ethics approval and consent to participate
This work was supported by Shandong Provincial Natural Science Foundation, China (CN) (No. zr2016hm58).
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Lu, T., Chen, D., Wang, Y. et al. Identification of DNA methylation-driven genes in esophageal squamous cell carcinoma: a study based on The Cancer Genome Atlas. Cancer Cell Int 19, 52 (2019). https://doi.org/10.1186/s12935-019-0770-9
- Esophageal squamous cell carcinoma