Identification of candidate genes or microRNAs associated with the lymph node metastasis of SCLC

Background Small cell lung cancer (SCLC) is a highly malignant cancer, and over 70% of patients with SCLC present with the metastatic disease. We aimed to explore some novel differentially expressed genes (DEGs) or microRNAs (miRNAs) associated with the lymph node metastasis of SCLC. Methods The DEGs between the metastasis and cancer groups were identified, and GO functional and KEGG pathway enrichment analyses for these DEGs were implemented. Subsequently, the protein–protein interaction network and subnetwork of module were constructed. Then the regulatory networks based on miRNAs, transcription factors (TFs) and target DEGs were constructed. Ultimately, the survival analysis for DEGs was performed to obtain the DEGs related to the survival of SCLC. Results Here, 186 upregulated (e.g., GSR, HCP5) and 144 downregulated DEGs (e.g., MET, GRM8, and DACH1) were identified between the SCLC patients with lymph node metastasis and without lymph node metastasis. GRM8 was attracted to the G-protein coupled receptor signaling pathway. Besides, miR-126 was identified in the miRNAs-TFs-target regulatory network. GRM8 and DACH1 were all regulated by miR-126. In particular, GSR and HCP5 were correlated with survival of SCLC patients. Conclusion MiR-126, DACH1, GRM8, MET, GSR, and HCP5 were implicated in the lymph node metastasis process of SCLC.


Background
Lung cancer (LC) is a malignant lung tumor characterized by unbounded cell growth in the lung tissues [1]. It is estimated that there are approximately 4,291,600 new cancer cases in China in 2015, and LC is still the main factor for cancer-associated death [2,3]. Currently, LCs are frequently divided into small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC), and 10-15% of LCs are SCLC [4,5]. SCLC, a poorly differentiated and aggressive type of LC, presents an early metastases, fleetly growth rate, and poor prognosis with a lower overall 5-year survival rate [6][7][8]. However, the molecular determinants of SCLC metastasis are unclear. Thus, it is essential to explore the determinants to prevent SCLC metastasis.
Lymph nodes, the central trafficking hubs for recirculating immune cells, are widely present throughout the body [9]. Conceivably, tumor cells could migrate into the lymph nodes and rapidly spread to other organs [10]. Activator protein-1 (AP-1), a transcription factor (TF), regulates the gene expression in response to various stimulus [11]. It has been reported that the overexpression of AP-1 is related to the lymphatic metastasis of LC [12]. Intriguingly, the abnormal expression of genes regulated by AP-1 was also involved in the process of lymphatic metastasis. For example, previous studies found that the overexpressions of urokinase type plasminogen activator (u-PA) and u-PA receptor (u-PAR) were correlated with the lymphatic metastasis of LC [13,14]. In particular, the overexpression of AP-1 contributes to the overexpressions of u-PA and u-PAR. Recently, other studies demonstrated that cyclooxygenase-2 (COX-2) overexpression is well related to the lymphatic metastasis of LC [15,16]. In the promoter regions of COX-2 genes, there is a binding site of AP-1 [17]. Furthermore, in metastatic lymph nodes, the vascular endothelial growth factor C (VEGF-C) overexpression is closely correlated with the lymph node metastasis of NSCLC [18]. These all findings revealed that the abnormal expression of TFs or genes were associated with the lymphatic metastasis of LC, especially for the NSCLC. However, yet little is known about the processes of tumor cell migration and lymph node metastasis in SCLC [19]. Therefore, we aimed to explore some novel differentially expressed genes (DEGs) related to the lymph node metastasis process of SCLC, and the potential mechanism would be elucidated.
The bioinformatics analysis methods were carried out for screening DEGs correlated with the lymph node metastasis process of SCLC. Firstly, the DEGs between the metastasis and cancer groups were screened. Afterwards, Gene Ontology (GO) functional and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses for the DEGs were implemented to obtain the potential functions of DEGs. Afterwards, the protein-protein interaction (PPI) network and subnetwork of module were established. Then the regulatory networks based on microRNA (miRNAs), TFs and target DEGs were constructed. Ultimately, the survival analysis for DEGs was performed to obtain the DEGs related to the survival of SCLC.

Data preprocessing and DEGs screening
We downloaded the raw CEL data and used the Oligo package (ver.1.38.0) (http://bioco nduct or.org/help/searc h/index .html?q=oligo /) [21] in R language to pre-process all the data by performing background correction, conversion of original data and quartile data normalization.
In order to remove the probes that cannot match the gene symbol, probes were annotated by the annotations file. The average value of different probes would serve as the final expression level of gene if different probes were mapped to the same gene symbol. DEGs were screened via the classical Bayesian method provided by limma package (ver. 3.30.13, http://www.bioco nduct or.org/ packa ges/2.9/bioc/html/limma .html) [22]. The setting of thresholds was p value < 0.05 and |log fold change (FC)| ≥ 1.5.

Functional and pathways enrichment analyses
GO [23] and KEGG pathway [24] analyses for DEGs were implemented utilizing the Database for Annotation, Visualization and Integration Discovery (DAVID) (ver. 6.8, https ://david -d.ncifc rf.gov/) [25] tool. The number of enrichment genes (count number) ≥ 2 and p value < 0.05 were regarded as the thresholds criteria.

PPI network and subnetwork of module analyses
The Search Tool for the Retrieval of Interacting Genes (STRING) (ver. 10.5, http://www.strin g-db.org/) [26] database was carried out to analyze the protein-protein interactions of DEGs. The DEGs acted as the input gene set, while the homo sapiens served as species. The PPI score was set as 0.4. Thereafter, the Cytoscape (ver. 3.6.0, http://www.cytos cape.org/) software was applied to construct the PPI network. The significant clustering modules were analyzed using Molecular Complex Detection (MCODE) (ver. 1.5.1, http://apps.cytos cape.org/ apps/MCODE ) [27] plugin. The threshold value was set as score ≥ 5.

MiRNAs-TFs-target regulatory network analyses
The iRegulon (ver. 1.3, http://apps.cytos cape.org/apps/ iRegu lon) [28] plugin in Cytoscape was performed to predict and analyze the interaction pairs of TF-target gene in the PPI network. The parameters were set as follows: the minimum identity value among orthologous genes was set as 0.05, and the maximum false discovery rate on motif similarity was set as 0.001. The higher score of Normalized Enrichment Score (NES) in output results presented the more reliable results. The TF-target interaction pairs whose NES > 4 were selected for further study. Afterwards, the miRNAs-target were predicted on the basis of WebGestal (http://www.webge stalt .org/optio n.php) using the Overrepresentation Enrichment Analysis (ORA) method. The setting of threshold was count number ≥ 2 and p value < 0.05. Ultimately, the miRNAs-TFs-target regulatory network was constructed utilizing the Cytoscape software.

Survival analysis
GSE29016 gene expression profiling data including 20 SCLC samples and clinical data were obtained from the GEO database (http://www.ncbi.nlm.nih.gov/geo/). The common samples between the SCLC samples and clinical data were screened and removed the samples with survival time less than 1 month. Here, a total of 14 samples were enrolled in the present study.
The sample informations were filtered, and the samples were deleted if the survival time was < 1 month. The samples corresponding to DEGs were screened. The survival package (ver. 2.41-3) in R language and median grouping method were used to conduct the survival analysis. Finally, DEGs under p value < 0.05 were selected to generate the survival curve.

Identification of DEGs
We obtained 330 DEGs between the metastasis and cancer groups, of these, 186 were upregulated and 144 were downregulated. The volcano map and heatmap of DEGs were presented in Fig. 1. Here, our results presented that the expressions of MET proto-oncogene, receptor tyrosine kinase (MET), glutamate metabotropic receptor 8 (GRM8), cholinergic receptor nicotinic alpha 5 subunit (CHRNA5), and dachshund family transcription factor 1 (DACH1) were reduced in the SCLC patients with lymph node metastasis compared with those patients without lymph node metastasis. Besides, glutathione-disulfide reductase (GSR), human leukocyte antigen complex P5 (HCP5), and achaete-scute family bHLH transcription factor 1 (ASCL1) were upregulated in the SCLC patients with lymph node metastasis.

PPI network and module analyses
There were 178 nodes and 237 interaction pairs in the PPI network (Fig. 2). Afterwards, one subnetwork module a (score = 5) with 5 nodes and 10 interaction pairs was obtained through the MCODE (score ≥ 5) plugin in Cytoscape software. According to the degree of DEGs in the PPI network, the top 10 DEGs were selected, then the GO-BP analysis for the top 10 DEGs and module a DEGs were implemented. The top 10 DEGs in the PPI network and DEGs in the module a are presented in Table 2. The enriched functions for the DEGs in the PPI network were shown in Table 3, such as homeostatic process (GO_BP; p value = 5.44 × 10 −5 ), regulation of phosphorylation (GO_ BP; p value = 1.53 × 10 −4 ), and regulation of phosphate metabolic process (GO_BP; p value = 1.78 × 10 −4 ). Meanwhile, the enriched functions for module a DEGs were listed in Table 3, such as G-protein coupled receptor protein signaling pathway (GO_BP; p value = 2.14 × 10 −3 ), cell surface receptor linked signal transduction (GO_BP; p value = 9.26 × 10 −3 ), and regulation of inflammatory response (GO_BP; p value = 2.23 × 10 −2 ). Here, our results showed that GRM8 was attracted to the G-protein coupled receptor signaling pathway.

MiRNAs-TFs-target regulatory network analyses
The miRNAs-TFs-target regulatory network was established with 8 TFs, 10 miRNAs and 187 DEGs through the Cytoscape software (Fig. 3). MiR-126 was identified in the miRNAs-TFs-target regulatory network. A total of 11 genes were regulated by miR-126 in our study, such as GRM8 and DACH1 (Table 4).

Survival analysis
Total 164 DEGs have the corresponding sample information. Hence, the 164 DEGs were used for generating the survival curve. There were 2 DEGs correlated with the survival of SCLC, such as GSR and HCP5 (Fig. 4).

Discussion
SCLC is a highly malignant cancer, and over 70% of patients with SCLC present with the metastatic disease [5]. But the molecular determinants of SCLC metastasis are unknown. In the current study, some novel DEGs and miRNA associated with the lymph node metastasis of SCLC were obtained via the comprehensive bioinformatical analyses, such as miR-126, DACH1, GRM8, MET, RSD and HCP5. In SCLC, more than 95% of patients have a smoking history and their 5-year survival rates are under 2% [29]. As known, a variety of addictive compound nicotine was contained in tobacco, which begins with the binding of nicotine to the nicotinic acetylcholine receptor (nAChR). In especial, the nAChR gene cluster encoding the α3, α5 and β4 nAChR subunits such as CHRNA5/A3/B4 was differentially expressed in SCLC [30]. In addition, ASCL1 which is a transcription factor implicated in the pathogenesis of SCLC is overexpressed in SCLC [31]. Similarly, CHRNA5 and ASCL1 were differentially expressed in SCLC with the metastatic disease. Interestingly, ASCL1 might regulate the expression of the clustered nAChR genes. Therefore, positive control genes have validated that the analysis pipeline is doable.
In general, lymph node metastasis of LC is positively related to lymphangiogenesis [32]. Currently, there is no direct proofs present that miR-126 is significantly associated with the lymph node metastasis of SCLC. However, Sasahira et al. found that the downregulated miR-126 was correlated with the induction of lymphangiogenesis in the OSCC [33]. It has been uncovered that miR-126 is downregulated in primary SCLC tumor samples [34]. In addition, miR-126 has an negative effect on the growth and proliferation of SCLC cells [35]. Meanwhile, miR-126 is also a crucial regulator for the vessel development [36]. Here, miR-126 was identified in the miRNAs-TFs-target regulatory network. These findings all indicated that miR-126 is likely to participate in the lymph node metastasis process of SCLC through inducing the lymphangiogenesis. A total of 11 genes were regulated by miR-126 in our study, such as GRM8 and DACH1. DACH1, implicated in the suppression of tumor growth, is downregulated in human malignancies, such as LC [37]. It is reported that the LC invasion and tumor growth can be inhibited by DACH1 through suppressing the CXCL5 signaling [38]. Here, our results presented that DACH1 expression was downregulated in the SCLC patients with lymph node metastasis. Probably, DACH1 decrease inhibits the lymph node metastasis process of SCLC. GRM8, encoded by the GRM8 gene, are a family of G protein-coupled receptors. Here, our results showed that GRM8 was downregulated in the SCLC patients with lymph node metastasis and was attracted to the G-protein coupled receptor signaling pathway. However, the role of GRM8 in the lymph node metastasis process of SCLC remains unclear. Previous studies indicated that signaling pathways controlled by GPCRs facilitate proliferation, cell migration, angiogenesis, and inflammation [39]. Namely, GPCRs are likely to promote the migration of LC cells into the lymph node. Therefore, we speculated that GRM8 was likely to participate in the lymph node metastasis process of SCLC.
The gene expression programs are controlled by a variety of TFs, and its misregulation result in various diseases [40]. Briefly, the transcriptional misregulation can cause thousands of diseases. Here, a total of 6 DEGs were attracted to the pathway of transcriptional misregulation in cancer, such as MET. It has been revealed that MET regulates the remodeling and morphogenesis of tissues, and its dysregulation is implied in the oncogenic signaling and metastasis [41,42]. Here, MET was regulated in the SCLC patients with lymph node metastasis. Thus, MET dysregulation may participate in the  lymph node metastasis process of SCLC through the transcriptional misregulation in cancer pathway. GSR encoded a member of the class-I pyridine nucleotide-disulfide oxidoreductase family, which played a key role in cellular antioxidant defense. GSR reduced oxidized glutathione disulfide (GSSG) to the sulfhydryl form glutathione (GSH). Whereas, GSH played complex roles in cancer, including the protective and pathogenic roles [43]. In a previous study, GSR expression level was significantly increased in tumor tissue from patients with LC [44]. Although there was no direct proofs to indicate GSR association with the lymph node metastasis process of SCLC, the glioblastoma multiforme patients with high GSR expression showed poor survival [45]. Similarly, GSR was up-regulated in the patients with lymph node metastasis process of SCLC and displayed a poor survival results. Therefore, we speculated that GSR might have a central role in the lymph node metastasis process of SCLC.
In addition, HCP5 was significantly down-regulated in patients with ovarian cancer [46]. Similarly, HCP5 was also down-expressed in patients with lung adenocarcinoma. However, Teng et al. reported that HCP5 was upregulated in glioma tissues as well as in U87 and U251 cells [47]. In addition, they found that HCP5 regulated  the glioma cells malignant proliferation through binding to miR-139 by up-regulating RUNX1. Probably, HCP5 expression was different in various cancers. In the present study, HCP5 was upregulated in the lymph node metastasis process of SCLC. Hence, we speculated that GSR and HCP5 may involve in the lymph node metastasis process of SCLC. However, the predicted results cannot be verified by laboratory data due to the limitation of sample extraction. In further studies, we will confirm the expressions of the above discussed DEGs and miRNA once we collected the sufficient samples.

Conclusion
In summary, our results suggested that miR-126 and its target gene DACH1 may implicate in the lymph node metastasis process of SCLC. Additionally, GRM8, MET, RSD and HCP5 were implicated in the lymph node metastasis process of SCLC.  The survival curve for the GSR (a) and HCP5 (b). The horizontal axis represents the survival time (months), and the vertical axis presents the survival rate. The red curve stands for the group of upregulated gene expression (high expression), and the black curve represents the group of downregulated gene expression (low expression). A p value of < 0.05 was condidered statistically significant between upregulated and downregulated gene expression. GSR, glutathione-disulfide reductase; HCP5, human leukocyte antigen complex P5