Effect of hyperoside on cervical cancer cells and transcriptome analysis of differentially expressed genes

Background Hyperoside (Hy) is a plant-derived quercetin 3-d-galactoside that exhibits inhibitory activities on various tumor types. The objective of the current study was to explore Hy effects on cervical cancer cell proliferation, and to perform a transcriptome analysis of differentially expressed genes. Methods Cervical cancer HeLa and C-33A cells were cultured and the effect of Hy treatment was determined using the Cell Counting Kit-8 (CCK-8) assay. After calculating the IC50 of Hy in HeLa and C-33A cells, the more sensitive to Hy treatment cell type was selected for RNA-Seq. Differentially expressed genes (DEGs) were identified by comparing gene expression between the Hy and control groups. Candidate genes were determined through DEG analysis, protein interaction network (PPI) construction, PPI module analysis, transcription factor (TF) prediction, TF-target network construction, and survival analysis. Finally, the key candidate genes were verified by RT-qPCR and western blot. Results Hy inhibited HeLa and C33A cell proliferation in a dose- and time-dependent manner, as determined by the CCK-8 assay. Treatment of C-33A cells with 2 mM Hy was selected for the subsequent experiments. Compared with the control group, 754 upregulated and 509 downregulated genes were identified after RNA-Seq. After functional enrichment, 74 gene ontology biological processes and 43 Kyoto Encyclopedia of Genes and Genomes pathways were obtained. According to the protein interaction network (PPI), PPI module analysis, TF-target network construction, and survival analysis, the key genes MYC, CNKN1A, PAX2, TFRC, ACOX2, UNC5B, APBA1, PRKACA, PEAR1, COL12A1, CACNA1G, PEAR1, and CCNA2 were detected. RT-qPCR was performed on the key genes, and Western blot was used to verify C-MYC and TFRC. C-MYC and TFRC expressions were lower and higher than the corresponding values in the control group, respectively, in accordance with the results from the RNA-Seq analysis. Conclusion Hy inhibited HeLa and C-33A cell proliferation through C-MYC gene expression reduction in C-33A cells and TFRC regulation. The results of the current study provide a theoretical basis for Hy treatment of cervical cancer.

Background Cervical cancer is a malignant epithelial tumor that occurs in the cervix. Most cervical cancers can be screened early by cervical cytology and virology. Moreover, human papillomavirus (Hpv) vaccination has emerged as an effective method for cervical cancer prevention [1]. However, due to inadequate screening programs in many parts of the world, cervical cancer remains one of the most common cancer types in females [2,3]. Surgery is the main method for early treatment. Radiotherapy and chemotherapy are further therapy options. Women with cervical, especially advanced or recurrent, cancer are commonly treated using chemotherapy [4]. Recently, several reports have implicated traditional Chinese medicines in the treatment of cervical cancer. For example, ferulic acid inhibits the proliferation, invasion, and autophagy of cervical cancer cells, and induces cell cycle arrest [5]. Moreover, casticin induces G0/G1 cell cycle arrest and apoptosis in gallbladder cancer [6]. Hyperoside (Hy) is a flavonoid found mainly in Chinese herbal medicines. It exhibits anti-inflammatory, anti-oxidative, and vascular protective effects. Several recent studies demonstrated an anticancer effect of Hy in a variety of tumor types. Thus, Hy increased apoptosis and autophagy in pancreatic cancer cells [7]. Another study described Hy-mediated inhibition of human osteosarcoma cell proliferation and promotion of osteogenic differentiation [8]. Yet another study implicated Hy in the caspase-3, p53, and nuclear factor-kappa B (NF-κB) signaling pathways, which induce apoptosis and inhibit lung cancer cell proliferation [9,10]. In gynecological oncology, Hy induces endometrial cancer cell apoptosis through the mitochondrial pathway [11]. However, Hy effect on cervical cancer development and the molecular mechanism implicated are unclear.
In the current study, the effect of Hy on two cervical cancer cell lines was determined using cytological methods, to detect changes in the cell proliferation index. Differentially expressed genes (DEGs) were identified by RNA sequencing (RNA-seq), comparing untreated and Hy-treated cells. Further analyses of the DEGs were conducted to explore the specific mechanism of Hy action on cervical cancer cells.

Cell culture
HeLa and C-33A cells (both acquired from the Chinese Academy of Sciences Shanghai Cell Bank) were cultured in Dulbecco's modified Eagle's medium (Gibco,Waltham, MA, USA) supplemented with 10% fetal bovine serum. They were inoculated in 96-well plates, cultured at 37 °C for 24 h, and then divided into seven groups. One group was untreated, whereas the remaining groups were treated with 0.25, 0.5, 1, 2, 4, or 8 mM Hy (Solarbio, Beijing, China) for 24, 48, or 72 h.

Cell viability and IC50 determination
Cell viability was determined using the Cell Counting Kit-8 (CCK-8) assay at 24, 48, and 72 h. At each time point, 100 µL CCK-8 (Beyotime Bio, Shanghai, China) was added to each well of a 96-well plate, which was then placed in a 37 °C, 5% CO 2 incubator. HeLa and C-33A cells were incubated for 0.5 and 2 h, respectively, in the dark. The absorbance of each well was measured at 450 nm using an EPOCH microplate reader (Gene Company Limited). The half-inhibitory concentration (IC50) was calculated with GraphPad (version 5.0), and the cell line exhibiting higher sensitivity to Hy treatment was selected for follow-up experiments.

RNA-Seq and data preprocessing
Cells were divided into two groups for this experiment: a Hy-treated and a blank control group; the experiment was repeated three times with independent biological samples. Total RNA was extracted using TRIzol (TaKaRa Bio, Dalian, China), and the extracted RNA was sent to Shanghai New Bioinformatics Co., Ltd. to construct a cDNA library using an Illumina HiSeqTM 2000 platform for double-end PE150 sequencing with 6G data per sample. Unreliable bases and reads were filtered out to obtain clean data for the six samples. The TopHat software (version 2.1.0) was used to locate clean reads on the human reference genome (GENCODE download, GRCh38) [12]. The read count information on each gene alignment was obtained using the htseq-count tool (version 0.9.1) based on the human gene annotation information provided by GENCODE (Release 25).

Inter-sample expression level and principal component analysis
The cor function of the R software (version: 3.4.1) was used to calculate the similarity between the two samples in each experiment. The prcomp function of the R software was utilized to reduce the dimensionality of the data. The ggfortify package (version: 0.4.6) created PCA plots for principal component analysis.

DEG screening
First, the raw count was normalized using the TMM algorithm in the edgeR package [13,14] (version: 3.4). Second, the mean-variance relationship was modeled with the exact weighting method (voom) provided by the limma package [15] (version: 3.36.2). Then, using linear regression and empirical Bayesian methods provided by the limma package, differential expression analysis was performed on the Hy and control groups. The differential expression threshold for DEGs was set to P < 0.05, |logFC| > 0.585.

Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO)
GO [16] functional annotation and KEGG [17] enrichment analysis of the DEGs were performed using the DAVID (version 6.8, https ://david -d.ncifc rf.gov/) [18]. P < 0.05 and enrichment count of at least 3 were considered thresholds for significant enrichment results.

Protein-protein interaction network (PPI) and PPI module analysis
The STRING (version 10.0, http://www.strin g-db.org/) database [19] was used to predict whether gene-encoded proteins interact with each other. A PPI network was constructed for the DEGs with the STRING database (parameter setting: species = homo; PPI score = 0.9). After obtaining the PPI relationship, a network diagram was constructed with Cytoscape (version 3.4.0, http:// chian ti.ucsd.edu/cytos cape-3.4.0/) [20]. CytoNCA plugin [21] (version 2.1.6, parameter setting: default) for Cytoscape was used to analyze the topological properties of the node network. The hub protein in the PPI network was obtained by ranking the network topology properties for each node.
The MCODE plugin [22] (version 1.5.1, parameter setting: default) for Cytoscape was used to screen protein complexes or functional modules. The modules with a score > 5 in the screening result were analyzed for KEGG path enrichment using the R package clusterProfiler [23] (version: 3.8.1).

Transcription factor prediction
The genes corresponding to the proteins identified in the PPI network were used as candidate genes, and transcription factors (TFs) were predicted with the TRRUST (version 2, http://www.grnpe dia.org/trrus t/, threshold setting: q-value < 0.05, number of target genes ≥ 2) [24]. The predicted TFs were compared with the DEGs to obtain differential TFs, and the transcription regulatory network (TF-target network) was constructed utilizing the Cytoscape software.

Survival analysis of key genes
The dataset for survival analysis was obtained from the UCSC database (http://xena.ucsc.edu/) [25], which contains TCGA-related data. Cancer samples with available patient survival information (n = 291) were selected, and the TCGA cervical cancer clinical data were used to extract the clinical information related to prognosis. The genes corresponding to the hub proteins obtained from the PPI network and the TFs in the TF-target regulatory network were utilized as candidate genes, and candidate gene expression values were screened from the TCGA. The median values were divided into two groups (high expression and low expression). A log-rank statistical test was performed, and the threshold P value was set to < 0.05. The relationship between candidate genes and patient prognosis was analyzed, and a Kaplan-Meier survival curve was plotted.

RT-qPCR analysis
Key genes for RT-qPCR verification were selected based on the PPI networks, topological properties, TF analyses, logFC, and degree ranking data. RNA extraction was performed using Trizol (TaKaRa Bio, Dalian, China), and cDNA was synthesized with PrimeScript RT Master Mix (TaKaRa Bio, Tokyo, Japan). Subsequently, amplification was carried out based on the Power SYBR Green PCR Master Mix (Thermo Fisher Scientific, Waltham, MA, USA). After an initial denaturation step of 10 min at 95 °C, the product was routinely examined using a dissociation curve, and the amount of transcript was compared with the relative Ct method with glyceraldehyde 3-phosphate dehydrogenase (GAPDH) as an internal reference control. The 2 −ΔΔ Cq method was utilized for analysis of the experimental data. Primers and primer sequences for each gene are provided in Table 1.

Western blot analysis4
The MYC and TFRC genes, which were identified by RT-qPCR, were selected for western blot analysis. Hy-treated cells were lysed with RIPA9 (Beyotime Bio, Shanghai, China), and the bicinchoninic acid (BCA; Thermo Fisher Scientific) reaction was performed to quantify protein concentrations. Equal protein amounts were resolved using 10% SDS-PAGE and transferred to polyvinylidene fluoride membranes (Millipore, Billerica, MA, USA). The membranes were blocked with 5% skim milk for 1 h, and then one of the following primary antibodies was added: anti-c-Myc rabbit monoclonal antibody (mAb; 57 kDa, 1:1000 dilution, Abcam, Cambridge, MA, USA,), antitransferrin receptor (TFRC) rabbit monoclonal antibody (45 kDa, 1:5000 dilution, Abcam, Cambridge, MA, USA), or anti-GAPDH murine monoclonal antibody (36 kDa, 1:1000 dilution, Santa Cruz Biotechnology, CA, USA). After an overnight incubation at 4 °C, a secondary antibody (rabbit mAb, 1:10,000 or murine mAb, 1:5000) was added and incubated for 2 h at 37 °C. After development with the Millipore ECL system, the optical density of the target strips was analyzed using a chemiluminescent system (Tanon, Shanghai, China).

Statistical analysis
All experiments were replicated at least 3 times, and the data are presented as mean ± standard deviation. The results from CCK-8, IC50 values, qPCR, and western blot were analyzed using GraphPad Prism 5.0 software (GraphPad Prism, San Diego, CA). Student's t-test was utilized to compare differences between two groups. One-way ANOVA was applied for comparisons among three or more groups. Statistical signifcance was accepted for p < 0.05.
Furthermore, HeLa and C-33A cell viability decreased significantly with time (24, 48, and 72 h; Fig. 1b). Thus, Hy inhibited the proliferation of HeLa and C-33A cells in a dose-and time-dependent manner in vitro. The IC50 of Hy was 2 mM for C-33A cells and 4 mM for HeLa cells (Fig. 1b). Subsequent experiments included C-33A cells and 2 mM Hy.

Sequencing data analyses
After data processing, 14,000 genes were finally obtained. Based on the expression levels in each provided sample, the Pearson correlation coefficient between two samples is represented by an (r) value (Fig. 2a). The closer an (r) value is to 1, the higher the expression pattern similarity between samples. The average intragroup sample similarity was 0.977, whereas the average between-group sample similarity was 0.93. These data indicated that the samples were reasonable and the experimental results were reliable.
The results from a principal component analysis are shown in Fig. 2b. The Hy group was clearly distinct from the control group, with obvious DEGs in the Hy group and control group.

DEG analysis
Using the defined threshold, we obtained 1263 DEGs, including 754 upregulated and 509 downregulated genes. Based on a two-dimensional hierarchical clustering heat map of the 1263 DEG values (Fig. 2c), these genes clearly separated the samples in the pre-grouping (Fig. 2d).

Functional and pathway enrichment analysis
The 1263 DEGs were used for GO biological processes (BP) functional and KEGG pathway analyses ( Table 2). The GO_BP functional analysis determined that the downregulated DEGs were mainly enriched in mitochondrial translational elongation, mitochondrial translational termination, ribosomal large subunit biogenesis, and rRNA processing, and so on. Upregulated DEGs were mainly enriched in cell adhesion, cell division, mitotic cytokinesis, and homeostasis of cell types within a tissue, etc. The KEGG pathway analysis revealed that the downregulated DEGs were mainly enriched in RNA transport, p53 signaling pathway, and transcriptional misregulation in cancer. Upregulated DEGs were enriched in endocytosis and in the PPAR, p53, GnRH, and neurotrophin signaling pathways.

A PPI network and module mining of DEGs
A PPI network was obtained for a total of 435 nodes and 1130 relationship pairs (Fig. 3a). A Cytoscape software CytoNCA plug-in was used to analyze the topological properties of the DEGs in the network. The top 20 degree   (Table 3), which were key node proteins in the PPI network. In this network, a total of 32 functional sub-modules were identified, including nine with a score > 5 (Fig. 3a). KEGG_pathway enrichment was performed on the nine modules (Fig. 3b), which were enriched mainly in: module 1-ribosomes; module 2-endocytosis; module 3-oocyte meiosis; module 4-ribosome biogenesis in eukaryotes; module 5-ubiquitin-mediated proteolysis; module 6-protein digestion and absorption; module 7thermogenesis; module 8-proteasome; and module 9viral carcinogenesis.

TF prediction
For TF prediction, a total of 67 TFs were obtained. With reference to the DEGs, six differentially regulated TFs were identified, which included four upregulated and two downregulated TFs. They were combined with 24 upregulated genes. In the TF-target network, CDKN1A, ASS1, CXCR4, and TFRC were coincidentally regulated by two or three TFs, which may be important for the transcriptional regulation (Fig. 4). Therefore, CDKN1A, ASS1, CXCR4, and TFRC were identified as key genes.

Survival analysis of key genes
Based on the gene expression values and the TCGA cervical cancer clinical information, four genes were significantly associated with disease prognosis (P < 0.05). Among these, MYC was downregulated, whereas HSPA8, CLTC, and PRKACA were upregulated (Fig. 5). A survival curve analysis revealed that increased HSPA8, CLTC, and MYC expression and decreased PRKACA expression were associated with a worse prognosis.

Discussion
Hy significantly inhibited C-33A and HeLa human cervical cancer cell proliferation in a dose-and time-dependent manner. This finding is consistent with the previously described Hy-induced inhibition of human non-small cell carcinoma [26]. The mechanism of cell proliferation inhibition was further investigated in C-33 A cells.

Conclusions
In summary, Hy inhibits HeLa and C-33A cervical cancer cell proliferation, and regulates the transcription process in C-33A cells. These findings provide a new avenue for the clinical treatment of cervical cancer and a theoretical basis for the clinical application of Hy.