
An individualized immune prognostic signature in lung adenocarcinoma



Tumor immune infiltration is closely associated with clinical outcome in lung cancer. We aimed to develop an immune signature to improve the prognostic predictions of lung adenocarcinoma (LUAD).


We applied the “Cell type Identification by Estimating Relative Subsets of RNA Transcripts” (CIBERSORT) method to quantify the fractions of 22 leukocyte cell types in six public microarray datasets. Four datasets from GPL570 were treated as the training cohort and two datasets from GPL96 and GPL10379 as the validation cohorts. An immune risk score (IRS) based on leukocyte cell fractions was established with a least absolute shrinkage and selection operator (LASSO) Cox regression model.


An IRS consisting of 6 types of leukocytes was constructed in the training dataset. In the training cohort (520 patients), the IRS stratified patients into a high-IRS group (215 patients) and a low-IRS group (305 patients) with significant differences in overall survival (OS) (HR 2.77, 95% CI 2.08–3.67). Multivariate analysis including age, gender, stage, IRS and tumor purity revealed the IRS to be an independent prognostic factor in all datasets (training: HR 10.71, 95% CI 5.72–20.07; validation-1: HR 2.68, 95% CI 1.15–6.27; validation-2: HR 3.71, 95% CI 1.33–10.33; all p < 0.05). The IRS was significantly positively correlated with the expression levels of PD1, PDL1, CTLA4 and LAG3 (all p < 0.001). When integrated with clinical characteristics including stage and age, the composite immune and clinical signature showed better prognostic accuracy than the IRS alone (mean C-index 0.66 vs. 0.60).


The proposed immune-clinical signature could predict OS in patients with LUAD effectively.


Non-small cell lung cancer accounts for 85% of all lung cancers, and lung cancer is the most common cancer and the leading cause of cancer-related mortality worldwide [1]. Lung adenocarcinoma (LUAD) is the most commonly diagnosed histological subtype of non-small cell lung cancer [2, 3]. Because metastatic disease is often present at an early stage, the prognosis for patients with LUAD is generally poor, with average 5-year survival rates of < 20% [4]. Conventionally, clinical decisions regarding cancer treatment and prognosis are based primarily on the AJCC staging system [5].

However, increasing evidence has revealed the clinical importance of tumor-infiltrating immune cells in lung cancer [6,7,8,9,10]. Combining the survival impact of immune cells with the AJCC staging system could enable clinicians to predict patient survival outcomes more accurately. Therefore, characterizing the immune components with gene expression-based algorithms may help to advance studies of the immune response in LUAD. The availability of public genomic datasets provides an ideal resource for large-scale gene expression analysis to identify reliable lung cancer biomarkers [11].

High resolving power is a key benefit of “Cell type Identification by Estimating Relative Subsets of RNA Transcripts” (CIBERSORT), which applies the LM22 signature matrix to quantify the relative proportions of 22 immune cell types [12]. Because the CIBERSORT algorithm outperforms other methods in handling noise, closely related cell subsets and unknown cell types, it has received increasing attention and has been successfully applied to quantify the composition of immune cells in colon, breast and liver cancer and in LUAD [13,14,15,16,17].

Therefore, we used the estimated proportions of 22 leukocytes derived from microarray gene expression data to construct and validate an IRS for patients with LUAD. To combine the complementary value of IRS for overall survival (OS) with clinical characteristics, we integrated the IRS with clinical factors to develop a composite prognostic signature, which showed improved prediction of LUAD prognosis.



The gene expression data and corresponding clinical characteristics of LUAD patients profiled on Affymetrix® (Affymetrix, Santa Clara, California, USA) platforms were downloaded from the Gene Expression Omnibus (GEO) website. The dataset selection criteria were as follows: 1) probe-level CEL files of microarray data were available; 2) the basic clinicopathological information (age, gender, stage and survival information) was available; 3) the sample size was larger than 180. On this basis, six datasets (GSE31210 [18], GSE30219 [19], GSE37745 [20], GSE50081 [21], GSE68465 [22] and GSE72094 [23]) were enrolled in our study. Four GEO datasets (GSE30219, GSE31210, GSE37745 and GSE50081) from GPL570 were treated as the training cohort. Two independent GEO datasets, GSE68465 from GPL96 and GSE72094 from GPL10379, served as the validation cohorts.

Re-analysis of microarray data

Six GEO datasets were downloaded as probe-level CEL files. The microarray data were then normalized using the robust multiarray average (RMA) method with the affy and simpleaffy packages. The datasets used in the training cohort were quantile normalized after adjusting for batch effects using the “ComBat” function (sva package, R 3.5.3) [24].
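As a rough illustration of the quantile normalization step only (RMA background correction and ComBat batch adjustment are not reproduced here), the sketch below forces every sample to share the same empirical distribution. The function name `quantile_normalize` and the toy values are hypothetical, ties are broken by position for simplicity, and the study itself used R packages rather than this Python code.

```python
def quantile_normalize(columns):
    """Quantile-normalize samples given as equal-length lists (one list per sample).

    Simplified sketch: ties within a sample are broken by position.
    """
    n_probes = len(columns[0])
    if any(len(col) != n_probes for col in columns):
        raise ValueError("all samples must have the same number of probes")

    # Mean of the r-th smallest value across samples defines the target distribution.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [
        sum(col[r] for col in sorted_cols) / len(columns) for r in range(n_probes)
    ]

    normalized = []
    for col in columns:
        order = sorted(range(n_probes), key=lambda i: col[i])
        out = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            out[probe_index] = rank_means[rank]
        normalized.append(out)
    return normalized


# Two hypothetical samples measured on three probes.
toy = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
```

After normalization both toy samples contain the same set of values (the rank means), assigned according to each sample's original ordering.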

Estimation of immune cell type fractions

Gene expression data were subsequently analyzed using the LM22 gene signature and the CIBERSORT method to estimate the fractions of 22 tumor-infiltrating leukocyte subsets [12]. The CIBERSORT algorithm is well developed and has been verified by fluorescence-activated cell sorting [12]. CIBERSORT derives a p value for the deconvolution of each sample using Monte Carlo sampling, providing a measure of confidence in the results. For samples with a CIBERSORT p value < 0.05, the estimated fractions of immune cell populations can be considered accurate [14]. For each tumor sample, the final CIBERSORT output estimates were normalized so that the estimated immune cell type fractions sum to one.
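The filtering and normalization described above can be sketched as follows. This is a minimal illustration and not the CIBERSORT implementation: the sample identifiers, raw estimates and the helper names `keep_confident` and `normalize_fractions` are hypothetical, and only three of the 22 cell types are shown.

```python
def normalize_fractions(raw):
    """Rescale non-negative cell-type estimates so that they sum to one."""
    total = sum(raw.values())
    if total <= 0:
        raise ValueError("estimates must have a positive sum")
    return {cell: value / total for cell, value in raw.items()}


def keep_confident(samples, alpha=0.05):
    """Keep samples whose deconvolution p value is below alpha."""
    return {sid: s for sid, s in samples.items() if s["p_value"] < alpha}


# Hypothetical deconvolution output for two samples.
samples = {
    "S1": {
        "p_value": 0.01,
        "raw": {"Plasma cells": 2.0, "Macrophages M0": 6.0, "Neutrophils": 2.0},
    },
    "S2": {
        "p_value": 0.20,
        "raw": {"Plasma cells": 1.0, "Macrophages M0": 1.0, "Neutrophils": 2.0},
    },
}

kept = keep_confident(samples)
fractions = {sid: normalize_fractions(s["raw"]) for sid, s in kept.items()}
```

In this toy input only S1 passes the p < 0.05 filter, and its three estimates are rescaled to relative fractions.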

Study population and clinical variables

Samples with a CIBERSORT p value ≥ 0.05 were excluded, as were normal and non-LUAD samples and patients for whom survival information or relevant clinical information was unknown. Clinical information including age, gender and TNM stage was collected. In this study, tumors were staged following the seventh edition of the AJCC staging system [25]. The “Estimation of STromal and Immune cells in Malignant Tumours using Expression data” (ESTIMATE) algorithm was applied to calculate the stromal and immune scores of each sample, and tumor purity was then derived using the previously reported formula [26].
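For orientation, the purity formula published with ESTIMATE [26] maps the ESTIMATE score (stromal score plus immune score) to a purity estimate through a cosine transform. The sketch below assumes that published form and its constants; the function name and the example scores are hypothetical, and the original analysis used the R implementation rather than this code.

```python
import math


def estimate_tumor_purity(stromal_score, immune_score):
    """Tumor purity from ESTIMATE scores, assuming the formula reported in [26]:
    purity = cos(0.6049872018 + 0.0001467884 * ESTIMATE score).
    """
    estimate_score = stromal_score + immune_score
    return math.cos(0.6049872018 + 0.0001467884 * estimate_score)


# Hypothetical scores: more stromal/immune signal implies lower purity.
purity_low_infiltration = estimate_tumor_purity(-1000.0, -1000.0)
purity_high_infiltration = estimate_tumor_purity(1000.0, 1000.0)
```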

Construction of IRS

The survminer package [27] was applied to determine the optimal cut-off value for each immune cell fraction in the training dataset. The leukocyte fraction level was then scored as 0 or 1: a level of 1 was assigned when the fraction of a given leukocyte type exceeded the corresponding cut-off value, and 0 otherwise. The 22 leukocytes were thus analyzed as binary variables. To minimize the risk of overfitting, a Cox proportional hazards regression model combined with the least absolute shrinkage and selection operator (LASSO) [28] was applied to identify the most important prognostic immune cells, and the optimal value of the penalty parameter λ was determined by tenfold cross-validation at 1 SE beyond the minimum partial likelihood deviance in the training dataset [29]. An IRS model was constructed based on the selected immune cells using the LASSO Cox regression coefficients derived from the training cohort. To separate patients into low- or high-IRS groups, the optimal IRS cutoff was also generated based on the association between IRS and OS using the survminer package.
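Two of the steps above lend themselves to a short sketch: scoring each fraction as 0 or 1 against its cut-off, and picking λ by the 1-SE rule from cross-validation results. The code is only an illustration under stated assumptions: the cut-off values, cross-validation numbers and function names are hypothetical, and the Cox model fitting itself (done in R in the study) is not reproduced.

```python
def binarize_fractions(fractions, cutoffs):
    """Score each cell type as 1 if its fraction exceeds the cut-off, else 0."""
    return {cell: 1 if fractions[cell] > cutoffs[cell] else 0 for cell in cutoffs}


def lambda_one_se(lambdas, cv_mean, cv_se):
    """Largest lambda whose mean CV deviance is within 1 SE of the minimum."""
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[i_min] + cv_se[i_min]
    return max(lam for lam, mean in zip(lambdas, cv_mean) if mean <= threshold)


# Hypothetical cut-offs and one patient's fractions.
cutoffs = {"Plasma cells": 0.05, "Macrophages M0": 0.10}
levels = binarize_fractions({"Plasma cells": 0.08, "Macrophages M0": 0.02}, cutoffs)

# Hypothetical tenfold cross-validation summary (partial likelihood deviance).
chosen_lambda = lambda_one_se(
    lambdas=[0.20, 0.10, 0.05, 0.02, 0.01],
    cv_mean=[12.9, 12.4, 12.1, 12.0, 12.2],
    cv_se=[0.3, 0.3, 0.3, 0.3, 0.3],
)
```

The 1-SE rule deliberately prefers a stronger penalty than the deviance minimum, which tends to retain fewer cell types in the model.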

Validation of the IRS

The predictive value of the IRS for OS was evaluated with univariate Cox analysis in all patients and in subgroups stratified by age, gender, TNM stage and tumor purity in the training dataset, validation dataset-1 and validation dataset-2. We also combined the IRS with other available variables in multivariate analysis (MVA).

Establishment and validation of the immune clinical score

According to the results of MVA in the training dataset, IRS, age and stage were significantly associated with OS. We therefore integrated IRS, age and stage into an immune clinical score (ICS) using Cox proportional hazards regression in the training dataset. Stage was treated as a continuous variable: stage I was coded as 1; II, as 2; III, as 3; and IV, as 4. The prognostic performance of the continuous ICS was compared with that of the IRS in terms of the C-index. In addition, the sensitivity and specificity of the OS prediction based on the IRS and ICS were evaluated using time-dependent receiver operating characteristic (ROC) curves [30]. As for the IRS, the optimal cutoff value for the ICS was generated using the survminer package. Restricted mean survival time (RMST) represents the life expectancy at 10 years for the training dataset and validation dataset-1 and at 4 years for validation dataset-2, which had a shorter follow-up time. The performance of the binary IRS and ICS was evaluated in terms of the RMST ratio between low- and high-risk groups [31]. Accordingly, a higher RMST ratio indicates a larger prognostic difference.
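To make the RMST ratio concrete, the sketch below computes RMST as the area under a Kaplan–Meier curve up to a truncation time τ and then takes the ratio between two groups. It is a simplified stand-in for the survRM2 analysis: the function names and the toy follow-up data are hypothetical, and no confidence intervals are computed.

```python
def km_curve(times, events):
    """Kaplan-Meier estimate as a list of (event time, survival probability)."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= leaving
    return curve


def rmst(times, events, tau):
    """Restricted mean survival time: area under the KM curve from 0 to tau."""
    area = 0.0
    last_t, last_s = 0.0, 1.0
    for t, s in km_curve(times, events):
        if t >= tau:
            break
        area += last_s * (t - last_t)
        last_t, last_s = t, s
    return area + last_s * (tau - last_t)


# Hypothetical follow-up in years (event = 1 for death, 0 for censoring).
low_risk = rmst([5, 10, 10, 10], [1, 0, 0, 0], tau=10)
high_risk = rmst([2, 4, 6, 8], [1, 1, 0, 1], tau=10)
rmst_ratio = low_risk / high_risk
```

With these toy data the low-risk group has the larger RMST, so the ratio exceeds 1, matching the interpretation that a higher ratio reflects a larger prognostic difference.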

Statistical analysis

All statistical analyses were conducted using R software (version 3.5.3) and SPSS software (version 25.0). The correlations between the IRS and the mRNA expression levels of corresponding genes were analyzed using Pearson’s correlation test. Gene set enrichment analysis (GSEA) was used to identify the pathways that were significantly enriched in the high-IRS and low-IRS groups [32]. The Kaplan–Meier method was used to generate survival curves and differences were compared using the log-rank test. Hazard ratios for univariate analysis were calculated using a univariate Cox proportional hazards regression model. The RMST ratio was estimated with the survRM2 package [33]. All statistical tests were two-sided and P values less than 0.05 were considered statistically significant.


Patient characteristics

The patient selection criteria and workflow are shown in Fig. 1. After data filtering, 1337 LUAD samples were used for further analysis. The mean age at diagnosis was 65.20 years and 657 (49.14%) patients were male. Most patients (88.33%) had early-stage (stage I or II) disease and the mean tumor purity was 0.50. Detailed patient characteristics are listed in Table 1.

Fig. 1 Flow chart of data collection and analysis. CIBERSORT Cell type Identification by Estimating Relative Subsets of RNA Transcripts, LASSO least absolute shrinkage and selection operator, LUAD lung adenocarcinoma

Table 1 Patients’ basic characteristics

Derivation of the IRS

The optimal cut-off values were generated for 22 leukocytes in the training cohort (Additional file 1: Table S1). LASSO Cox regression analysis was used to build an IRS model (Fig. 2a). Six leukocyte subsets were identified to calculate the IRS as follows:

$$\begin{aligned} \text{IRS} &= (-0.16527275 \times \text{fraction level of plasma cells}) + (-0.294722231 \times \text{fraction level of T cells CD4 memory resting}) \\ &\quad + (0.5352580 \times \text{fraction level of macrophages M0}) + (-0.3803774 \times \text{fraction level of mast cells resting}) \\ &\quad + (0.21355828 \times \text{fraction level of mast cells activated}) + (0.3492314 \times \text{fraction level of neutrophils}) \end{aligned}$$

Patients in the training cohort were then assigned to a high-IRS group (215 patients) and a low-IRS group (305 patients) by the cut-off value (− 0.1652727). The Kaplan–Meier curve showed that patients in the high-IRS group had a significantly worse OS in the training dataset (HR 2.77, 95% CI 2.08–3.67, p < 0.01) (Fig. 2b). The median OS was 11.19 years in the low-IRS group vs. 4.47 years in the high-IRS group (p < 0.01). The association between the IRS and OS was further investigated in the multivariable Cox regression model (HR 10.71, 95% CI 5.72–20.07) (Table 2).
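Because each fraction level is binary, the IRS is simply the sum of the coefficients of the cell types scored as 1. The sketch below applies the six reported coefficients and the training cut-off; the function names are hypothetical, and the rule that a score strictly above the cut-off counts as "high" is an assumption about how the cut-off was applied.

```python
# Coefficients as reported for the IRS formula in the training cohort.
IRS_COEFFICIENTS = {
    "Plasma cells": -0.16527275,
    "T cells CD4 memory resting": -0.294722231,
    "Macrophages M0": 0.5352580,
    "Mast cells resting": -0.3803774,
    "Mast cells activated": 0.21355828,
    "Neutrophils": 0.3492314,
}
TRAINING_CUTOFF = -0.1652727  # cut-off reported for the training cohort


def immune_risk_score(levels):
    """Weighted sum of binary fraction levels (missing cell types count as 0)."""
    return sum(coef * levels.get(cell, 0) for cell, coef in IRS_COEFFICIENTS.items())


def irs_group(score, cutoff=TRAINING_CUTOFF):
    """Assumption: scores strictly above the cut-off are labelled 'high'."""
    return "high" if score > cutoff else "low"


# Hypothetical patients described only by which cell types exceed their cut-offs.
patient_a = immune_risk_score({"Plasma cells": 1})
patient_b = immune_risk_score({"Macrophages M0": 1, "Neutrophils": 1})
patient_c = immune_risk_score({"Plasma cells": 1, "Mast cells activated": 1})
```

Since the score can only take values that are sums of these coefficients, a cut-off can coincide with one of those sums; for example, the score of `patient_c` equals the 0.04828553 cut-off reported for validation dataset-1.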

Fig. 2 IRS construction and validation. a Partial likelihood deviance of different numbers of variables revealed by the LASSO regression model. The red dots represent the partial likelihood deviance values, the grey lines represent the standard error (SE), and the two vertical dotted lines on the left and right represent the optimal values by the minimum criterion and the 1-SE criterion, respectively. b–d Kaplan–Meier curves of OS between high and low IRS groups in the training cohort (b), validation dataset-1 (c) and validation dataset-2 (d); IRS immune risk score, LASSO least absolute shrinkage and selection operator, OS overall survival

Table 2 Multivariate cox analysis of immune risk score and clinical variables in training dataset, validation dataset-1 and validation dataset-2

Validation of IRS for predicting overall survival in the validation dataset-1 and validation dataset-2

To test whether the constructed IRS retained predictive value for OS in different cohorts, the formula derived from the training cohort was applied to validation dataset-1 and validation dataset-2. Patients were assigned to the high- or low-IRS group by the cut-off values acquired from the corresponding cohort (validation dataset-1, 0.04828553; validation dataset-2, − 0.03803774). In validation dataset-1, 183 patients were assigned to the low-IRS group and 253 patients to the high-IRS group. In validation dataset-2, 109 patients were assigned to the low-IRS group and 272 patients to the high-IRS group. Consistent with the findings in the training cohort, patients in the high-IRS group had a significantly worse OS than those in the low-IRS group in validation dataset-1 (HR 1.56, 95% CI 1.20–2.02) (Fig. 2c) and validation dataset-2 (HR 1.83, 95% CI 1.24–2.78) (Fig. 2d). The median OS was 7.81 years in the low-IRS group vs. 6.16 years in the high-IRS group in validation dataset-1 and 4.30 years in the low-IRS group vs. 3.48 years in the high-IRS group in validation dataset-2 (both p < 0.01). The IRS remained an independent prognostic factor in MVA after adjusting for clinical characteristics such as age, gender, TNM stage and purity in validation dataset-1 (HR 2.68, 95% CI 1.15–6.27) and validation dataset-2 (HR 3.71, 95% CI 1.33–10.33) (both p < 0.05) (Table 2).

The IRS was associated with OS in early stage patients

To further investigate the impact of clinical characteristics on the prognostic value of the IRS, we conducted a stratified analysis according to the available baseline characteristics (age, gender, TNM stage and tumor purity) (Table 3). The IRS separated patients with early-stage (I and II) LUAD into significantly different prognostic groups in the training dataset, validation dataset-1 and validation dataset-2 (all p < 0.01) (Table 3). When only patients with stage I disease were considered, the IRS remained highly prognostic for the meta-overall dataset (combined HR: 2.01, 95% CI 1.26–3.22; p < 0.01) (Additional file 1: Fig. S1).

Table 3 The association between high- and low- immune risk score and OS of LUAD patients in training, validation dataset-1 and validation dataset-2

Biological phenotypes associated with the IRS model

Gene expression data were analyzed to investigate the potential biological phenotypes associated with the IRS model in the training dataset. First, we focused on immune checkpoints: the correlation plot in Fig. 3a showed that the IRS was significantly positively correlated with the expression levels of PD1, PDL1, CTLA4 and LAG3 (all p < 0.001). Second, the IRS was also significantly positively correlated with the expression levels of immune activation-related transcripts such as GZMA, GZMB, CXCL10 and IFNG (all p < 0.001) (Fig. 3b). Finally, we performed GSEA to explore the biological functions associated with the IRS model. In the high-IRS group, genes were significantly enriched in multiple biological processes such as the cell cycle pathway and the p53 signaling pathway, whereas in the low-IRS group genes were associated with metabolism-related gene sets, including fatty acid metabolism and propanoate metabolism (Fig. 3c).

Fig. 3 Biological function of IRS in the training dataset. a Correlation between IRS and immune checkpoint regulators; the y axis represents the expression levels of the indicated genes. b Correlation between IRS and immune activation-related transcripts; the y axis represents the expression levels of the indicated genes. c Gene set enrichment analysis delineates biological pathways between high- and low-IRS groups. IRS immune risk score

Integrated prognostic score combining the IRS with clinical factors

In MVA (Table 2), IRS, age and stage were prognostic factors in at least two datasets, implying their complementary value for predicting OS. To further improve prediction accuracy, we combined IRS, age and stage to fit a Cox proportional hazards regression model in the training cohort and derived an Immune Clinical Score (ICS): \(\text{ICS} = (2.68575 \times \text{IRS}) + (0.03221 \times \text{age}) + (0.50289 \times \text{stage})\). The continuous form of the ICS gave a better estimation of OS than the IRS (C-index, 0.66 vs. 0.64 in the training dataset) (Additional file 1: Table S2). The prognostic accuracy of the ICS as a continuous variable was also evaluated by time-dependent ROC analysis (Fig. 4a). An optimal cutoff of 2.135404 for stratifying patients was determined in the training dataset. Similar results were observed for the binary form of the ICS compared with the IRS (RMST ratio, 1.56 vs. 1.47 in the training dataset) (Table 4 and Fig. 5a).
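As a worked illustration of the ICS formula, the sketch below codes stage as 1 to 4, applies the three reported coefficients and compares the result with the reported training cutoff. The function names and example patients are hypothetical, and treating scores strictly above the cutoff as high risk is an assumption.

```python
STAGE_CODES = {"I": 1, "II": 2, "III": 3, "IV": 4}  # stage treated as continuous
ICS_CUTOFF = 2.135404  # optimal cutoff reported for the training dataset


def immune_clinical_score(irs, age, stage):
    """ICS = 2.68575 * IRS + 0.03221 * age + 0.50289 * stage code."""
    return 2.68575 * irs + 0.03221 * age + 0.50289 * STAGE_CODES[stage]


def ics_group(score, cutoff=ICS_CUTOFF):
    """Assumption: scores strictly above the cutoff are labelled 'high'."""
    return "high" if score > cutoff else "low"


# Hypothetical patients.
ics_a = immune_clinical_score(irs=-0.16527275, age=60, stage="I")
ics_b = immune_clinical_score(irs=0.0, age=65, stage="I")
ics_c = immune_clinical_score(irs=0.0, age=65, stage="III")
```

Holding IRS and age fixed, each one-step increase in stage adds 0.50289 to the score, which is what coding stage as a continuous variable implies.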

Fig. 4 IRS and ICS measured by time-dependent ROC curves at 5 years in the training dataset (a), validation dataset-1 (b), and validation dataset-2 (c). AUC area under the curve, IRS immune risk score, ICS immune clinical score, ROC receiver operating characteristic

Table 4 RMST ratio between low- and high-risk groups based on immune risk score or immune clinical score in training, validation dataset-1 and validation dataset-2

Fig. 5 Kaplan–Meier curves for overall survival of all patients stratified by the IRS and the ICS in the training dataset (a), validation dataset-1 (b) and validation dataset-2 (c). IRS immune risk score, ICS immune clinical score


In this study, we developed an immune prognostic signature based on 6 leukocyte types and validated it in two independent datasets from different platforms. The IRS discriminated significantly between patients with high and low scores in terms of OS. In addition, the IRS could further stratify clinically defined groups of patients (especially those with early-stage disease) into subgroups with different survival outcomes. The IRS was significantly positively correlated with the expression levels of several immune checkpoints and immune activation-related transcripts. We further investigated the complementary value of the IRS and clinical characteristics and found that integrating both gave a more accurate estimation of OS for patients with LUAD.

In recent years, immune profiling has become a research focus in cancer studies [9]. In LUAD, several studies have explored the association between tumor-infiltrating lymphocytes and patient survival. High CD4+ T cell density in the stroma correlated with longer OS [34] and disease-specific survival [35], and plasma cell infiltration was related to worse prognosis in LUAD patients [36]. It has been argued that macrophages may play a dual role in lung cancer by supporting both host defense and tumor progression [37]. Mast cells have been regarded as a double-edged sword in cancer immunity: a higher density of mast cells was reported to correlate with improved survival in patients with LUAD [38], but activated mast cells showed potential to exert immunosuppressive effects [39], and Takanami et al. [40] found that increased mast cell infiltration in LUAD was associated with worse prognosis. Neutrophils represent a significant portion of infiltrating inflammatory cells; high neutrophil density was associated with a higher risk of relapse [41] and was a negative prognostic factor in LUAD [42]. Clarifying the role of tumor-infiltrating immune cells may therefore require a comprehensive investigation of the tumor microenvironment.

Several models [43,44,45] based on immune cells have already been reported to predict prognosis well in various types of tumors. Immunohistochemistry (IHC) is an important means of investigating the tumor immune microenvironment [46] in these studies. However, IHC is limited by the available phenotypic markers [47] and provides only a snapshot of the tumor immune microenvironment assayed on the slide [17]. In addition, standardized criteria for measuring the intensity of protein staining, and thus for quantifying protein expression, are inherently difficult to establish for IHC [48].

As an alternative, continuously accumulating public genomic data provide an ideal resource for large-scale analysis of the immune landscape, and multiple computational algorithms have already been developed to perform such analysis [49]. The candidate immune cells used to construct the IRS were quantified from high-throughput gene expression data using the bioinformatics tool CIBERSORT. Applying this computational method to public genomic data made it possible to overcome some technical limitations of IHC and gave broader insight into the immune profile of tumors. The further use of a LASSO Cox regression model to screen prognosis-related immune cells for the IRS model could enhance predictive ability significantly [50,51,52]. We compared our results with those of Yang et al. and found that our IRS model, which uses far fewer immune cell types, showed better predictive ability than their IRS model [53] (mean 5-year AUC of 0.66 vs. 0.62). In addition, we included tumor purity in our analysis to adjust the IRS, making our results more reliable. The value of the IRS was confirmed in two non-overlapping validation cohorts, indicating good reproducibility for LUAD.

Patients with early-stage lung cancer are also at substantial risk of recurrence and death [54], even after complete surgical treatment, and the use of adjuvant therapy in early-stage lung cancer remains controversial. An important finding in our analysis was that the IRS was significantly associated with OS in stage I/II LUAD patients. The prognostic role of the IRS was also confirmed in patients with stage I disease in the meta-overall dataset, implying that the IRS may serve as a prognostic indicator for selecting patients who could benefit from additional therapy.

The FDA has approved PDL1 expression by IHC as a predictive biomarker for response to anti-PD1 therapy in patients with NSCLC [55, 56]. Since our study revealed clear enrichment of multiple immune checkpoint markers and immune activation transcripts, especially PDL1, in the high-IRS group, it is reasonable to speculate that immunotherapy might also be a preferable choice for patients in this group. Although patients in the high-IRS group had poor OS, immunotherapy may bring a potential survival benefit. Further studies are warranted to explore whether the IRS model can predict the response of patients with LUAD to immunotherapy. The ICS, which integrates the IRS with baseline characteristics, may not only help clinicians predict patient outcomes more precisely but also ease translation of the signature into clinical use.

This study has some limitations. First, as all patients in this study were selected retrospectively, the potential bias relating to unbalanced clinical features with treatment heterogeneity cannot be avoided. Secondly, the gene expression profiles used here were all derived from a core sample of tumor tissue, making it impossible for the location of the immune cell to be taken into consideration when establishing the prognostic IRS model. Thirdly, only signatures validated in independent cohorts of patients with full clinical annotation available could be applied clinically, and thus further investigations should focus on clinical validation for IRS, which may provide more evidence for its translation into clinical practice.


In conclusion, our study demonstrates the utility of considering tumor-infiltrating leukocytes in the prognosis prediction of LUAD and may provide additional information and strategies for immunotherapy. Prospective studies are needed to further test the accuracy of the signature for estimating prognosis and to validate its clinical utility in the individualized management of LUAD patients.

Availability of data and materials

Six GEO datasets (GSE31210, GSE30219, GSE37745, GSE50081, GSE68465 and GSE72094) used in this study could be downloaded from GEO database (



LUAD: Lung adenocarcinoma
CIBERSORT: Cell type Identification by Estimating Relative Subsets of RNA Transcripts
OS: Overall survival
IRS: Immune risk score
GEO: Gene Expression Omnibus
RMA: Robust multiarray average
ESTIMATE: Estimation of STromal and Immune cells in Malignant Tumours using Expression data
LASSO: Least absolute shrinkage and selection operator
MVA: Multivariate analysis
ICS: Immune clinical score
ROC: Receiver operating characteristic
RMST: Restricted mean survival time
GSEA: Gene set enrichment analysis




  1. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2016. CA Cancer J Clin. 2016;66:7–30.

    Article  PubMed  Google Scholar 

  2. Travis WD. Pathology of lung cancer. Clin Chest Med. 2011;32:669–92.

    Article  PubMed  Google Scholar 

  3. Network CGAR. Comprehensive molecular profiling of lung adenocarcinoma. Nature. 2014;511:543.

    Article  CAS  Google Scholar 

  4. Lin JJ, Cardarella S, Lydon CA, Dahlberg SE, Jackman DM, Jänne PA, et al. Five-year survival in egfr-mutant metastatic lung adenocarcinoma treated with egfr-tkis. J Thorac Oncol. 2016;11:556–65.

    Article  PubMed  Google Scholar 

  5. Detterbeck FC, Boffa DJ, Tanoue LT. The new lung cancer staging system. Chest. 2009;136:260–71.

    Article  PubMed  Google Scholar 

  6. Remark R, Becker C, Gomez JE, Damotte D, Dieu-Nosjean M-C, Sautès-Fridman C, et al. The non-small cell lung cancer immune contexture. A major determinant of tumor characteristics and patient outcome. Am J Respir Crit Care Med. 2015;191:377–90.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  7. Brambilla E, Le Teuff G, Marguet S, Lantuejoul S, Dunant A, Graziano S, et al. Prognostic effect of tumor lymphocytic infiltration in resectable non–small-cell lung cancer. J Clin Oncol. 2016;34:1223.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  8. Kilic A, Landreneau RJ, Luketich JD, Pennathur A, Schuchert MJ. Density of tumor-infiltrating lymphocytes correlates with disease recurrence and survival in patients with large non-small-cell lung cancer tumors. J Surg Res. 2011;167:207–10.

    Article  PubMed  Google Scholar 

  9. Fridman WH, Pages F, Sautes-Fridman C, Galon J. The immune contexture in human tumours: impact on clinical outcome. Nat Rev Cancer. 2012;12:298.

    Article  CAS  PubMed  Google Scholar 

  10. Muppa P, Terra SPB, Sharma A, Mansfield AS, Aubry M-C, Bhinge K, et al. Immune cell infiltration may be a key determinant of long-term survival in small cell lung cancer. J Thorac Oncol. 2019;14(7):1286–95.

    Article  CAS  PubMed  Google Scholar 

  11. Xie Y, Minna JD. A lung cancer molecular prognostic test ready for prime time. Lancet. 2012;379:785–7.

    Article  PubMed  PubMed Central  Google Scholar 

  12. Newman AM, Liu CL, Green MR, Gentles AJ, Feng W, Xu Y, et al. Robust enumeration of cell subsets from tissue expression profiles. Nat Methods. 2015;12:453.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  13. Xiong Y, Wang K, Zhou H, Peng L, You W, Fu Z. Profiles of immune infiltration in colorectal cancer and their clinical significant: a gene expression-based study. Cancer Med. 2018;7:4496–508.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  14. Ali HR, Chlon L, Pharoah PD, Markowetz F, Caldas C. Patterns of immune infiltration in breast cancer and their clinical implications: a gene-expression-based retrospective study. PLoS Med. 2016;13:e1002194.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  15. Rohr-Udilova N, Klinglmüller F, Schulte-Hermann R, Stift J, Herac M, Salzmann M, et al. Deviations of the immune cell landscape between healthy liver and hepatocellular carcinoma. Sci Rep. 2018;8:6220.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  16. Mony JT, Schuchert MJ. Prognostic implications of heterogeneity in intra-tumoral immune composition for recurrence in early stage lung cancer. Front Immunol. 2018;9:2298.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  17. Kurbatov V, Balayev A, Saffarzadeh A, Heller DR, Boffa DJ, Blasberg JD, et al. Digital inference of immune microenvironment reveals low-risk subtype of early lung adenocarcinoma. Ann Thorac Surg. 2020;109:343–9.

    Article  PubMed  Google Scholar 

  18. Yamauchi M, Yamaguchi R, Nakata A, Kohno T, Nagasaki M, Shimamura T et al. Epidermal growth factor receptor tyrosine kinase defines critical prognostic genes of stage I lung adenocarcinoma. PloS ONE. 2012;7.

  19. Rousseaux S, Debernardi A, Jacquiau B, Vitte AL, Vesin A, Nagymignotte H, et al. Ectopic activation of germline and placental genes identifies aggressive metastasis-prone lung cancers. Sci Transl Med. 2013;5:86ra66.

    Article  CAS  Google Scholar 

  20. Jabs V, Edlund K, König H, Grinberg M, Micke P. Integrative analysis of genome-wide gene copy number changes and gene expression in non-small cell lung cancer. PLoS ONE. 2017;12:e0187246.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  21. Der SD, Sykes J, Pintilie M, et al. Validation of a histology-independent prognostic gene signature for early-stage, non-small-cell lung cancer including stage ia patients. J Throac Ocol. 2014;9:59–64.

    Article  CAS  Google Scholar 

  22. Shedden K, Taylor JMG, Enkemann SA, Tsao M-S, Yeatman TJ, Gerald WL, et al. Gene expression-based survival prediction in lung adenocarcinoma: a multi-site, blinded validation study. Nat Med. 2008;14:822–7.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  23. Schabath MB, Welsh EA, Fulp WJ, Chen L, Teer JK, Thompson ZJ, et al. Differential association of STK11 and TP53 with KRAS mutation-associated gene expression, proliferation and immune surveillance in lung adenocarcinoma. Oncogene. 2016;35(24):3209–16.

    Article  CAS  PubMed  Google Scholar 

  24. Leek JT, Johnson WE, Parker HS, Jaffe AE, Storey JD. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics. 2012;28:882–3.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  25. Goldstraw P, Crowley J, Chansky K, Giroux DJ, Groome PA, Rami-Porta R, et al. The iaslc lung cancer staging project: proposals for the revision of the TNM stage groupings in the forthcoming (seventh) edition of the TNM classification of malignant tumours. J Thorac Oncol. 2007;2:706–14.

  26. Yoshihara K, Shahmoradgoli M, Martínez E, Vegesna R, Kim H, Torres-Garcia W, et al. Inferring tumour purity and stromal and immune cell admixture from expression data. Nat Commun. 2013;4:2612.

  27. Kassambara A, Kosinski M, Biecek P. survminer: drawing survival curves using ‘ggplot2’. R package version 03 2017;1.

  28. Goeman JJ. L1 penalized estimation in the Cox proportional hazards model. Biom J. 2010;52:70–84.

  29. Simon N, Friedman J, Hastie T, Tibshirani R. Regularization paths for Cox’s proportional hazards model via coordinate descent. J Stat Softw. 2011;39:1.

  30. Kamarudin AN, Cox T, Kolamunnage-Dona R. Time-dependent ROC curve analysis in medical research: current methods and applications. BMC Med Res Methodol. 2017;17:53.

  31. Uno H, Claggett B, Tian L, Inoue E, Gallo P, Miyata T, et al. Moving beyond the hazard ratio in quantifying the between-group difference in survival analysis. J Clin Oncol. 2014;32:2380.

  32. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci. 2005;102:15545–50.

  33. Uno H, Tian L, Cronin A, Battioui C, Horiguchi M, Uno MH. Package ‘survRM2’. 2017.

  34. Wakabayashi O, Yamazaki K, Oizumi S, Hommura F, Kinoshita I, Ogura S, et al. CD4+ T cells in cancer stroma, not CD8+ T cells in cancer cell nests, are associated with favorable prognosis in human non-small cell lung cancers. Cancer Sci. 2003;94:1003–9.

  35. Al-Shibli KI, Donnem T, Al-Saad S, Persson M, Bremnes RM, Busund L-T. Prognostic effect of epithelial and stromal lymphocyte infiltration in non-small cell lung cancer. Clin Cancer Res. 2008;14:5220–7.

  36. Kurebayashi Y, Emoto K, Hayashi Y, Kamiyama I, Ohtsuka T, Asamura H, et al. Comprehensive immune profiling of lung adenocarcinomas reveals four immunosubtypes with plasma cell subtype a negative indicator. Cancer Immunol Res. 2016;4:234–47.

  37. Kataki A, Scheid P, Piet M, Marie B, Martinet N, Martinet Y, et al. Tumor infiltrating lymphocytes and macrophages have a potential dual role in lung cancer by supporting both host-defense and tumor progression. J Lab Clin Med. 2002;140:320–8.

  38. Tomita M, Matsuzaki Y, Onitsuka T. Correlation between mast cells and survival rates in patients with pulmonary adenocarcinoma. Lung Cancer. 1999;26:103–8.

  39. Yang Z, Zhang B, Li D, Lv M, Huang C, Shen GX, et al. Mast cells mobilize myeloid-derived suppressor cells and Treg cells in tumor microenvironment via IL-17 pathway in murine hepatocarcinoma model. PLoS ONE. 2010;5.

  40. Takanami I, Takeuchi K, Naruke M. Mast cell density is associated with angiogenesis and poor prognosis in pulmonary adenocarcinoma. Cancer. 2000;88:2686–92.

  41. Ilie M, Hofman V, Ortholan C, Bonnetaud C, Coëlle C, Mouroux J, et al. Predictive clinical outcome of the intratumoral CD66b-positive neutrophil-to-CD8-positive T-cell ratio in patients with resectable non-small cell lung cancer. Cancer. 2012;118:1726–37.

  42. Rakaee M, Busund LT, Paulsen EE, Richardsen E, Kilvaer TK. Prognostic effect of intratumoral neutrophils across histological subtypes of non-small cell lung cancer. Oncotarget. 2016;7:72184–96.

  43. Galon J, Mlecnik B, Bindea G, Angell HK, Berger A, Lagorce C, et al. Towards the introduction of the ‘immunoscore’ in the classification of malignant tumours. J Pathol. 2014;232:199–209.

  44. Angell H, Galon J. From the immune contexture to the immunoscore: the role of prognostic and predictive immune markers in cancer. Curr Opin Immunol. 2013;25:261–7.

  45. Galon J, Pagès F, Marincola FM, Thurin M, Trinchieri G, Fox BA, et al. The immune score as a new possible approach for the classification of cancer. BioMed Central. 2012.

  46. Busch SE, Hanke ML, Kargl J, Metz HE, MacPherson D, Houghton AM. Lung cancer subtypes generate unique immune responses. J Immunol. 2016;197:4493–503.

  47. Zhou R, Zhang J, Zeng D, Sun H, Rong X, Shi M, et al. Immune cell infiltration as a biomarker for the diagnosis and prognosis of stage I–III colon cancer. Cancer Immunol Immunother. 2019;68:433–42.

  48. Zeng D, Zhou R, Yu Y, Luo Y, Zhang J, Sun H, et al. Gene expression profiles for a prognostic immunoscore in gastric cancer. Br J Surg. 2018;105:1338–48.

  49. Finotello F, Trajanoski Z. Quantifying tumor-infiltrating immune cells from transcriptomics data. Cancer Immunol Immunother. 2018;67:1031–40.

  50. Tibshirani R. The lasso method for variable selection in the Cox model. Stat Med. 1997;16:385–95.

  51. Lin T, Fu Y, Zhang X, Gu J, Ma X, Miao R, et al. A seven-long noncoding RNA signature predicts overall survival for patients with early stage non-small cell lung cancer. Aging. 2018;10:2356.

  52. Li B, Cui Y, Diehn M, Li R. Development and validation of an individualized immune prognostic signature in early-stage non-squamous non-small cell lung cancer. JAMA Oncol. 2017;3:1529–37.

  53. Song Q, Shang J, Yang Z, Zhang L, Zhang C, Chen J, et al. Identification of an immune signature predicting prognosis risk of patients in lung adenocarcinoma. J Transl Med. 2019;17(1):70.

  54. Holmes CE, Ruckdeschel JC, Johnston M, Thomas PA, Long S. Randomized trial of lobectomy versus limited resection for T1 N0 non-small-cell lung-cancer. Ann Thorac Surg. 1995;60:615–22.

  55. Topalian SL, Taube JM, Anders RA, Pardoll DM. Mechanism-driven biomarkers to guide immune checkpoint blockade in cancer therapy. Nat Rev Cancer. 2016;16(5):275–87.

  56. Gibney GT, Weiner LM, Atkins MB. Predictive biomarkers for checkpoint inhibitor-based immunotherapy. Lancet Oncol. 2016;17(12):e542–51.


The authors thank Wang Binbin for his valuable advice, and thank the GEO database for providing valuable datasets.


This study was supported by the National Natural Science Foundation of China (No. 81972172), the Shanghai Municipal Health Commission (Grant No. 2017BR026), the Shanghai Education Development Foundation (Grant No. 17SG23), and the Shanghai Hospital Development Center (Grant No. SHDC12017X03).

Author information


LS contributed substantially to the conception and design of the study, the acquisition, analysis and interpretation of data, and the drafting of the article. DG acquired part of the data. GJ and PZ contributed to conception and design, revised the article critically for important intellectual content, and gave final approval of the version to be published. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Peng Zhang.

Ethics declarations

Ethics approval and consent to participate

Because this was a retrospective study and all data used were collected from the GEO database, ethical approval was not required.

Consent for publication


Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Additional file 1.

An individualized immune prognostic signature in lung adenocarcinoma. Liangdong Sun, Gening Jiang, Diego Gonzalez-Rivas and Peng Zhang.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Sun, L., Jiang, G., Gonzalez-Rivas, D. et al. An individualized immune prognostic signature in lung adenocarcinoma. Cancer Cell Int 20, 156 (2020).
