This study presents a novel computational approach to identifying a smoking-associated gene signature. The methodology comprises the following steps: 1) identifying genes significantly associated with lung cancer survival; 2) from these survival genes, selecting those that are differentially expressed between smokers and non-smokers; 3) from these candidate genes, constructing gene co-expression networks based on prediction logic, separately for smokers and non-smokers; 4) identifying smoking-mediated differential components, i.e., the gene co-expression patterns unique to each group; and 5) from the differential components, identifying genes directly co-expressed with major lung cancer hallmarks. The identified 7-gene signature separated lung cancer patients into two risk groups with distinct postoperative survival (log-rank P < 0.05, Kaplan-Meier analysis) in four independent cohorts (n=427). It also has implications for the diagnosis of lung cancer (accuracy = 74%) in a cohort of smokers (n=164). The computationally derived co-expression patterns were validated with Pathway Studio and STRING 8.
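Steps 4 and 5 of the methodology can be illustrated with a minimal sketch. It assumes that each group's co-expression network has already been built and is available as a set of undirected gene pairs; the prediction-logic construction itself is not reproduced here, and its relations may in practice be directed. The function names and the gene labels (`G1`, `H1`, and so on) are hypothetical placeholders, not genes or code from the study.

```python
from typing import FrozenSet, Set, Tuple

# An undirected co-expression edge between two genes (simplifying assumption).
Edge = FrozenSet[str]


def edge(a: str, b: str) -> Edge:
    """Build an order-independent edge between genes a and b."""
    return frozenset((a, b))


def differential_components(
    smoker_edges: Set[Edge], nonsmoker_edges: Set[Edge]
) -> Tuple[Set[Edge], Set[Edge]]:
    """Step 4 (sketch): keep the edges found in only one of the two networks."""
    return smoker_edges - nonsmoker_edges, nonsmoker_edges - smoker_edges


def hallmark_neighbors(edges: Set[Edge], hallmarks: Set[str]) -> Set[str]:
    """Step 5 (sketch): genes sharing an edge with at least one hallmark gene."""
    neighbors: Set[str] = set()
    for e in edges:
        members = set(e)
        # Use only edges that join exactly one hallmark gene to one other gene.
        if len(members) == 2 and len(members & hallmarks) == 1:
            neighbors |= members - hallmarks
    return neighbors


# Hypothetical toy networks; H1 stands in for a lung cancer hallmark gene.
smoker_net = {edge("G1", "H1"), edge("G2", "G3"), edge("G4", "H1")}
nonsmoker_net = {edge("G2", "G3"), edge("G5", "H1")}

smoker_only, nonsmoker_only = differential_components(smoker_net, nonsmoker_net)
candidates = hallmark_neighbors(smoker_only | nonsmoker_only, {"H1"})
print(sorted(candidates))
```

In this toy example the shared edge between `G2` and `G3` is discarded as common to both groups, and only genes that pair directly with `H1` in a group-specific edge remain as candidates.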