Gene Screening in High-Throughput Right-Censored Lung Cancer Data

BACKGROUND: Advances in sequencing technologies have allowed collection of massive genome-wide information that substantially advances lung cancer diagnosis and prognosis. Identifying influential markers for clinical endpoints of interest has been an indispensable and critical component of the stati...

Bibliographic Details
Main authors: Ke, Chenlu; Bandyopadhyay, Dipankar; Acunzo, Mario; Winn, Robert
Format: Online Article Text
Language: English
Published: 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10100230/
https://www.ncbi.nlm.nih.gov/pubmed/37066112
http://dx.doi.org/10.3390/onco2040017
author Ke, Chenlu
Bandyopadhyay, Dipankar
Acunzo, Mario
Winn, Robert
author_sort Ke, Chenlu
collection PubMed
description BACKGROUND: Advances in sequencing technologies have allowed collection of massive genome-wide information that substantially advances lung cancer diagnosis and prognosis. Identifying influential markers for clinical endpoints of interest has been an indispensable and critical component of the statistical analysis pipeline. However, classical variable selection methods are not feasible or reliable for high-throughput genetic data. Our objective is to propose a model-free gene screening procedure for high-throughput right-censored data, and to develop a predictive gene signature for lung squamous cell carcinoma (LUSC) with the proposed procedure. METHODS: A gene screening procedure was developed based on a recently proposed independence measure. The Cancer Genome Atlas (TCGA) data on LUSC was then studied. The screening procedure was conducted to narrow down the set of influential genes to 378 candidates. A penalized Cox model was then fitted to the reduced set, which further identified a 6-gene signature for LUSC prognosis. The 6-gene signature was validated on datasets from the Gene Expression Omnibus. RESULTS: Both model-fitting and validation results reveal that our method selected influential genes that lead to biologically sensible findings as well as better predictive performance, compared to existing alternatives. According to our multivariable Cox regression analysis, the 6-gene signature was indeed a significant prognostic factor (p-value < 0.001) while controlling for clinical covariates. CONCLUSIONS: Gene screening as a fast dimension reduction technique plays an important role in analyzing high-throughput data. The main contribution of this paper is to introduce a fundamental yet pragmatic model-free gene screening approach that aids statistical analysis of right-censored cancer data, and provide a lateral comparison with other available methods in the context of LUSC.
format Online
Article
Text
id pubmed-10100230
institution National Center for Biotechnology Information
language English
publishDate 2022
record_format MEDLINE/PubMed
spelling pubmed-10100230 2023-04-13 Gene Screening in High-Throughput Right-Censored Lung Cancer Data Ke, Chenlu Bandyopadhyay, Dipankar Acunzo, Mario Winn, Robert Onco (Basel) Article 2022-12 2022-10-17 /pmc/articles/PMC10100230/ /pubmed/37066112 http://dx.doi.org/10.3390/onco2040017 Text en This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Gene Screening in High-Throughput Right-Censored Lung Cancer Data
topic Article