
Feature screening for survival trait with application to TCGA high-dimensional genomic data

Bibliographic Details
Main Authors: Wang, Jie-Huei, Li, Cai-Rong, Hou, Po-Lin
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2022
Subjects: Bioinformatics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8918142/
https://www.ncbi.nlm.nih.gov/pubmed/35291482
http://dx.doi.org/10.7717/peerj.13098
author Wang, Jie-Huei
Li, Cai-Rong
Hou, Po-Lin
collection PubMed
description BACKGROUND: In high-dimensional survival genomic data, identifying cancer-related genes is a challenging and important problem in bioinformatics. In recent years, many feature screening approaches for survival outcomes in high-dimensional genomic data have been developed; however, few studies have systematically compared these methods. The primary purpose of this article is to conduct a series of simulation studies for systematic comparison; the secondary purpose is to use these feature screening methods to establish a more accurate prediction model for patient survival based on the survival genomic datasets of The Cancer Genome Atlas (TCGA). RESULTS: Simulation studies show that the network-adjusted feature screening measure performs well and outperforms popular existing univariate independent feature screening methods. In the application to real data, we show that the proposed network-adjusted feature screening approach leads to more accurate survival prediction than alternative methods that do not account for gene-gene dependency information. We also use TCGA clinical survival genetic data to identify biomarkers associated with clinical survival outcomes in patients with various cancers, including esophageal, pancreatic, head and neck squamous cell, lung, and breast invasive carcinomas. CONCLUSIONS: These applications reveal advantages of the newly proposed network-adjusted feature selection method over alternative methods that do not consider gene-gene dependency information. We also identify cancer-related genes, most of which have already been reported in the literature. As a result, the network-based screening method is reliable and credible.
format Online
Article
Text
id pubmed-8918142
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-8918142 2022-03-14 Feature screening for survival trait with application to TCGA high-dimensional genomic data Wang, Jie-Huei Li, Cai-Rong Hou, Po-Lin PeerJ Bioinformatics PeerJ Inc.
2022-03-10 /pmc/articles/PMC8918142/ /pubmed/35291482 http://dx.doi.org/10.7717/peerj.13098 Text en © 2022 Wang et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
title Feature screening for survival trait with application to TCGA high-dimensional genomic data
topic Bioinformatics