Feature screening for survival trait with application to TCGA high-dimensional genomic data
BACKGROUND: In high-dimensional survival genomic data, identifying cancer-related genes is a challenging and important subject in the field of bioinformatics. In recent years, many feature screening approaches for survival outcomes with high-dimensional survival genomic data have been developed; how...
Main authors: | Wang, Jie-Huei; Li, Cai-Rong; Hou, Po-Lin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2022 |
Subjects: | Bioinformatics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8918142/ https://www.ncbi.nlm.nih.gov/pubmed/35291482 http://dx.doi.org/10.7717/peerj.13098 |
_version_ | 1784668671431409664 |
---|---|
author | Wang, Jie-Huei Li, Cai-Rong Hou, Po-Lin |
author_facet | Wang, Jie-Huei Li, Cai-Rong Hou, Po-Lin |
author_sort | Wang, Jie-Huei |
collection | PubMed |
description | BACKGROUND: In high-dimensional survival genomic data, identifying cancer-related genes is a challenging and important subject in the field of bioinformatics. In recent years, many feature screening approaches for survival outcomes with high-dimensional survival genomic data have been developed; however, few studies have systematically compared these methods. The primary purpose of this article is to conduct a series of simulation studies for systematic comparison; the second purpose is to use these feature screening methods to establish a more accurate prediction model for patient survival based on the survival genomic datasets of The Cancer Genome Atlas (TCGA). RESULTS: Simulation studies show that the network-adjusted feature screening measure performs well and outperforms popular existing univariate independent feature screening methods. In the real-data application, we show that the proposed network-adjusted feature screening approach leads to more accurate survival prediction than alternative methods that do not account for gene-gene dependency information. We also use TCGA clinical survival genetic data to identify biomarkers associated with clinical survival outcomes in patients with various cancers, including esophageal, pancreatic, head and neck squamous cell, lung, and breast invasive carcinomas. CONCLUSIONS: These applications reveal the advantages of the newly proposed network-adjusted feature selection method over alternative methods that do not consider gene-gene dependency information. Almost all of the cancer-related genes we identify have previously been reported in the literature, which supports the reliability of the network-based screening method. |
format | Online Article Text |
id | pubmed-8918142 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8918142 2022-03-14 Feature screening for survival trait with application to TCGA high-dimensional genomic data Wang, Jie-Huei Li, Cai-Rong Hou, Po-Lin PeerJ Bioinformatics PeerJ Inc.
2022-03-10 /pmc/articles/PMC8918142/ /pubmed/35291482 http://dx.doi.org/10.7717/peerj.13098 Text en © 2022 Wang et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited. |
spellingShingle | Bioinformatics Wang, Jie-Huei Li, Cai-Rong Hou, Po-Lin Feature screening for survival trait with application to TCGA high-dimensional genomic data |
title | Feature screening for survival trait with application to TCGA high-dimensional genomic data |
topic | Bioinformatics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8918142/ https://www.ncbi.nlm.nih.gov/pubmed/35291482 http://dx.doi.org/10.7717/peerj.13098 |