Identification of a 6-gene signature for the survival prediction of breast cancer patients based on integrated multi-omics data analysis
PURPOSE: To identify a gene signature for the prognosis of breast cancer using high-throughput analysis. METHODS: RNASeq, single nucleotide polymorphism (SNP), copy number variation (CNV) data and clinical follow-up information were downloaded from The Cancer Genome Atlas (TCGA), and randomly divide...
Main Authors: | Mo, Wenju; Ding, Yuqin; Zhao, Shuai; Zou, Dehong; Ding, Xiaowen |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7654770/ https://www.ncbi.nlm.nih.gov/pubmed/33170908 http://dx.doi.org/10.1371/journal.pone.0241924 |
_version_ | 1783608113560027136 |
---|---|
author | Mo, Wenju Ding, Yuqin Zhao, Shuai Zou, Dehong Ding, Xiaowen |
author_facet | Mo, Wenju Ding, Yuqin Zhao, Shuai Zou, Dehong Ding, Xiaowen |
author_sort | Mo, Wenju |
collection | PubMed |
description | PURPOSE: To identify a gene signature for the prognosis of breast cancer using high-throughput analysis. METHODS: RNASeq, single nucleotide polymorphism (SNP), copy number variation (CNV) data and clinical follow-up information were downloaded from The Cancer Genome Atlas (TCGA) and randomly divided into a training set and a verification set. Genes related to breast cancer prognosis and differentially expressed genes (DEGs) with CNV or SNP were screened from the training set, then integrated for feature selection to identify robust biomarkers using RandomForest. Finally, a gene-related prognostic model was established and its performance was verified in the TCGA test set, a Gene Expression Omnibus (GEO) validation set and breast cancer subtypes. RESULTS: A total of 2287 prognosis-related genes, 131 genes with amplified copy numbers, 724 genes with copy number deletions, and 280 genes with significant mutations screened from Genomic Variants were closely correlated with the development of breast cancer. A total of 120 candidate genes were obtained by integrating genes from Genomic Variants and those related to prognosis; 6 characteristic genes (CD24, PRRG1, IQSEC3, MRGPRX, RCC2, and CASP8) were then top-ranked by RandomForest feature selection. Notably, several of these have been previously reported to be associated with the progression of breast cancer. Cox regression analysis was performed to establish a 6-gene signature, which can stratify the risk of samples from the training set, test set and external validation set. Moreover, the five-year survival AUC of the model was higher than 0.65 in both the training set and the validation set. Thus, the 6-gene signature developed in the current study could serve as an independent prognostic factor for breast cancer patients. 
CONCLUSION: This study constructed a 6-gene signature as a novel prognostic marker for predicting the survival of breast cancer patients, providing new diagnostic/prognostic biomarkers and therapeutic targets for breast cancer patients. |
format | Online Article Text |
id | pubmed-7654770 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-7654770 2020-11-18 Identification of a 6-gene signature for the survival prediction of breast cancer patients based on integrated multi-omics data analysis Mo, Wenju Ding, Yuqin Zhao, Shuai Zou, Dehong Ding, Xiaowen PLoS One Research Article PURPOSE: To identify a gene signature for the prognosis of breast cancer using high-throughput analysis. METHODS: RNASeq, single nucleotide polymorphism (SNP), copy number variation (CNV) data and clinical follow-up information were downloaded from The Cancer Genome Atlas (TCGA) and randomly divided into a training set and a verification set. Genes related to breast cancer prognosis and differentially expressed genes (DEGs) with CNV or SNP were screened from the training set, then integrated for feature selection to identify robust biomarkers using RandomForest. Finally, a gene-related prognostic model was established and its performance was verified in the TCGA test set, a Gene Expression Omnibus (GEO) validation set and breast cancer subtypes. RESULTS: A total of 2287 prognosis-related genes, 131 genes with amplified copy numbers, 724 genes with copy number deletions, and 280 genes with significant mutations screened from Genomic Variants were closely correlated with the development of breast cancer. A total of 120 candidate genes were obtained by integrating genes from Genomic Variants and those related to prognosis; 6 characteristic genes (CD24, PRRG1, IQSEC3, MRGPRX, RCC2, and CASP8) were then top-ranked by RandomForest feature selection. Notably, several of these have been previously reported to be associated with the progression of breast cancer. Cox regression analysis was performed to establish a 6-gene signature, which can stratify the risk of samples from the training set, test set and external validation set. Moreover, the five-year survival AUC of the model was higher than 0.65 in both the training set and the validation set. 
Thus, the 6-gene signature developed in the current study could serve as an independent prognostic factor for breast cancer patients. CONCLUSION: This study constructed a 6-gene signature as a novel prognostic marker for predicting the survival of breast cancer patients, providing new diagnostic/prognostic biomarkers and therapeutic targets for breast cancer patients. Public Library of Science 2020-11-10 /pmc/articles/PMC7654770/ /pubmed/33170908 http://dx.doi.org/10.1371/journal.pone.0241924 Text en © 2020 Mo et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Mo, Wenju Ding, Yuqin Zhao, Shuai Zou, Dehong Ding, Xiaowen Identification of a 6-gene signature for the survival prediction of breast cancer patients based on integrated multi-omics data analysis |
title | Identification of a 6-gene signature for the survival prediction of breast cancer patients based on integrated multi-omics data analysis |
title_full | Identification of a 6-gene signature for the survival prediction of breast cancer patients based on integrated multi-omics data analysis |
title_fullStr | Identification of a 6-gene signature for the survival prediction of breast cancer patients based on integrated multi-omics data analysis |
title_full_unstemmed | Identification of a 6-gene signature for the survival prediction of breast cancer patients based on integrated multi-omics data analysis |
title_short | Identification of a 6-gene signature for the survival prediction of breast cancer patients based on integrated multi-omics data analysis |
title_sort | identification of a 6-gene signature for the survival prediction of breast cancer patients based on integrated multi-omics data analysis |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7654770/ https://www.ncbi.nlm.nih.gov/pubmed/33170908 http://dx.doi.org/10.1371/journal.pone.0241924 |
work_keys_str_mv | AT mowenju identificationofa6genesignatureforthesurvivalpredictionofbreastcancerpatientsbasedonintegratedmultiomicsdataanalysis AT dingyuqin identificationofa6genesignatureforthesurvivalpredictionofbreastcancerpatientsbasedonintegratedmultiomicsdataanalysis AT zhaoshuai identificationofa6genesignatureforthesurvivalpredictionofbreastcancerpatientsbasedonintegratedmultiomicsdataanalysis AT zoudehong identificationofa6genesignatureforthesurvivalpredictionofbreastcancerpatientsbasedonintegratedmultiomicsdataanalysis AT dingxiaowen identificationofa6genesignatureforthesurvivalpredictionofbreastcancerpatientsbasedonintegratedmultiomicsdataanalysis |
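
The abstract in this record describes a Cox-regression signature that stratifies samples by risk. As a rough illustration only, the sketch below shows how such a signature is commonly applied: a risk score is computed as the sum of coefficient times expression over the six genes, and samples are split into high and low risk. The gene names come from the abstract; the coefficient values, the median cutoff, and the function names `risk_score` and `stratify` are assumptions made for this example and are not taken from the study.

```python
from statistics import median

# Gene names are from the abstract. The coefficients are made-up
# placeholders for illustration, NOT values reported by the study.
HYPOTHETICAL_COEFS = {
    "CD24": 0.10,
    "PRRG1": -0.20,
    "IQSEC3": 0.30,
    "MRGPRX": 0.15,
    "RCC2": 0.25,
    "CASP8": -0.35,
}


def risk_score(expression, coefs=HYPOTHETICAL_COEFS):
    """Linear predictor of a Cox model: sum of coefficient * expression."""
    return sum(coefs[gene] * expression[gene] for gene in coefs)


def stratify(samples, coefs=HYPOTHETICAL_COEFS):
    """Label each sample 'high' or 'low' risk relative to the median score.

    The median cutoff is an assumption; the record does not state
    which threshold the authors used.
    """
    scores = {sid: risk_score(expr, coefs) for sid, expr in samples.items()}
    cutoff = median(scores.values())
    return {sid: ("high" if s > cutoff else "low") for sid, s in scores.items()}
```

Here `samples` is assumed to map a sample identifier to a dictionary of expression values keyed by gene name. In practice the coefficients would come from a Cox model fitted on the training set, and the same coefficients and cutoff would be reused unchanged on the test and external validation sets.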