
Identification of a 6-gene signature for the survival prediction of breast cancer patients based on integrated multi-omics data analysis

PURPOSE: To identify a gene signature for the prognosis of breast cancer using high-throughput analysis. METHODS: RNASeq, single nucleotide polymorphism (SNP), copy number variation (CNV) data and clinical follow-up information were downloaded from The Cancer Genome Atlas (TCGA), and randomly divide...


Bibliographic Details
Main authors: Mo, Wenju; Ding, Yuqin; Zhao, Shuai; Zou, Dehong; Ding, Xiaowen
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7654770/
https://www.ncbi.nlm.nih.gov/pubmed/33170908
http://dx.doi.org/10.1371/journal.pone.0241924
author Mo, Wenju
Ding, Yuqin
Zhao, Shuai
Zou, Dehong
Ding, Xiaowen
collection PubMed
description PURPOSE: To identify a gene signature for the prognosis of breast cancer using high-throughput analysis. METHODS: RNASeq, single nucleotide polymorphism (SNP), and copy number variation (CNV) data, together with clinical follow-up information, were downloaded from The Cancer Genome Atlas (TCGA) and randomly divided into a training set and a verification set. Genes related to breast cancer prognosis and differentially expressed genes (DEGs) with CNV or SNP were screened from the training set, then integrated for feature selection with RandomForest to identify robust biomarkers. Finally, a gene-related prognostic model was established and its performance was verified in the TCGA test set, a Gene Expression Omnibus (GEO) validation set, and breast cancer subtypes. RESULTS: A total of 2287 prognosis-related genes, 131 genes with amplified copy numbers, 724 genes with copy number deletions, and 280 genes with significant mutations screened from Genomic Variants were closely correlated with the development of breast cancer. A total of 120 candidate genes were obtained by integrating genes from Genomic Variants with those related to prognosis. Six characteristic genes (CD24, PRRG1, IQSEC3, MRGPRX, RCC2, and CASP8) were then top-ranked by RandomForest feature selection. Notably, several of these have previously been reported to be associated with the progression of breast cancer. Cox regression analysis was performed to establish a 6-gene signature, which can stratify the risk of samples from the training set, the test set, and the external validation set. Moreover, the five-year survival AUC of the model was higher than 0.65 in both the training set and the validation set. Thus, the 6-gene signature developed in the current study could serve as an independent prognostic factor for breast cancer patients.
CONCLUSION: This study constructed a 6-gene signature as a novel prognostic marker for predicting the survival of breast cancer patients, providing new diagnostic/prognostic biomarkers and therapeutic targets for breast cancer patients.
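The risk stratification step described in the abstract can be illustrated with a minimal sketch. It assumes the usual form of a Cox-derived signature, in which a sample's risk score is the sum of each gene's expression value weighted by its regression coefficient. The coefficient values, the sample identifiers, and the median cutoff below are hypothetical placeholders chosen for illustration; the abstract reports none of them, and this is not the authors' code.

```python
from statistics import median

# The six genes named in the abstract.
GENES = ["CD24", "PRRG1", "IQSEC3", "MRGPRX", "RCC2", "CASP8"]

# HYPOTHETICAL coefficients for illustration only. The abstract does not
# report the fitted Cox coefficients, so these values carry no meaning.
HYPOTHETICAL_COEFS = {
    "CD24": 0.5,
    "PRRG1": -0.25,
    "IQSEC3": 0.25,
    "MRGPRX": 0.5,
    "RCC2": 0.75,
    "CASP8": -0.5,
}


def risk_score(expression, coefs):
    """Linear predictor: sum of coefficient * expression over the signature genes."""
    return sum(coefs[gene] * expression[gene] for gene in coefs)


def stratify(scores):
    """Label each sample 'high' or 'low' risk relative to the median score.

    The median cutoff is an assumption; the abstract does not state
    which threshold was used.
    """
    cutoff = median(scores.values())
    return {
        sample: ("high" if score > cutoff else "low")
        for sample, score in scores.items()
    }


if __name__ == "__main__":
    # Toy expression profiles (hypothetical samples, not TCGA or GEO data).
    toy_samples = {
        "sample_1": {gene: 1.0 for gene in GENES},
        "sample_2": {gene: 2.0 for gene in GENES},
        "sample_3": {gene: 0.0 for gene in GENES},
    }
    scores = {
        name: risk_score(expr, HYPOTHETICAL_COEFS)
        for name, expr in toy_samples.items()
    }
    print(stratify(scores))
```

In practice the coefficients would come from a Cox model fitted on the training set, and the same fixed coefficients and cutoff would then be applied unchanged to the test and external validation sets.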
format Online
Article
Text
id pubmed-7654770
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7654770 2020-11-18 Identification of a 6-gene signature for the survival prediction of breast cancer patients based on integrated multi-omics data analysis Mo, Wenju; Ding, Yuqin; Zhao, Shuai; Zou, Dehong; Ding, Xiaowen PLoS One Research Article Public Library of Science 2020-11-10 /pmc/articles/PMC7654770/ /pubmed/33170908 http://dx.doi.org/10.1371/journal.pone.0241924 Text en © 2020 Mo et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Identification of a 6-gene signature for the survival prediction of breast cancer patients based on integrated multi-omics data analysis
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7654770/
https://www.ncbi.nlm.nih.gov/pubmed/33170908
http://dx.doi.org/10.1371/journal.pone.0241924