A genome-wide approach to link genotype to clinical outcome by utilizing next generation sequencing and gene chip data of 6,697 breast cancer patients
BACKGROUND: The use of somatic mutations for predicting clinical outcome is difficult because a mutation can indirectly influence the function of many genes, and also because clinical follow-up is sparse in the relatively young next generation sequencing (NGS) databanks. Here we approach this proble...
Main authors: | Pongor, Lőrinc; Kormos, Máté; Hatzis, Christos; Pusztai, Lajos; Szabó, András; Győrffy, Balázs |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4609150/ https://www.ncbi.nlm.nih.gov/pubmed/26474971 http://dx.doi.org/10.1186/s13073-015-0228-1 |
_version_ | 1782395778854027264 |
---|---|
author | Pongor, Lőrinc Kormos, Máté Hatzis, Christos Pusztai, Lajos Szabó, András Győrffy, Balázs |
author_sort | Pongor, Lőrinc |
collection | PubMed |
description | BACKGROUND: The use of somatic mutations for predicting clinical outcome is difficult because a mutation can indirectly influence the function of many genes, and also because clinical follow-up is sparse in the relatively young next generation sequencing (NGS) databanks. Here we approach this problem by linking sequence databanks to well annotated gene-chip datasets, using a multigene transcriptomic fingerprint as a link between gene mutations and gene expression in breast cancer patients. METHODS: The database consists of 763 NGS samples containing mutational status for 22,938 genes and RNA-seq data for 10,987 genes. The gene chip database contains 5,934 patients with 10,987 genes plus clinical characteristics. For the prediction, mutations present in a sample are first translated into a ‘transcriptomic fingerprint’ by running ROC analysis on mutation and RNA-seq data. Then correlation to survival is assessed by computing Cox regression for both up- and downregulated signatures. RESULTS: According to this approach, the top driver oncogenes having a mutation prevalence over 5 % included AKT1, TRANK1, TRAPPC10, RPGR, COL6A2, RAPGEF4, ATG2B, CNTRL, NAA38, OSBPL10, POTEF, SCLT1, SUN1, VWDE, MTUS2, and PIK3CA, and the top tumor suppressor genes included PHEX, TP53, GGA3, RGS22, PXDNL, ARFGEF1, BRCA2, CHD8, GCC2, and ARMC4. The system was validated by computing correlation between RNA-seq and microarray data (r² = 0.73, P < 1E-16). Cross-validation using 20 genes with a prevalence of approximately 5 % confirmed analysis reproducibility. CONCLUSIONS: We established a pipeline enabling rapid clinical validation of a discovered mutation in a large breast cancer cohort. An online interface is available for evaluating any human gene mutation or combinations of maximum three such genes (http://www.g-2-o.com). ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13073-015-0228-1) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4609150 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4609150 2015-10-18 A genome-wide approach to link genotype to clinical outcome by utilizing next generation sequencing and gene chip data of 6,697 breast cancer patients Pongor, Lőrinc Kormos, Máté Hatzis, Christos Pusztai, Lajos Szabó, András Győrffy, Balázs Genome Med Research BACKGROUND: The use of somatic mutations for predicting clinical outcome is difficult because a mutation can indirectly influence the function of many genes, and also because clinical follow-up is sparse in the relatively young next generation sequencing (NGS) databanks. Here we approach this problem by linking sequence databanks to well annotated gene-chip datasets, using a multigene transcriptomic fingerprint as a link between gene mutations and gene expression in breast cancer patients. METHODS: The database consists of 763 NGS samples containing mutational status for 22,938 genes and RNA-seq data for 10,987 genes. The gene chip database contains 5,934 patients with 10,987 genes plus clinical characteristics. For the prediction, mutations present in a sample are first translated into a ‘transcriptomic fingerprint’ by running ROC analysis on mutation and RNA-seq data. Then correlation to survival is assessed by computing Cox regression for both up- and downregulated signatures. RESULTS: According to this approach, the top driver oncogenes having a mutation prevalence over 5 % included AKT1, TRANK1, TRAPPC10, RPGR, COL6A2, RAPGEF4, ATG2B, CNTRL, NAA38, OSBPL10, POTEF, SCLT1, SUN1, VWDE, MTUS2, and PIK3CA, and the top tumor suppressor genes included PHEX, TP53, GGA3, RGS22, PXDNL, ARFGEF1, BRCA2, CHD8, GCC2, and ARMC4. The system was validated by computing correlation between RNA-seq and microarray data (r² = 0.73, P < 1E-16). Cross-validation using 20 genes with a prevalence of approximately 5 % confirmed analysis reproducibility. CONCLUSIONS: We established a pipeline enabling rapid clinical validation of a discovered mutation in a large breast cancer cohort. An online interface is available for evaluating any human gene mutation or combinations of maximum three such genes (http://www.g-2-o.com). ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13073-015-0228-1) contains supplementary material, which is available to authorized users. BioMed Central 2015-10-16 /pmc/articles/PMC4609150/ /pubmed/26474971 http://dx.doi.org/10.1186/s13073-015-0228-1 Text en © Pongor et al. 2015 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Pongor, Lőrinc Kormos, Máté Hatzis, Christos Pusztai, Lajos Szabó, András Győrffy, Balázs A genome-wide approach to link genotype to clinical outcome by utilizing next generation sequencing and gene chip data of 6,697 breast cancer patients |
title | A genome-wide approach to link genotype to clinical outcome by utilizing next generation sequencing and gene chip data of 6,697 breast cancer patients |
title_sort | genome-wide approach to link genotype to clinical outcome by utilizing next generation sequencing and gene chip data of 6,697 breast cancer patients |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4609150/ https://www.ncbi.nlm.nih.gov/pubmed/26474971 http://dx.doi.org/10.1186/s13073-015-0228-1 |
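
The METHODS summary in the description says that mutations are translated into a "transcriptomic fingerprint" by running ROC analysis on mutation and RNA-seq data. The following is a minimal sketch of that idea only, not the authors' implementation: the gene names, expression values, the `cutoff` of 0.8, and the function names `roc_auc` and `fingerprint` are all hypothetical, and the Cox regression step is not shown.

```python
from typing import Dict, List, Sequence


def roc_auc(mutant: Sequence[float], wild_type: Sequence[float]) -> float:
    """Area under the ROC curve for separating mutant from wild-type samples.

    Computed as the fraction of (mutant, wild-type) pairs in which the mutant
    sample has the higher expression value; ties count as one half.
    """
    if not mutant or not wild_type:
        raise ValueError("both groups need at least one sample")
    wins = 0.0
    for m in mutant:
        for w in wild_type:
            if m > w:
                wins += 1.0
            elif m == w:
                wins += 0.5
    return wins / (len(mutant) * len(wild_type))


def fingerprint(
    expression: Dict[str, List[float]],
    mutated: List[bool],
    cutoff: float = 0.8,  # hypothetical threshold, chosen for illustration
) -> Dict[str, List[str]]:
    """Split genes into up- and downregulated sets by their AUC."""
    up: List[str] = []
    down: List[str] = []
    for gene, values in expression.items():
        mut = [v for v, is_mut in zip(values, mutated) if is_mut]
        wt = [v for v, is_mut in zip(values, mutated) if not is_mut]
        auc = roc_auc(mut, wt)
        if auc >= cutoff:
            up.append(gene)      # higher expression in mutated samples
        elif auc <= 1.0 - cutoff:
            down.append(gene)    # lower expression in mutated samples
    return {"up": up, "down": down}


# Toy data: six samples, the first three carry the mutation of interest.
mutated = [True, True, True, False, False, False]
expression = {
    "GENE_A": [9.1, 8.7, 9.5, 5.2, 6.0, 5.8],
    "GENE_B": [2.0, 2.4, 1.9, 4.1, 3.8, 4.4],
    "GENE_C": [5.0, 6.1, 4.9, 5.5, 4.8, 6.0],
}
result = fingerprint(expression, mutated)
print(result)
```

In this toy input, GENE_A separates the groups perfectly in one direction and GENE_B in the other, while GENE_C does not pass the cutoff. In the pipeline described above, the up- and downregulated signatures would then each be tested against survival in the gene chip cohort.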