Cargando…
ContrastRank: a new method for ranking putative cancer driver genes and classification of tumor samples
Motivation: The recent advance in high-throughput sequencing technologies is generating a huge amount of data that are becoming an important resource for deciphering the genotype underlying a given phenotype. Genome sequencing has been extensively applied to the study of the cancer genomes. Although...
Autores principales: | , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Oxford University Press
2014
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4147919/ https://www.ncbi.nlm.nih.gov/pubmed/25161249 http://dx.doi.org/10.1093/bioinformatics/btu466 |
_version_ | 1782332537780043776 |
---|---|
author | Tian, Rui Basu, Malay K. Capriotti, Emidio |
author_facet | Tian, Rui Basu, Malay K. Capriotti, Emidio |
author_sort | Tian, Rui |
collection | PubMed |
description | Motivation: The recent advance in high-throughput sequencing technologies is generating a huge amount of data that are becoming an important resource for deciphering the genotype underlying a given phenotype. Genome sequencing has been extensively applied to the study of the cancer genomes. Although a few methods have been already proposed for the detection of cancer-related genes, their automatic identification is still a challenging task. Using the genomic data made available by The Cancer Genome Atlas Consortium (TCGA), we propose a new prioritization approach based on the analysis of the distribution of putative deleterious variants in a large cohort of cancer samples. Results: In this paper, we present ContastRank, a new method for the prioritization of putative impaired genes in cancer. The method is based on the comparison of the putative defective rate of each gene in tumor versus normal and 1000 genome samples. We show that the method is able to provide a ranked list of putative impaired genes for colon, lung and prostate adenocarcinomas. The list significantly overlaps with the list of known cancer driver genes previously published. More importantly, by using our scoring approach, we can successfully discriminate between TCGA normal and tumor samples. A binary classifier based on ContrastRank score reaches an overall accuracy >90% and the area under the curve (AUC) of receiver operating characteristics (ROC) >0.95 for all the three types of adenocarcinoma analyzed in this paper. In addition, using ContrastRank score, we are able to discriminate the three tumor types with a minimum overall accuracy of 77% and AUC of 0.83. Conclusions: We describe ContrastRank, a method for prioritizing putative impaired genes in cancer. The method is based on the comparison of exome sequencing data from different cohorts and can detect putative cancer driver genes. ContrastRank can also be used to estimate a global score for an individual genome about the risk of adenocarcinoma based on the genetic variants information from a whole-exome VCF (Variant Calling Format) file. We believe that the application of ContrastRank can be an important step in genomic medicine to enable genome-based diagnosis. Availability and implementation: The lists of ContrastRank scores of all genes in each tumor type are available as supplementary materials. A webserver for evaluating the risk of the three studied adenocarcinomas starting from whole-exome VCF file is under development. Contact: emidio@uab.edu Supplementary information: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-4147919 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-41479192014-09-02 ContrastRank: a new method for ranking putative cancer driver genes and classification of tumor samples Tian, Rui Basu, Malay K. Capriotti, Emidio Bioinformatics Eccb 2014 Proceedings Papers Committee Motivation: The recent advance in high-throughput sequencing technologies is generating a huge amount of data that are becoming an important resource for deciphering the genotype underlying a given phenotype. Genome sequencing has been extensively applied to the study of the cancer genomes. Although a few methods have been already proposed for the detection of cancer-related genes, their automatic identification is still a challenging task. Using the genomic data made available by The Cancer Genome Atlas Consortium (TCGA), we propose a new prioritization approach based on the analysis of the distribution of putative deleterious variants in a large cohort of cancer samples. Results: In this paper, we present ContastRank, a new method for the prioritization of putative impaired genes in cancer. The method is based on the comparison of the putative defective rate of each gene in tumor versus normal and 1000 genome samples. We show that the method is able to provide a ranked list of putative impaired genes for colon, lung and prostate adenocarcinomas. The list significantly overlaps with the list of known cancer driver genes previously published. More importantly, by using our scoring approach, we can successfully discriminate between TCGA normal and tumor samples. A binary classifier based on ContrastRank score reaches an overall accuracy >90% and the area under the curve (AUC) of receiver operating characteristics (ROC) >0.95 for all the three types of adenocarcinoma analyzed in this paper. In addition, using ContrastRank score, we are able to discriminate the three tumor types with a minimum overall accuracy of 77% and AUC of 0.83. Conclusions: We describe ContrastRank, a method for prioritizing putative impaired genes in cancer. The method is based on the comparison of exome sequencing data from different cohorts and can detect putative cancer driver genes. ContrastRank can also be used to estimate a global score for an individual genome about the risk of adenocarcinoma based on the genetic variants information from a whole-exome VCF (Variant Calling Format) file. We believe that the application of ContrastRank can be an important step in genomic medicine to enable genome-based diagnosis. Availability and implementation: The lists of ContrastRank scores of all genes in each tumor type are available as supplementary materials. A webserver for evaluating the risk of the three studied adenocarcinomas starting from whole-exome VCF file is under development. Contact: emidio@uab.edu Supplementary information: Supplementary data are available at Bioinformatics online. Oxford University Press 2014-09-01 2014-08-22 /pmc/articles/PMC4147919/ /pubmed/25161249 http://dx.doi.org/10.1093/bioinformatics/btu466 Text en © The Author 2014. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Eccb 2014 Proceedings Papers Committee Tian, Rui Basu, Malay K. Capriotti, Emidio ContrastRank: a new method for ranking putative cancer driver genes and classification of tumor samples |
title | ContrastRank: a new method for ranking putative cancer driver genes and classification of tumor samples |
title_full | ContrastRank: a new method for ranking putative cancer driver genes and classification of tumor samples |
title_fullStr | ContrastRank: a new method for ranking putative cancer driver genes and classification of tumor samples |
title_full_unstemmed | ContrastRank: a new method for ranking putative cancer driver genes and classification of tumor samples |
title_short | ContrastRank: a new method for ranking putative cancer driver genes and classification of tumor samples |
title_sort | contrastrank: a new method for ranking putative cancer driver genes and classification of tumor samples |
topic | Eccb 2014 Proceedings Papers Committee |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4147919/ https://www.ncbi.nlm.nih.gov/pubmed/25161249 http://dx.doi.org/10.1093/bioinformatics/btu466 |
work_keys_str_mv | AT tianrui contrastrankanewmethodforrankingputativecancerdrivergenesandclassificationoftumorsamples AT basumalayk contrastrankanewmethodforrankingputativecancerdrivergenesandclassificationoftumorsamples AT capriottiemidio contrastrankanewmethodforrankingputativecancerdrivergenesandclassificationoftumorsamples |