Comparison of different functional prediction scores using a gene-based permutation model for identifying cancer driver genes

BACKGROUND: Identifying cancer driver genes (CDGs) is a crucial step in cancer genomics toward the advancement of precision medicine. However, driver gene discovery is very challenging because we are not only dealing with a huge amount of data but are also faced with the complexity of the dis...

Bibliographic Details
Main authors: Nono, Alice Djotsa; Chen, Ken; Liu, Xiaoming
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6357357/
https://www.ncbi.nlm.nih.gov/pubmed/30704472
http://dx.doi.org/10.1186/s12920-018-0452-9
author Nono, Alice Djotsa
Chen, Ken
Liu, Xiaoming
collection PubMed
description BACKGROUND: Identifying cancer driver genes (CDGs) is a crucial step in cancer genomics toward the advancement of precision medicine. However, driver gene discovery is very challenging because we are not only dealing with a huge amount of data but are also faced with the complexity of the disease, including the heterogeneity of the background somatic mutation rate in each cancer patient. It is generally accepted that CDGs harbor positively selected variants that confer a growth advantage on the malignant cell and are critical to cancer development, whereas non-driver genes harbor random mutations with no functional consequence for cancer. Based on this premise, function-prediction-based approaches for identifying CDGs have been proposed to interrogate the distribution of functional predictions among mutations in cancer genomes (eLS 1–16, 2016). Assuming that most of the observed mutations are passenger mutations, and given quantitative predictions of the functional impact of the mutations, genes enriched for functional or deleterious mutations are more likely to be drivers. These promising methods have been continually refined and can therefore be applied to increase accuracy in detecting new candidate CDGs. However, current function-prediction-based approaches focus only on coding mutations and lack a systematic way to choose the best mutation deleteriousness prediction algorithm.
RESULTS: In this study, we propose a new function-prediction-based approach to discover CDGs using a gene-based permutation model. Our method not only covers both coding and non-coding regions of the genes but also accounts for the heterogeneous mutational context in a cohort of cancer patients. The permutation model was implemented independently with seven popular deleteriousness prediction scores covering splicing regions (SPIDEX), coding regions (MetaLR and VEST3), and the pan-genome (CADD, DANN, Fathmm-MKL coding, and Fathmm-MKL noncoding). We applied this new approach to somatic single nucleotide variants (SNVs) from whole-genome sequences of 119 breast and 24 lung cancer patients and compared the performance of the seven deleteriousness prediction scores.
CONCLUSION: The new function-prediction-based approach predicted not only known cancer genes listed in the Cancer Gene Census (CGC) but also new candidate CDGs that are worth further investigation. The results showed the advantage of using pan-genome deleteriousness prediction scores in function-prediction-based methods. Although VEST3, a deleteriousness prediction score for missense mutations, performed best in breast cancer, it was outperformed in lung cancer by CADD and Fathmm-MKL coding, two pan-genome deleteriousness prediction scores.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12920-018-0452-9) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6357357
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6357357 2019-02-07 Comparison of different functional prediction scores using a gene-based permutation model for identifying cancer driver genes Nono, Alice Djotsa Chen, Ken Liu, Xiaoming BMC Med Genomics Research BioMed Central 2019-01-31 /pmc/articles/PMC6357357/ /pubmed/30704472 http://dx.doi.org/10.1186/s12920-018-0452-9 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Comparison of different functional prediction scores using a gene-based permutation model for identifying cancer driver genes
topic Research