A graph theoretic approach to utilizing protein structure to identify non-random somatic mutations
BACKGROUND: It is well known that the development of cancer is caused by the accumulation of somatic mutations within the genome. For oncogenes specifically, current research suggests that there is a small set of "driver" mutations that are primarily responsible for tumorigenesis. Further,...
Main Authors: | Ryslik, Gregory A, Cheng, Yuwei, Cheung, Kei-Hoi, Modis, Yorgo, Zhao, Hongyu |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | Methodology Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4024121/ https://www.ncbi.nlm.nih.gov/pubmed/24669769 http://dx.doi.org/10.1186/1471-2105-15-86 |
_version_ | 1782316622296383488 |
---|---|
author | Ryslik, Gregory A Cheng, Yuwei Cheung, Kei-Hoi Modis, Yorgo Zhao, Hongyu |
author_facet | Ryslik, Gregory A Cheng, Yuwei Cheung, Kei-Hoi Modis, Yorgo Zhao, Hongyu |
author_sort | Ryslik, Gregory A |
collection | PubMed |
description | BACKGROUND: It is well known that the development of cancer is caused by the accumulation of somatic mutations within the genome. For oncogenes specifically, current research suggests that there is a small set of "driver" mutations that are primarily responsible for tumorigenesis. Further, due to recent pharmacological successes in treating these driver mutations and their resulting tumors, a variety of approaches have been developed to identify potential driver mutations using methods such as machine learning and mutational clustering. We propose a novel methodology that increases our power to identify mutational clusters by taking into account protein tertiary structure via a graph theoretical approach. RESULTS: We have designed and implemented GraphPAC (Graph Protein Amino acid Clustering) to identify mutational clustering while considering protein spatial structure. Using GraphPAC, we are able to detect novel clusters in proteins that are known to exhibit mutation clustering as well as identify clusters in proteins without evidence of prior clustering based on current methods. Specifically, by utilizing the spatial information available in the Protein Data Bank (PDB) along with the mutational data in the Catalogue of Somatic Mutations in Cancer (COSMIC), GraphPAC identifies new mutational clusters in well known oncogenes such as EGFR and KRAS. Further, by utilizing graph theory to account for the tertiary structure, GraphPAC discovers clusters in DPP4, NRP1 and other proteins not identified by existing methods. The R package is available at: http://bioconductor.org/packages/release/bioc/html/GraphPAC.html. CONCLUSION: GraphPAC provides an alternative to iPAC and an extension to current methodology when identifying potential activating driver mutations by utilizing a graph theoretic approach when considering protein tertiary structure. |
format | Online Article Text |
id | pubmed-4024121 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4024121 2014-05-28 A graph theoretic approach to utilizing protein structure to identify non-random somatic mutations Ryslik, Gregory A Cheng, Yuwei Cheung, Kei-Hoi Modis, Yorgo Zhao, Hongyu BMC Bioinformatics Methodology Article BACKGROUND: It is well known that the development of cancer is caused by the accumulation of somatic mutations within the genome. For oncogenes specifically, current research suggests that there is a small set of "driver" mutations that are primarily responsible for tumorigenesis. Further, due to recent pharmacological successes in treating these driver mutations and their resulting tumors, a variety of approaches have been developed to identify potential driver mutations using methods such as machine learning and mutational clustering. We propose a novel methodology that increases our power to identify mutational clusters by taking into account protein tertiary structure via a graph theoretical approach. RESULTS: We have designed and implemented GraphPAC (Graph Protein Amino acid Clustering) to identify mutational clustering while considering protein spatial structure. Using GraphPAC, we are able to detect novel clusters in proteins that are known to exhibit mutation clustering as well as identify clusters in proteins without evidence of prior clustering based on current methods. Specifically, by utilizing the spatial information available in the Protein Data Bank (PDB) along with the mutational data in the Catalogue of Somatic Mutations in Cancer (COSMIC), GraphPAC identifies new mutational clusters in well known oncogenes such as EGFR and KRAS. Further, by utilizing graph theory to account for the tertiary structure, GraphPAC discovers clusters in DPP4, NRP1 and other proteins not identified by existing methods. The R package is available at: http://bioconductor.org/packages/release/bioc/html/GraphPAC.html. CONCLUSION: GraphPAC provides an alternative to iPAC and an extension to current methodology when identifying potential activating driver mutations by utilizing a graph theoretic approach when considering protein tertiary structure. BioMed Central 2014-03-26 /pmc/articles/PMC4024121/ /pubmed/24669769 http://dx.doi.org/10.1186/1471-2105-15-86 Text en Copyright © 2014 Ryslik et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. |
spellingShingle | Methodology Article Ryslik, Gregory A Cheng, Yuwei Cheung, Kei-Hoi Modis, Yorgo Zhao, Hongyu A graph theoretic approach to utilizing protein structure to identify non-random somatic mutations |
title | A graph theoretic approach to utilizing protein structure to identify non-random somatic mutations |
title_full | A graph theoretic approach to utilizing protein structure to identify non-random somatic mutations |
title_fullStr | A graph theoretic approach to utilizing protein structure to identify non-random somatic mutations |
title_full_unstemmed | A graph theoretic approach to utilizing protein structure to identify non-random somatic mutations |
title_short | A graph theoretic approach to utilizing protein structure to identify non-random somatic mutations |
title_sort | graph theoretic approach to utilizing protein structure to identify non-random somatic mutations |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4024121/ https://www.ncbi.nlm.nih.gov/pubmed/24669769 http://dx.doi.org/10.1186/1471-2105-15-86 |
work_keys_str_mv | AT ryslikgregorya agraphtheoreticapproachtoutilizingproteinstructuretoidentifynonrandomsomaticmutations AT chengyuwei agraphtheoreticapproachtoutilizingproteinstructuretoidentifynonrandomsomaticmutations AT cheungkeihoi agraphtheoreticapproachtoutilizingproteinstructuretoidentifynonrandomsomaticmutations AT modisyorgo agraphtheoreticapproachtoutilizingproteinstructuretoidentifynonrandomsomaticmutations AT zhaohongyu agraphtheoreticapproachtoutilizingproteinstructuretoidentifynonrandomsomaticmutations AT ryslikgregorya graphtheoreticapproachtoutilizingproteinstructuretoidentifynonrandomsomaticmutations AT chengyuwei graphtheoreticapproachtoutilizingproteinstructuretoidentifynonrandomsomaticmutations AT cheungkeihoi graphtheoreticapproachtoutilizingproteinstructuretoidentifynonrandomsomaticmutations AT modisyorgo graphtheoreticapproachtoutilizingproteinstructuretoidentifynonrandomsomaticmutations AT zhaohongyu graphtheoreticapproachtoutilizingproteinstructuretoidentifynonrandomsomaticmutations |