A graph theoretic approach to utilizing protein structure to identify non-random somatic mutations

BACKGROUND: It is well known that the development of cancer is caused by the accumulation of somatic mutations within the genome. For oncogenes specifically, current research suggests that there is a small set of "driver" mutations that are primarily responsible for tumorigenesis. Further,...

Bibliographic Details
Main authors: Ryslik, Gregory A, Cheng, Yuwei, Cheung, Kei-Hoi, Modis, Yorgo, Zhao, Hongyu
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4024121/
https://www.ncbi.nlm.nih.gov/pubmed/24669769
http://dx.doi.org/10.1186/1471-2105-15-86
author Ryslik, Gregory A
Cheng, Yuwei
Cheung, Kei-Hoi
Modis, Yorgo
Zhao, Hongyu
author_sort Ryslik, Gregory A
collection PubMed
description BACKGROUND: It is well known that the development of cancer is caused by the accumulation of somatic mutations within the genome. For oncogenes specifically, current research suggests that there is a small set of "driver" mutations that are primarily responsible for tumorigenesis. Further, due to recent pharmacological successes in treating these driver mutations and their resulting tumors, a variety of approaches have been developed to identify potential driver mutations using methods such as machine learning and mutational clustering. We propose a novel methodology that increases our power to identify mutational clusters by taking into account protein tertiary structure via a graph theoretical approach. RESULTS: We have designed and implemented GraphPAC (Graph Protein Amino acid Clustering) to identify mutational clustering while considering protein spatial structure. Using GraphPAC, we are able to detect novel clusters in proteins that are known to exhibit mutation clustering as well as identify clusters in proteins without evidence of prior clustering based on current methods. Specifically, by utilizing the spatial information available in the Protein Data Bank (PDB) along with the mutational data in the Catalogue of Somatic Mutations in Cancer (COSMIC), GraphPAC identifies new mutational clusters in well known oncogenes such as EGFR and KRAS. Further, by utilizing graph theory to account for the tertiary structure, GraphPAC discovers clusters in DPP4, NRP1 and other proteins not identified by existing methods. The R package is available at: http://bioconductor.org/packages/release/bioc/html/GraphPAC.html. CONCLUSION: GraphPAC provides an alternative to iPAC and an extension to current methodology when identifying potential activating driver mutations by utilizing a graph theoretic approach when considering protein tertiary structure.
format Online
Article
Text
id pubmed-4024121
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4024121 2014-05-28 BMC Bioinformatics Methodology Article BioMed Central 2014-03-26 /pmc/articles/PMC4024121/ /pubmed/24669769 http://dx.doi.org/10.1186/1471-2105-15-86 Text en Copyright © 2014 Ryslik et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.
title A graph theoretic approach to utilizing protein structure to identify non-random somatic mutations
topic Methodology Article