Identification and Analysis of Driver Missense Mutations Using Rotation Forest with Feature Selection

Identifying cancer-associated mutations (driver mutations) is critical for understanding the cellular functions of the cancer genome that lead to activation of oncogenes or inactivation of tumor suppressor genes. Many proposed approaches use supervised machine learning techniques for predictio...

Bibliographic Details
Main Authors: Du, Xiuquan; Cheng, Jiaxing
Format: Online Article Text
Language: English
Published: Hindawi Publishing Corporation 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4163459/
https://www.ncbi.nlm.nih.gov/pubmed/25250338
http://dx.doi.org/10.1155/2014/905951
_version_ 1782334827604738048
author Du, Xiuquan
Cheng, Jiaxing
author_facet Du, Xiuquan
Cheng, Jiaxing
author_sort Du, Xiuquan
collection PubMed
description Identifying cancer-associated mutations (driver mutations) is critical for understanding the cellular functions of the cancer genome that lead to activation of oncogenes or inactivation of tumor suppressor genes. Many proposed approaches use supervised machine learning techniques for prediction, with features obtained from various databases. However, it is often unknown which features are important for predicting driver mutations. In this study, we propose a novel feature selection method (called DX) applied to a set of 126 candidate features. To obtain the best performance, the rotation forest algorithm was adopted for the experiments. On the training dataset, which was collected from the COSMIC and Swiss-Prot databases, we obtain high prediction performance with 88.03% accuracy, 93.9% precision, and 81.35% recall when the 11 top-ranked features are used. Comparison with various other techniques on the TP53, EGFR, and Cosmic2plus datasets shows the generality of our method.
format Online
Article
Text
id pubmed-4163459
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-4163459 2014-09-23 Identification and Analysis of Driver Missense Mutations Using Rotation Forest with Feature Selection Du, Xiuquan Cheng, Jiaxing Biomed Res Int Research Article Identifying cancer-associated mutations (driver mutations) is critical for understanding the cellular functions of the cancer genome that lead to activation of oncogenes or inactivation of tumor suppressor genes. Many proposed approaches use supervised machine learning techniques for prediction, with features obtained from various databases. However, it is often unknown which features are important for predicting driver mutations. In this study, we propose a novel feature selection method (called DX) applied to a set of 126 candidate features. To obtain the best performance, the rotation forest algorithm was adopted for the experiments. On the training dataset, which was collected from the COSMIC and Swiss-Prot databases, we obtain high prediction performance with 88.03% accuracy, 93.9% precision, and 81.35% recall when the 11 top-ranked features are used. Comparison with various other techniques on the TP53, EGFR, and Cosmic2plus datasets shows the generality of our method. Hindawi Publishing Corporation 2014 2014-08-27 /pmc/articles/PMC4163459/ /pubmed/25250338 http://dx.doi.org/10.1155/2014/905951 Text en Copyright © 2014 X. Du and J. Cheng. https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Du, Xiuquan
Cheng, Jiaxing
Identification and Analysis of Driver Missense Mutations Using Rotation Forest with Feature Selection
title Identification and Analysis of Driver Missense Mutations Using Rotation Forest with Feature Selection
title_full Identification and Analysis of Driver Missense Mutations Using Rotation Forest with Feature Selection
title_fullStr Identification and Analysis of Driver Missense Mutations Using Rotation Forest with Feature Selection
title_full_unstemmed Identification and Analysis of Driver Missense Mutations Using Rotation Forest with Feature Selection
title_short Identification and Analysis of Driver Missense Mutations Using Rotation Forest with Feature Selection
title_sort identification and analysis of driver missense mutations using rotation forest with feature selection
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4163459/
https://www.ncbi.nlm.nih.gov/pubmed/25250338
http://dx.doi.org/10.1155/2014/905951
work_keys_str_mv AT duxiuquan identificationandanalysisofdrivermissensemutationsusingrotationforestwithfeatureselection
AT chengjiaxing identificationandanalysisofdrivermissensemutationsusingrotationforestwithfeatureselection