
Application of Gap-Constraints Given Sequential Frequent Pattern Mining for Protein Function Prediction

OBJECTIVES: Predicting protein function from a protein–protein interaction network is challenging because of the complexity and huge scale of the interaction process and its inconsistent patterns. Previously proposed methods such as neighbor counting, network analysis, and graph pattern mining h...


Bibliographic Details
Main Authors: Park, Hyeon Ah; Kim, Taewook; Li, Meijing; Shon, Ho Sun; Park, Jeong Seok; Ryu, Keun Ho
Format: Online Article Text
Language: English
Published: 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4411351/
https://www.ncbi.nlm.nih.gov/pubmed/25938021
http://dx.doi.org/10.1016/j.phrp.2015.01.006
author Park, Hyeon Ah
Kim, Taewook
Li, Meijing
Shon, Ho Sun
Park, Jeong Seok
Ryu, Keun Ho
author_sort Park, Hyeon Ah
collection PubMed
description OBJECTIVES: Predicting protein function from a protein–protein interaction network is challenging because of the complexity and huge scale of the interaction process and its inconsistent patterns. Previously proposed methods such as neighbor counting, network analysis, and graph pattern mining have predicted functions by calculating the rules and probabilities of patterns inside the network. Although these methods have shown good prediction performance, it remains difficult to find functions that are exceptions to simple rules and patterns, because these methods do not consider the inconsistent aspects of the interaction network. METHODS: In this article, we propose a novel approach using sequential pattern mining with gap constraints. To overcome the inconsistency problem, we propose that frequent functional patterns include every possible functional sequence, including patterns whose search is limited by the connection structure or the level of the neighborhood layer. We also constructed a tree-graph containing the most crucial interaction information of the target protein, and generated candidate sets for assignment by sequential pattern mining that allows gaps. RESULTS: The parameters of pattern length, maximum gap, and minimum support were varied to find the setting that gives the most accurate prediction. The highest accuracy was 0.972, better than the simple neighbor counting approach and the link-based approach. CONCLUSION: Comparison with other approaches confirmed that the proposed approach can reach function candidates that previous methods could not obtain.
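The abstract names three parameters (pattern length, maximum gap, minimum support) without giving the algorithm. The following is a minimal sketch of generic gap-constrained sequential pattern mining under those parameters, not the authors' implementation. The function names, the level-wise prefix-growth strategy, and the toy function labels (`F1` to `F4`) are all assumptions made for illustration; how the paper derives sequences from its tree-graph is not stated in this record.

```python
def occurs_with_gaps(pattern, sequence, max_gap):
    """True if `pattern` is a subsequence of `sequence` in which at most
    `max_gap` items are skipped between consecutive matched items."""
    if not pattern:
        return True

    def match_from(p_idx, s_idx):
        # p_idx: next pattern item to match; s_idx: position of the last match.
        if p_idx == len(pattern):
            return True
        upper = min(len(sequence), s_idx + max_gap + 2)
        for j in range(s_idx + 1, upper):
            if sequence[j] == pattern[p_idx] and match_from(p_idx + 1, j):
                return True
        return False

    return any(
        sequence[i] == pattern[0] and match_from(1, i)
        for i in range(len(sequence))
    )


def mine_gap_patterns(sequences, max_length, max_gap, min_support):
    """Level-wise mining: extend each frequent prefix by one item and keep
    the extension if its relative support reaches `min_support`.
    Pruning by prefix is safe because any gap-constrained occurrence of a
    pattern also contains an occurrence of each of its prefixes."""
    alphabet = sorted({item for seq in sequences for item in seq})
    n = len(sequences)
    frequent = {}
    frontier = [()]
    for _ in range(max_length):
        next_frontier = []
        for prefix in frontier:
            for item in alphabet:
                candidate = prefix + (item,)
                hits = sum(occurs_with_gaps(candidate, s, max_gap) for s in sequences)
                support = hits / n
                if support >= min_support:
                    frequent[candidate] = support
                    next_frontier.append(candidate)
        frontier = next_frontier
    return frequent


# Hypothetical function-label sequences (illustrative data only).
sequences = [
    ["F1", "F2", "F3"],
    ["F1", "F4", "F3"],
    ["F1", "F2", "F4", "F3"],
    ["F2", "F3"],
]
patterns = mine_gap_patterns(sequences, max_length=2, max_gap=1, min_support=0.75)
```

On this toy data, raising `max_gap` from 1 to 2 makes `("F1", "F3")` frequent, because the third sequence then counts as a match. This shows how allowing gaps can surface candidates that a stricter adjacency rule would miss.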
format Online
Article
Text
id pubmed-4411351
institution National Center for Biotechnology Information
language English
publishDate 2015
record_format MEDLINE/PubMed
spelling pubmed-4411351 2015-05-01 Osong Public Health Res Perspect Original Article 2015-02-24 2015-04 /pmc/articles/PMC4411351/ /pubmed/25938021 http://dx.doi.org/10.1016/j.phrp.2015.01.006 Text en © 2015 Published by Elsevier B.V. on behalf of Korea Centers for Disease Control and Prevention. This is an Open Access article distributed under the terms of the CC-BY-NC License (http://creativecommons.org/licenses/by-nc/3.0).
title Application of Gap-Constraints Given Sequential Frequent Pattern Mining for Protein Function Prediction
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4411351/
https://www.ncbi.nlm.nih.gov/pubmed/25938021
http://dx.doi.org/10.1016/j.phrp.2015.01.006