Cargando…

SeqNLS: Nuclear Localization Signal Prediction Based on Frequent Pattern Mining and Linear Motif Scoring

Nuclear localization signals (NLSs) are stretches of residues in proteins mediating their importing into the nucleus. NLSs are known to have diverse patterns, of which only a limited number are covered by currently known NLS motifs. Here we propose a sequential pattern mining algorithm SeqNLS to eff...

Descripción completa

Detalles Bibliográficos
Autores principales: Lin, Jhih-rong, Hu, Jianjun
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Public Library of Science 2013
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3812174/
https://www.ncbi.nlm.nih.gov/pubmed/24204689
http://dx.doi.org/10.1371/journal.pone.0076864
_version_ 1782288946798002176
author Lin, Jhih-rong
Hu, Jianjun
author_facet Lin, Jhih-rong
Hu, Jianjun
author_sort Lin, Jhih-rong
collection PubMed
description Nuclear localization signals (NLSs) are stretches of residues in proteins mediating their importing into the nucleus. NLSs are known to have diverse patterns, of which only a limited number are covered by currently known NLS motifs. Here we propose a sequential pattern mining algorithm SeqNLS to effectively identify potential NLS patterns without being constrained by the limitation of current knowledge of NLSs. The extracted frequent sequential patterns are used to predict NLS candidates which are then filtered by a linear motif-scoring scheme based on predicted sequence disorder and by the relatively local conservation (IRLC) based masking. The experiment results on the newly curated Yeast and Hybrid datasets show that SeqNLS is effective in detecting potential NLSs. The performance comparison between SeqNLS with and without the linear motif scoring shows that linear motif features are highly complementary to sequence features in discerning NLSs. For the two independent datasets, our SeqNLS not only can consistently find over 50% of NLSs with prediction precision of at least 0.7, but also outperforms other state-of-the-art NLS prediction methods in terms of F1 score or prediction precision with similar or higher recall rates. The web server of the SeqNLS algorithm is available at http://mleg.cse.sc.edu/seqNLS.
format Online
Article
Text
id pubmed-3812174
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-38121742013-11-07 SeqNLS: Nuclear Localization Signal Prediction Based on Frequent Pattern Mining and Linear Motif Scoring Lin, Jhih-rong Hu, Jianjun PLoS One Research Article Nuclear localization signals (NLSs) are stretches of residues in proteins mediating their importing into the nucleus. NLSs are known to have diverse patterns, of which only a limited number are covered by currently known NLS motifs. Here we propose a sequential pattern mining algorithm SeqNLS to effectively identify potential NLS patterns without being constrained by the limitation of current knowledge of NLSs. The extracted frequent sequential patterns are used to predict NLS candidates which are then filtered by a linear motif-scoring scheme based on predicted sequence disorder and by the relatively local conservation (IRLC) based masking. The experiment results on the newly curated Yeast and Hybrid datasets show that SeqNLS is effective in detecting potential NLSs. The performance comparison between SeqNLS with and without the linear motif scoring shows that linear motif features are highly complementary to sequence features in discerning NLSs. For the two independent datasets, our SeqNLS not only can consistently find over 50% of NLSs with prediction precision of at least 0.7, but also outperforms other state-of-the-art NLS prediction methods in terms of F1 score or prediction precision with similar or higher recall rates. The web server of the SeqNLS algorithm is available at http://mleg.cse.sc.edu/seqNLS. Public Library of Science 2013-10-29 /pmc/articles/PMC3812174/ /pubmed/24204689 http://dx.doi.org/10.1371/journal.pone.0076864 Text en © 2013 Lin, Hu http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Lin, Jhih-rong
Hu, Jianjun
SeqNLS: Nuclear Localization Signal Prediction Based on Frequent Pattern Mining and Linear Motif Scoring
title SeqNLS: Nuclear Localization Signal Prediction Based on Frequent Pattern Mining and Linear Motif Scoring
title_full SeqNLS: Nuclear Localization Signal Prediction Based on Frequent Pattern Mining and Linear Motif Scoring
title_fullStr SeqNLS: Nuclear Localization Signal Prediction Based on Frequent Pattern Mining and Linear Motif Scoring
title_full_unstemmed SeqNLS: Nuclear Localization Signal Prediction Based on Frequent Pattern Mining and Linear Motif Scoring
title_short SeqNLS: Nuclear Localization Signal Prediction Based on Frequent Pattern Mining and Linear Motif Scoring
title_sort seqnls: nuclear localization signal prediction based on frequent pattern mining and linear motif scoring
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3812174/
https://www.ncbi.nlm.nih.gov/pubmed/24204689
http://dx.doi.org/10.1371/journal.pone.0076864
work_keys_str_mv AT linjhihrong seqnlsnuclearlocalizationsignalpredictionbasedonfrequentpatternminingandlinearmotifscoring
AT hujianjun seqnlsnuclearlocalizationsignalpredictionbasedonfrequentpatternminingandlinearmotifscoring