Cargando…
A Novel Efficient Graph Model for the Multiple Longest Common Subsequences (MLCS) Problem
Searching for the Multiple Longest Common Subsequences (MLCS) of multiple sequences is a classical NP-hard problem, which has been used in many applications. One of the most effective exact approaches for the MLCS problem is based on dominant point graph, which is a kind of directed acyclic graph (D...
Autores principales: | , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Frontiers Media S.A.
2017
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5552671/ https://www.ncbi.nlm.nih.gov/pubmed/28848600 http://dx.doi.org/10.3389/fgene.2017.00104 |
_version_ | 1783256490648272896 |
---|---|
author | Peng, Zhan Wang, Yuping |
author_facet | Peng, Zhan Wang, Yuping |
author_sort | Peng, Zhan |
collection | PubMed |
description | Searching for the Multiple Longest Common Subsequences (MLCS) of multiple sequences is a classical NP-hard problem, which has been used in many applications. One of the most effective exact approaches for the MLCS problem is based on dominant point graph, which is a kind of directed acyclic graph (DAG). However, the time and space efficiency of the leading dominant point graph based approaches is still unsatisfactory: constructing the dominated point graph used by these approaches requires a huge amount of time and space, which hinders the applications of these approaches to large-scale and long sequences. To address this issue, in this paper, we propose a new time and space efficient graph model called the Leveled-DAG for the MLCS problem. The Leveled-DAG can timely eliminate all the nodes in the graph that cannot contribute to the construction of MLCS during constructing. At any moment, only the current level and some previously generated nodes in the graph need to be kept in memory, which can greatly reduce the memory consumption. Also, the final graph contains only one node in which all of the wanted MLCS are saved, thus, no additional operations for searching the MLCS are needed. The experiments are conducted on real biological sequences with different numbers and lengths respectively, and the proposed algorithm is compared with three state-of-the-art algorithms. The experimental results show that the time and space needed for the Leveled-DAG approach are smaller than those for the compared algorithms especially on large-scale and long sequences. |
format | Online Article Text |
id | pubmed-5552671 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-55526712017-08-28 A Novel Efficient Graph Model for the Multiple Longest Common Subsequences (MLCS) Problem Peng, Zhan Wang, Yuping Front Genet Genetics Searching for the Multiple Longest Common Subsequences (MLCS) of multiple sequences is a classical NP-hard problem, which has been used in many applications. One of the most effective exact approaches for the MLCS problem is based on dominant point graph, which is a kind of directed acyclic graph (DAG). However, the time and space efficiency of the leading dominant point graph based approaches is still unsatisfactory: constructing the dominated point graph used by these approaches requires a huge amount of time and space, which hinders the applications of these approaches to large-scale and long sequences. To address this issue, in this paper, we propose a new time and space efficient graph model called the Leveled-DAG for the MLCS problem. The Leveled-DAG can timely eliminate all the nodes in the graph that cannot contribute to the construction of MLCS during constructing. At any moment, only the current level and some previously generated nodes in the graph need to be kept in memory, which can greatly reduce the memory consumption. Also, the final graph contains only one node in which all of the wanted MLCS are saved, thus, no additional operations for searching the MLCS are needed. The experiments are conducted on real biological sequences with different numbers and lengths respectively, and the proposed algorithm is compared with three state-of-the-art algorithms. The experimental results show that the time and space needed for the Leveled-DAG approach are smaller than those for the compared algorithms especially on large-scale and long sequences. Frontiers Media S.A. 2017-08-09 /pmc/articles/PMC5552671/ /pubmed/28848600 http://dx.doi.org/10.3389/fgene.2017.00104 Text en Copyright © 2017 Peng and Wang. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Peng, Zhan Wang, Yuping A Novel Efficient Graph Model for the Multiple Longest Common Subsequences (MLCS) Problem |
title | A Novel Efficient Graph Model for the Multiple Longest Common Subsequences (MLCS) Problem |
title_full | A Novel Efficient Graph Model for the Multiple Longest Common Subsequences (MLCS) Problem |
title_fullStr | A Novel Efficient Graph Model for the Multiple Longest Common Subsequences (MLCS) Problem |
title_full_unstemmed | A Novel Efficient Graph Model for the Multiple Longest Common Subsequences (MLCS) Problem |
title_short | A Novel Efficient Graph Model for the Multiple Longest Common Subsequences (MLCS) Problem |
title_sort | novel efficient graph model for the multiple longest common subsequences (mlcs) problem |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5552671/ https://www.ncbi.nlm.nih.gov/pubmed/28848600 http://dx.doi.org/10.3389/fgene.2017.00104 |
work_keys_str_mv | AT pengzhan anovelefficientgraphmodelforthemultiplelongestcommonsubsequencesmlcsproblem AT wangyuping anovelefficientgraphmodelforthemultiplelongestcommonsubsequencesmlcsproblem AT pengzhan novelefficientgraphmodelforthemultiplelongestcommonsubsequencesmlcsproblem AT wangyuping novelefficientgraphmodelforthemultiplelongestcommonsubsequencesmlcsproblem |