PVTree: A Sequential Pattern Mining Method for Alignment Independent Phylogeny Reconstruction
A phylogenetic tree is essential for understanding evolution and is usually constructed through multiple sequence alignment, which carries a heavy computational burden and requires sophisticated parameter tuning. Recently, alignment-free methods based on k-mer profiles or common substrings provide...
Main Authors: | Kang, Yongyong; Yang, Xiaofei; Lin, Jiadong; Ye, Kai |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2019 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6410268/ https://www.ncbi.nlm.nih.gov/pubmed/30678245 http://dx.doi.org/10.3390/genes10020073 |
_version_ | 1783402207611191296 |
---|---|
author | Kang, Yongyong Yang, Xiaofei Lin, Jiadong Ye, Kai |
author_facet | Kang, Yongyong Yang, Xiaofei Lin, Jiadong Ye, Kai |
author_sort | Kang, Yongyong |
collection | PubMed |
description | A phylogenetic tree is essential for understanding evolution and is usually constructed through multiple sequence alignment, which carries a heavy computational burden and requires sophisticated parameter tuning. Recently, alignment-free methods based on k-mer profiles or common substrings have provided alternative ways to construct phylogenetic trees. However, most of these methods ignore the global similarities between sequences or some specific valuable features, e.g., frequent patterns across the whole dataset. To improve on them, we propose an alignment-free algorithm based on sequential pattern mining, in which each sequence is converted into a binary representation of the sequential patterns found among the sequences. The phylogenetic tree is then constructed by clustering a distance matrix calculated from the pattern vectors. To increase accuracy for highly divergent sequences, we consider pattern weights and filter out redundant sub-patterns. Results on both simulated and real data demonstrate that our method outperforms other alignment-free methods, especially for large sequence sets with low similarity. |
format | Online Article Text |
id | pubmed-6410268 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-6410268 2019-03-26 PVTree: A Sequential Pattern Mining Method for Alignment Independent Phylogeny Reconstruction Kang, Yongyong Yang, Xiaofei Lin, Jiadong Ye, Kai Genes (Basel) Article A phylogenetic tree is essential for understanding evolution and is usually constructed through multiple sequence alignment, which carries a heavy computational burden and requires sophisticated parameter tuning. Recently, alignment-free methods based on k-mer profiles or common substrings have provided alternative ways to construct phylogenetic trees. However, most of these methods ignore the global similarities between sequences or some specific valuable features, e.g., frequent patterns across the whole dataset. To improve on them, we propose an alignment-free algorithm based on sequential pattern mining, in which each sequence is converted into a binary representation of the sequential patterns found among the sequences. The phylogenetic tree is then constructed by clustering a distance matrix calculated from the pattern vectors. To increase accuracy for highly divergent sequences, we consider pattern weights and filter out redundant sub-patterns. Results on both simulated and real data demonstrate that our method outperforms other alignment-free methods, especially for large sequence sets with low similarity. MDPI 2019-01-22 /pmc/articles/PMC6410268/ /pubmed/30678245 http://dx.doi.org/10.3390/genes10020073 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Kang, Yongyong Yang, Xiaofei Lin, Jiadong Ye, Kai PVTree: A Sequential Pattern Mining Method for Alignment Independent Phylogeny Reconstruction |
title | PVTree: A Sequential Pattern Mining Method for Alignment Independent Phylogeny Reconstruction |
title_full | PVTree: A Sequential Pattern Mining Method for Alignment Independent Phylogeny Reconstruction |
title_fullStr | PVTree: A Sequential Pattern Mining Method for Alignment Independent Phylogeny Reconstruction |
title_full_unstemmed | PVTree: A Sequential Pattern Mining Method for Alignment Independent Phylogeny Reconstruction |
title_short | PVTree: A Sequential Pattern Mining Method for Alignment Independent Phylogeny Reconstruction |
title_sort | pvtree: a sequential pattern mining method for alignment independent phylogeny reconstruction |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6410268/ https://www.ncbi.nlm.nih.gov/pubmed/30678245 http://dx.doi.org/10.3390/genes10020073 |
work_keys_str_mv | AT kangyongyong pvtreeasequentialpatternminingmethodforalignmentindependentphylogenyreconstruction AT yangxiaofei pvtreeasequentialpatternminingmethodforalignmentindependentphylogenyreconstruction AT linjiadong pvtreeasequentialpatternminingmethodforalignmentindependentphylogenyreconstruction AT yekai pvtreeasequentialpatternminingmethodforalignmentindependentphylogenyreconstruction |
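
The pipeline summarized in the description field (binary pattern vectors, a distance matrix, then clustering) can be illustrated with a small Python sketch. This is not the authors' PVTree implementation. It rests on several assumptions made only for illustration: patterns are contiguous substrings of a fixed length `k` instead of true sequential patterns with gaps, the distance is Jaccard distance, the clustering is naive average linkage (UPGMA-style), and the pattern weighting and redundant sub-pattern filtering mentioned in the abstract are omitted. All function names and the toy sequences are hypothetical.

```python
from itertools import combinations


def frequent_patterns(seqs, k=3, min_support=2):
    """Return the sorted length-k substrings present in at least min_support sequences.

    Assumption: contiguous substrings stand in for mined sequential patterns.
    """
    support = {}
    for s in seqs:
        # Count each pattern at most once per sequence.
        for p in {s[i:i + k] for i in range(len(s) - k + 1)}:
            support[p] = support.get(p, 0) + 1
    return sorted(p for p, n in support.items() if n >= min_support)


def pattern_vector(seq, patterns):
    """Binary vector: 1 if the pattern occurs in seq, else 0."""
    return [1 if p in seq else 0 for p in patterns]


def jaccard_distance(u, v):
    """1 - |shared patterns| / |patterns present in either vector|."""
    both = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    return 0.0 if either == 0 else 1.0 - both / either


def distance_matrix(vectors):
    """Symmetric matrix of pairwise Jaccard distances."""
    n = len(vectors)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = jaccard_distance(vectors[i], vectors[j])
    return d


def average_linkage_tree(names, d):
    """Naive average-linkage clustering; returns the topology as nested tuples."""
    clusters = {i: (names[i], [i]) for i in range(len(names))}
    next_id = len(names)
    while len(clusters) > 1:
        def avg(a, b):
            # Mean distance over all member pairs of the two clusters.
            ma, mb = clusters[a][1], clusters[b][1]
            return sum(d[i][j] for i in ma for j in mb) / (len(ma) * len(mb))

        a, b = min(combinations(sorted(clusters), 2), key=lambda ab: avg(*ab))
        (ta, ma), (tb, mb) = clusters.pop(a), clusters.pop(b)
        clusters[next_id] = ((ta, tb), ma + mb)
        next_id += 1
    return next(iter(clusters.values()))[0]


# Toy input (hypothetical sequences, not data from the article).
toy = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "TTGGCCAA", "D": "TTGGCCAT"}
names = list(toy)
patterns = frequent_patterns(toy.values(), k=3, min_support=2)
vectors = [pattern_vector(toy[n], patterns) for n in names]
d = distance_matrix(vectors)
tree = average_linkage_tree(names, d)
print(tree)
```

On the toy input, sequences that share the same frequent patterns end up in the same subtree. A real analysis would need a proper sequential pattern miner and a tree-building method that also estimates branch lengths.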