A Novel Model for DNA Sequence Similarity Analysis Based on Graph Theory
Determination of sequence similarity is one of the major steps in computational phylogenetic studies. Evolutionary history involved not only mutations of individual nucleotides but also subsequent rearrangements. It has been one of the major tasks of computational biologists to d...
Main Authors: | Qi, Xingqin; Wu, Qin; Zhang, Yusen; Fuller, Eddie; Zhang, Cun-Quan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Libertas Academica, 2011 |
Subjects: | Original Research |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3204935/ https://www.ncbi.nlm.nih.gov/pubmed/22065497 http://dx.doi.org/10.4137/EBO.S7364 |
_version_ | 1782215264580927488 |
---|---|
author | Qi, Xingqin Wu, Qin Zhang, Yusen Fuller, Eddie Zhang, Cun-Quan |
author_facet | Qi, Xingqin Wu, Qin Zhang, Yusen Fuller, Eddie Zhang, Cun-Quan |
author_sort | Qi, Xingqin |
collection | PubMed |
description | Determination of sequence similarity is one of the major steps in computational phylogenetic studies. Evolutionary history involved not only mutations of individual nucleotides but also subsequent rearrangements. It has been one of the major tasks of computational biologists to develop novel mathematical descriptors for similarity analysis that capture information about several mutation phenomena simultaneously. In this paper, unlike traditional methods that build mathematical descriptors on, e.g., nucleotide frequency or geometric representations, we construct novel mathematical descriptors based on graph theory. In particular, for each DNA sequence, we set up a weighted directed graph. The adjacency matrix of the directed graph is used to induce a representative vector for the DNA sequence. This new approach measures similarity based on both the ordering and the frequency of nucleotides, so that much more information is involved. As an application, the method is tested on a set of 0.9-kb mtDNA sequences of twelve different primate species. All output phylogenetic trees under the various distance estimations have the same topology and are generally consistent with results reported in earlier studies, which supports the new method’s effectiveness. We also test the new method on a simulated data set, which shows that it performs better than the traditional global alignment method when subsequent rearrangements happen frequently during evolutionary history. |
format | Online Article Text |
id | pubmed-3204935 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | Libertas Academica |
record_format | MEDLINE/PubMed |
spelling | pubmed-32049352011-11-04 A Novel Model for DNA Sequence Similarity Analysis Based on Graph Theory Qi, Xingqin Wu, Qin Zhang, Yusen Fuller, Eddie Zhang, Cun-Quan Evol Bioinform Online Original Research Determination of sequence similarity is one of the major steps in computational phylogenetic studies. As we know, during evolutionary history, not only DNA mutations for individual nucleotide but also subsequent rearrangements occurred. It has been one of major tasks of computational biologists to develop novel mathematical descriptors for similarity analysis such that various mutation phenomena information would be involved simultaneously. In this paper, different from traditional methods (eg, nucleotide frequency, geometric representations) as bases for construction of mathematical descriptors, we construct novel mathematical descriptors based on graph theory. In particular, for each DNA sequence, we will set up a weighted directed graph. The adjacency matrix of the directed graph will be used to induce a representative vector for DNA sequence. This new approach measures similarity based on both ordering and frequency of nucleotides so that much more information is involved. As an application, the method is tested on a set of 0.9-kb mtDNA sequences of twelve different primate species. All output phylogenetic trees with various distance estimations have the same topology, and are generally consistent with the reported results from early studies, which proves the new method’s efficiency; we also test the new method on a simulated data set, which shows our new method performs better than traditional global alignment method when subsequent rearrangements happen frequently during evolutionary history. Libertas Academica 2011-10-04 /pmc/articles/PMC3204935/ /pubmed/22065497 http://dx.doi.org/10.4137/EBO.S7364 Text en © the author(s), publisher and licensee Libertas Academica Ltd. This is an open access article. 
Unrestricted non-commercial use is permitted provided the original work is properly cited. |
spellingShingle | Original Research Qi, Xingqin Wu, Qin Zhang, Yusen Fuller, Eddie Zhang, Cun-Quan A Novel Model for DNA Sequence Similarity Analysis Based on Graph Theory |
title | A Novel Model for DNA Sequence Similarity Analysis Based on Graph Theory |
title_full | A Novel Model for DNA Sequence Similarity Analysis Based on Graph Theory |
title_fullStr | A Novel Model for DNA Sequence Similarity Analysis Based on Graph Theory |
title_full_unstemmed | A Novel Model for DNA Sequence Similarity Analysis Based on Graph Theory |
title_short | A Novel Model for DNA Sequence Similarity Analysis Based on Graph Theory |
title_sort | novel model for dna sequence similarity analysis based on graph theory |
topic | Original Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3204935/ https://www.ncbi.nlm.nih.gov/pubmed/22065497 http://dx.doi.org/10.4137/EBO.S7364 |
work_keys_str_mv | AT qixingqin anovelmodelfordnasequencesimilarityanalysisbasedongraphtheory AT wuqin anovelmodelfordnasequencesimilarityanalysisbasedongraphtheory AT zhangyusen anovelmodelfordnasequencesimilarityanalysisbasedongraphtheory AT fullereddie anovelmodelfordnasequencesimilarityanalysisbasedongraphtheory AT zhangcunquan anovelmodelfordnasequencesimilarityanalysisbasedongraphtheory AT qixingqin novelmodelfordnasequencesimilarityanalysisbasedongraphtheory AT wuqin novelmodelfordnasequencesimilarityanalysisbasedongraphtheory AT zhangyusen novelmodelfordnasequencesimilarityanalysisbasedongraphtheory AT fullereddie novelmodelfordnasequencesimilarityanalysisbasedongraphtheory AT zhangcunquan novelmodelfordnasequencesimilarityanalysisbasedongraphtheory |
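
The description above outlines the pipeline (sequence, weighted directed graph, adjacency matrix, representative vector, distance) but does not state how the edge weights are defined. The following is a minimal Python sketch under a stated assumption: each ordered pair of positions i < j contributes a weight of 1/(j - i)^alpha to the arc from nucleotide seq[i] to nucleotide seq[j], so that both ordering and frequency affect the result. The weighting rule, the parameter `alpha`, and the function names are illustrative choices, not taken from the record, and may differ from the authors' actual construction.

```python
from math import sqrt

NUCLEOTIDES = "ACGT"
INDEX = {n: k for k, n in enumerate(NUCLEOTIDES)}


def adjacency_matrix(seq, alpha=2.0):
    """Return a 4x4 weighted adjacency matrix for a DNA sequence.

    ASSUMPTION (not from the record): entry [a][b] is the sum of
    1 / (j - i) ** alpha over all positions i < j with seq[i] == a
    and seq[j] == b. Characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    matrix = [[0.0] * 4 for _ in range(4)]
    for i, a in enumerate(seq):
        if a not in INDEX:
            continue
        for j in range(i + 1, len(seq)):
            b = seq[j]
            if b not in INDEX:
                continue
            matrix[INDEX[a]][INDEX[b]] += 1.0 / (j - i) ** alpha
    return matrix


def representative_vector(seq, alpha=2.0):
    """Flatten the adjacency matrix row by row into a 16-dimensional vector."""
    return [w for row in adjacency_matrix(seq, alpha) for w in row]


def euclidean_distance(u, v):
    """Euclidean distance between two representative vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


if __name__ == "__main__":
    # Toy sequences (made up for illustration, not data from the paper).
    s1 = "ACGTACGTAC"
    s2 = "ACGTTGCAAC"
    d = euclidean_distance(representative_vector(s1), representative_vector(s2))
    print(f"distance = {d:.4f}")
```

The double loop is quadratic in sequence length, which is acceptable for sequences of roughly 0.9 kb such as those mentioned in the description. A pairwise distance matrix built this way could then be passed to a standard tree-building method, and Euclidean distance is only one of several possible distance estimations.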