A Novel Method of Characterizing Genetic Sequences: Genome Space with Biological Distance and Applications

BACKGROUND: Most existing methods for phylogenetic analysis involve developing an evolutionary model and then using some type of computational algorithm to perform multiple sequence alignment. There are two problems with this approach: (1) different evolutionary models can lead to different results,...

Bibliographic Details
Main authors: Deng, Mo; Yu, Chenglong; Liang, Qian; He, Rong L.; Yau, Stephen S.-T.
Format: Text
Language: English
Published: Public Library of Science 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3047556/
https://www.ncbi.nlm.nih.gov/pubmed/21399690
http://dx.doi.org/10.1371/journal.pone.0017293
author Deng, Mo
Yu, Chenglong
Liang, Qian
He, Rong L.
Yau, Stephen S.-T.
collection PubMed
description BACKGROUND: Most existing methods for phylogenetic analysis involve developing an evolutionary model and then using some type of computational algorithm to perform multiple sequence alignment. There are two problems with this approach: (1) different evolutionary models can lead to different results, and (2) the computation time required for multiple alignments makes it impossible to analyse the phylogeny of a whole genome. This motivates us to create a new approach to characterize genetic sequences. METHODOLOGY: To each DNA sequence, we associate a natural vector based on the distributions of nucleotides. This produces a one-to-one correspondence between the DNA sequence and its natural vector. We define the distance between two DNA sequences to be the distance between their associated natural vectors. This creates a genome space with a biological distance, which makes global comparison of genomes with the same topology possible. We use our proposed method to analyze the genomes of the new influenza A (H1N1) virus, human rhinoviruses (HRV) and mammalian mitochondria. The results show that a triple-reassortant swine virus circulating in North America and the Eurasian swine virus belong to the lineage of the influenza A (H1N1) virus. For the HRV and mammalian mitochondrial genomes, the results coincide with biologists' analyses. CONCLUSIONS: Our approach provides a powerful new tool for analyzing and annotating genomes and their phylogenetic relationships. Whole or partial genomes can be handled more easily and more quickly than with multiple alignment methods. Once a genome space has been constructed, it can be stored in a database. There is no need to reconstruct the genome space for subsequent applications, whereas in multiple alignment methods, realignment is needed to add new sequences. Furthermore, one can make a global comparison of all genomes simultaneously, which no other existing method can achieve.
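The METHODOLOGY in the description above maps each DNA sequence to a vector built from nucleotide distributions and compares sequences by the distance between those vectors. A minimal sketch of that idea follows. This record does not spell out the vector's definition, so the form used here is an assumption: for each nucleotide, its count, its mean 1-based position, and a second central moment of its positions normalized by count times sequence length, giving 12 numbers per sequence. The Euclidean metric and the names `natural_vector` and `distance` are likewise illustrative choices, not taken from the record.

```python
import math

NUCLEOTIDES = "ACGT"

def natural_vector(seq):
    """Return a flat 12-element list: (count, mean position, normalized
    second central moment) for each of A, C, G, T.

    Assumed form for illustration; the catalog record does not define it.
    """
    seq = seq.upper()
    n = len(seq)
    vector = []
    for base in NUCLEOTIDES:
        # 1-based positions at which this nucleotide occurs
        positions = [i + 1 for i, ch in enumerate(seq) if ch == base]
        count = len(positions)
        if count == 0:
            # Nucleotide absent: contribute zeros for all three components
            vector.extend([0.0, 0.0, 0.0])
            continue
        mean = sum(positions) / count
        # Second central moment, normalized by count * sequence length
        d2 = sum((p - mean) ** 2 for p in positions) / (count * n)
        vector.extend([float(count), mean, d2])
    return vector

def distance(seq_a, seq_b):
    """Euclidean distance between the two sequences' vectors (assumed metric)."""
    va, vb = natural_vector(seq_a), natural_vector(seq_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(va, vb)))
```

Because each sequence is reduced to a fixed-length vector independently of the others, a new sequence can be added to a stored collection by computing one more vector, with no realignment of the existing entries.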
format Text
id pubmed-3047556
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3047556 2011-03-11 A Novel Method of Characterizing Genetic Sequences: Genome Space with Biological Distance and Applications Deng, Mo Yu, Chenglong Liang, Qian He, Rong L. Yau, Stephen S.-T. PLoS One Research Article Public Library of Science 2011-03-02 /pmc/articles/PMC3047556/ /pubmed/21399690 http://dx.doi.org/10.1371/journal.pone.0017293 Text en Deng et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title A Novel Method of Characterizing Genetic Sequences: Genome Space with Biological Distance and Applications
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3047556/
https://www.ncbi.nlm.nih.gov/pubmed/21399690
http://dx.doi.org/10.1371/journal.pone.0017293