
Google Matrix Analysis of DNA Sequences

For DNA sequences of various species we construct the Google matrix G of Markov transitions between nearby words composed of several letters. The statistical distribution of matrix elements of this matrix is shown to be described by a power law with the exponent being close to those...


Bibliographic Details
Main Authors: Kandiah, Vivek, Shepelyansky, Dima L.
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3650020/
https://www.ncbi.nlm.nih.gov/pubmed/23671568
http://dx.doi.org/10.1371/journal.pone.0061519
_version_ 1782269060460838912
author Kandiah, Vivek
Shepelyansky, Dima L.
author_facet Kandiah, Vivek
Shepelyansky, Dima L.
author_sort Kandiah, Vivek
collection PubMed
description For DNA sequences of various species we construct the Google matrix G of Markov transitions between nearby words composed of several letters. The statistical distribution of the elements of this matrix is shown to follow a power law with an exponent close to that of outgoing links in scale-free networks such as the World Wide Web (WWW). At the same time, the sum of ingoing matrix elements is characterized by an exponent significantly larger than those typical for WWW networks. This results in a slow algebraic decay of the PageRank probability determined by the distribution of ingoing elements. The spectrum of G is characterized by a large gap leading to a rapid relaxation process on the DNA sequence networks. We introduce the PageRank proximity correlator between different species, which determines their statistical similarity from the viewpoint of Markov chains. The properties of other eigenstates of the Google matrix are also discussed. Our results establish scale-free features of DNA sequence networks, showing their similarities to and distinctions from the WWW and linguistic networks.
format Online
Article
Text
id pubmed-3650020
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3650020 2013-05-13 Google Matrix Analysis of DNA Sequences Kandiah, Vivek Shepelyansky, Dima L. PLoS One Research Article Public Library of Science 2013-05-09 /pmc/articles/PMC3650020/ /pubmed/23671568 http://dx.doi.org/10.1371/journal.pone.0061519 Text en © 2013 Kandiah, Shepelyansky http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Kandiah, Vivek
Shepelyansky, Dima L.
Google Matrix Analysis of DNA Sequences
title Google Matrix Analysis of DNA Sequences
title_full Google Matrix Analysis of DNA Sequences
title_fullStr Google Matrix Analysis of DNA Sequences
title_full_unstemmed Google Matrix Analysis of DNA Sequences
title_short Google Matrix Analysis of DNA Sequences
title_sort google matrix analysis of dna sequences
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3650020/
https://www.ncbi.nlm.nih.gov/pubmed/23671568
http://dx.doi.org/10.1371/journal.pone.0061519
work_keys_str_mv AT kandiahvivek googlematrixanalysisofdnasequences
AT shepelyanskydimal googlematrixanalysisofdnasequences
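
The construction summarized in the abstract (Markov transitions between nearby words, followed by PageRank) can be illustrated with a minimal Python sketch. It is not the authors' code. The following choices are assumptions made for illustration only: the sequence is split into consecutive non-overlapping words, the alphabet is A, C, G, T, the damping factor is the conventional 0.85, words that are never followed by another word get a uniform column, and the function names `google_matrix` and `pagerank` are hypothetical.

```python
import itertools
import numpy as np

ALPHABET = "ACGT"  # assumed alphabet; other symbols are skipped


def google_matrix(sequence, word_len=2, alpha=0.85):
    """Build a column-stochastic Google matrix from word-to-word transitions.

    Assumptions (not taken from the record): words are consecutive and
    non-overlapping, and alpha=0.85 is the usual damping factor.
    """
    n = len(ALPHABET) ** word_len
    index = {"".join(w): i
             for i, w in enumerate(itertools.product(ALPHABET, repeat=word_len))}
    # Cut the sequence into consecutive words; unknown words map to None.
    ids = [index.get(sequence[k:k + word_len])
           for k in range(0, len(sequence) - word_len + 1, word_len)]
    counts = np.zeros((n, n))
    for src, dst in zip(ids, ids[1:]):
        if src is not None and dst is not None:
            counts[dst, src] += 1.0  # column = source word, row = next word
    col_sums = counts.sum(axis=0)
    # Columns with no observed transitions are replaced by a uniform column.
    S = np.full((n, n), 1.0 / n)
    seen = col_sums > 0
    S[:, seen] = counts[:, seen] / col_sums[seen]
    return alpha * S + (1.0 - alpha) / n


def pagerank(G, tol=1e-12, max_iter=1000):
    """Power iteration for the leading eigenvector of G (eigenvalue 1)."""
    p = np.full(G.shape[0], 1.0 / G.shape[0])
    for _ in range(max_iter):
        nxt = G @ p
        nxt /= nxt.sum()
        if np.abs(nxt - p).sum() < tol:
            return nxt
        p = nxt
    return p


if __name__ == "__main__":
    G = google_matrix("ACGTACGTTTGACAGGCTA", word_len=1)
    print(pagerank(G))
```

Each column of the resulting matrix sums to 1, so the PageRank vector is a probability distribution over words. Sorting it in decreasing order gives the kind of ranking whose decay the abstract discusses.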