Detecting Network Communities: An Application to Phylogenetic Analysis


Bibliographic Details
Main authors: Andrade, Roberto F. S., Rocha-Neto, Ivan C., Santos, Leonardo B. L., de Santana, Charles N., Diniz, Marcelo V. C., Lobão, Thierry Petit, Goés-Neto, Aristóteles, Pinho, Suani T. R., El-Hani, Charbel N.
Format: Text
Language: English
Published: Public Library of Science 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3088654/
https://www.ncbi.nlm.nih.gov/pubmed/21573202
http://dx.doi.org/10.1371/journal.pcbi.1001131
author Andrade, Roberto F. S.
Rocha-Neto, Ivan C.
Santos, Leonardo B. L.
de Santana, Charles N.
Diniz, Marcelo V. C.
Lobão, Thierry Petit
Goés-Neto, Aristóteles
Pinho, Suani T. R.
El-Hani, Charbel N.
collection PubMed
description This paper proposes a new method to identify communities in generally weighted complex networks and applies it to phylogenetic analysis. In this case, the weights are similarity indexes between protein sequences; they are used to construct networks whose structure can be analyzed to recover phylogenetically useful information. The analyses discussed here are mainly based on the modular character of protein similarity networks, explored through the Newman-Girvan algorithm with the help of the neighborhood matrix. The most relevant networks are found where the network topology changes abruptly, revealing distinct modules related to the sets of organisms to which the proteins belong. Sound biological information can be retrieved by the computational routines used in the network approach, without biological assumptions other than those incorporated by BLAST. Usually, all the main bacterial phyla and, in some cases, some bacterial classes corresponded totally (100%) or to a great extent (>70%) to the modules. We checked the results for internal consistency, and comparisons between them yielded close to 84% of matches for community pertinence. To illustrate how to use the network-based method, we employed data for enzymes involved in the chitin metabolic pathway that are present in more than 100 organisms, taken from an original data set of 1,695 organisms downloaded from GenBank on May 19, 2007. A preliminary comparison between the outcomes of the network-based method and the results of methods based on Bayesian, distance, likelihood, and parsimony criteria suggests that the former is as reliable as these commonly used methods. We conclude that the network-based method can be used as a powerful tool for retrieving modularity information from weighted networks, which is useful for phylogenetic analysis.
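The network-construction idea in the description can be illustrated with a minimal sketch. It assumes a symmetric matrix of pairwise similarity scores and a cut-off `sigma`; both the toy matrix `SIM` and the function names are hypothetical and do not come from the paper. Connected components stand in for modules here. This is a deliberately simpler substitute for the Newman-Girvan algorithm and the neighborhood matrix named in the description, not the authors' procedure.

```python
from typing import Dict, List, Set


def threshold_network(sim: List[List[float]], sigma: float) -> Dict[int, Set[int]]:
    """Link sequences i and j whenever their similarity is at least sigma."""
    n = len(sim)
    adj: Dict[int, Set[int]] = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= sigma:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def components(adj: Dict[int, Set[int]]) -> List[List[int]]:
    """Connected components via depth-first search (a stand-in for modules)."""
    seen: Set[int] = set()
    comps: List[List[int]] = []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            comp.add(node)
            stack.extend(adj[node] - seen)
        comps.append(sorted(comp))
    return comps


# Made-up similarity scores for five sequences (illustrative only).
SIM = [
    [100, 90, 85, 30, 25],
    [90, 100, 80, 35, 20],
    [85, 80, 100, 40, 30],
    [30, 35, 40, 100, 95],
    [25, 20, 30, 95, 100],
]

for sigma in (20, 50, 92):
    print(sigma, components(threshold_network(SIM, sigma)))
```

Raising `sigma` in this toy example splits a single connected network into two groups and then into smaller pieces, which gives a rough picture of how a change in threshold can expose modular structure.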
format Text
id pubmed-3088654
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3088654 2011-05-13 Detecting Network Communities: An Application to Phylogenetic Analysis PLoS Comput Biol Research Article Public Library of Science 2011-05-05 /pmc/articles/PMC3088654/ /pubmed/21573202 http://dx.doi.org/10.1371/journal.pcbi.1001131 Text en Andrade et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Detecting Network Communities: An Application to Phylogenetic Analysis
topic Research Article