Detecting Network Communities: An Application to Phylogenetic Analysis
This paper proposes a new method to identify communities in generally weighted complex networks and applies it to phylogenetic analysis. In this case, weights correspond to the similarity indexes among protein sequences, which can be used for network construction so that the network structure can be a...
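Main authors: | Andrade, Roberto F. S.; Rocha-Neto, Ivan C.; Santos, Leonardo B. L.; de Santana, Charles N.; Diniz, Marcelo V. C.; Lobão, Thierry Petit; Goés-Neto, Aristóteles; Pinho, Suani T. R.; El-Hani, Charbel N. |
---|---|
Format: | Text |
Language: | English |
Published: | Public Library of Science, 2011 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3088654/ https://www.ncbi.nlm.nih.gov/pubmed/21573202 http://dx.doi.org/10.1371/journal.pcbi.1001131 |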
_version_ | 1782202910746083328 |
---|---|
author | Andrade, Roberto F. S.; Rocha-Neto, Ivan C.; Santos, Leonardo B. L.; de Santana, Charles N.; Diniz, Marcelo V. C.; Lobão, Thierry Petit; Goés-Neto, Aristóteles; Pinho, Suani T. R.; El-Hani, Charbel N. |
author_sort | Andrade, Roberto F. S. |
collection | PubMed |
description | This paper proposes a new method to identify communities in generally weighted complex networks and applies it to phylogenetic analysis. In this case, weights correspond to the similarity indexes among protein sequences, which can be used for network construction so that the network structure can be analyzed to recover phylogenetically useful information from its properties. The analyses discussed here are mainly based on the modular character of protein similarity networks, explored through the Newman-Girvan algorithm, with the help of the neighborhood matrix. The most relevant networks are found when the network topology changes abruptly, revealing distinct modules related to the sets of organisms to which the proteins belong. Sound biological information can be retrieved by the computational routines used in the network approach, without using biological assumptions other than those incorporated by BLAST. Usually, all the main bacterial phyla and, in some cases, also some bacterial classes corresponded totally (100%) or to a great extent (>70%) to the modules. We checked for internal consistency in the obtained results, and we scored close to 84% of matches for community pertinence when comparisons between the results were performed. To illustrate how to use the network-based method, we employed data for enzymes involved in the chitin metabolic pathway that are present in more than 100 organisms from an original data set containing 1,695 organisms, downloaded from GenBank on May 19, 2007. A preliminary comparison between the outcomes of the network-based method and the results of methods based on Bayesian, distance, likelihood, and parsimony criteria suggests that the former is as reliable as these commonly used methods. We conclude that the network-based method can be used as a powerful tool for retrieving modularity information from weighted networks, which is useful for phylogenetic analysis. |
format | Text |
id | pubmed-3088654 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3088654 2011-05-13 Detecting Network Communities: An Application to Phylogenetic Analysis PLoS Comput Biol Research Article Public Library of Science 2011-05-05 /pmc/articles/PMC3088654/ /pubmed/21573202 http://dx.doi.org/10.1371/journal.pcbi.1001131 Text en Andrade et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
title | Detecting Network Communities: An Application to Phylogenetic Analysis |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3088654/ https://www.ncbi.nlm.nih.gov/pubmed/21573202 http://dx.doi.org/10.1371/journal.pcbi.1001131 |
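
The description above mentions building a network from pairwise protein similarity indexes and exploring its modules with the Newman-Girvan algorithm. The following is a minimal sketch of that general idea in Python, not the authors' implementation. The toy matrix `SIM`, the threshold value `0.5`, and all function names are hypothetical. The sketch treats the thresholded network as unweighted, and it leaves out the neighborhood matrix and the paper's procedure for selecting the most relevant threshold.

```python
from collections import deque

# Hypothetical toy similarity matrix for six proteins (illustrative values only):
# two tight groups {0, 1, 2} and {3, 4, 5} linked by one moderate similarity (2, 3).
SIM = [
    [1.0, 0.9, 0.9, 0.1, 0.1, 0.1],
    [0.9, 1.0, 0.9, 0.1, 0.1, 0.1],
    [0.9, 0.9, 1.0, 0.6, 0.1, 0.1],
    [0.1, 0.1, 0.6, 1.0, 0.9, 0.9],
    [0.1, 0.1, 0.1, 0.9, 1.0, 0.9],
    [0.1, 0.1, 0.1, 0.9, 0.9, 1.0],
]

def threshold_graph(sim, sigma):
    """Connect proteins i and j when their similarity is at least sigma."""
    n = len(sim)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= sigma:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def components(adj):
    """Return the connected components as a list of node sets."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in comp:
                    comp.add(v)
                    queue.append(v)
        seen |= comp
        comps.append(comp)
    return comps

def edge_betweenness(adj):
    """Shortest-path edge betweenness (Brandes-style) for an unweighted graph."""
    score = {(u, v): 0.0 for u in adj for v in adj[u] if u < v}
    for s in adj:
        dist, paths = {s: 0}, {s: 1.0}
        preds = {v: [] for v in adj}
        order, queue = [], deque([s])
        while queue:  # breadth-first search counting shortest paths from s
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    paths[v] = 0.0
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    paths[v] += paths[u]
                    preds[v].append(u)
        credit = {v: 0.0 for v in order}
        for v in reversed(order):  # push path credit back toward s
            for u in preds[v]:
                share = paths[u] / paths[v] * (1.0 + credit[v])
                score[(min(u, v), max(u, v))] += share
                credit[u] += share
    return score  # every path is counted from both ends; the ranking is unaffected

def girvan_newman_split(adj):
    """Remove highest-betweenness edges until the component count increases."""
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    start = len(components(adj))
    while len(components(adj)) == start:
        score = edge_betweenness(adj)
        if not score:
            break  # no edges left to remove
        u, v = max(score, key=score.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)

modules = girvan_newman_split(threshold_graph(SIM, 0.5))
```

With this toy input, the only edge joining the two groups carries the highest betweenness, so removing it separates the network into its two modules. A real analysis would repeat the procedure over a range of thresholds and compare the resulting modules against known taxonomic groups.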