Comparison of topological clustering within protein networks using edge metrics that evaluate full sequence, full structure, and active site microenvironment similarity

Bibliographic Details
Main authors: Leuthaeuser, Janelle B; Knutson, Stacy T; Kumar, Kiran; Babbitt, Patricia C; Fetrow, Jacquelyn S
Format: Online Article Text
Language: English
Published: John Wiley & Sons, Ltd 2015
Subjects: Articles
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4570537/
https://www.ncbi.nlm.nih.gov/pubmed/26073648
http://dx.doi.org/10.1002/pro.2724
author Leuthaeuser, Janelle B
Knutson, Stacy T
Kumar, Kiran
Babbitt, Patricia C
Fetrow, Jacquelyn S
collection PubMed
description The development of accurate protein function annotation methods has emerged as a major unsolved biological problem. Protein similarity networks, one approach to function annotation via annotation transfer, group proteins into similarity-based clusters. An underlying assumption is that the edge metric used to identify such clusters correlates with functional information. In this contribution, this assumption is evaluated by observing topologies in similarity networks using three different edge metrics: sequence (BLAST), structure (TM-Align), and active site similarity (active site profiling, implemented in DASP). Network topologies for four well-studied protein superfamilies (enolase, peroxiredoxin (Prx), glutathione transferase (GST), and crotonase) were compared with curated functional hierarchies and structure. As expected, network topology differs, depending on edge metric; comparison of topologies provides valuable information on structure/function relationships. Subnetworks based on active site similarity correlate with known functional hierarchies at a single edge threshold more often than sequence- or structure-based networks. Sequence- and structure-based networks are useful for identifying sequence and domain similarities and differences; therefore, it is important to consider the clustering goal before deciding appropriate edge metric. Further, conserved active site residues identified in enolase and GST active site subnetworks correspond with published functionally important residues. Extension of this analysis yields predictions of functionally determinant residues for GST subgroups. These results support the hypothesis that active site similarity-based networks reveal clusters that share functional details and lay the foundation for capturing functionally relevant hierarchies using an approach that is both automatable and can deliver greater precision in function annotation than current similarity-based methods.
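The thresholding idea in the description above can be illustrated with a minimal sketch. It assumes a generic similarity score where higher means more similar (real BLAST, TM-Align, or DASP scores have their own scales and would need their own cutoffs), and every protein name, score, and subgroup label below is invented toy data, not a result from the article. Proteins joined by an edge at or above the threshold fall into the same cluster (connected component), and the clusters are then compared with a hypothetical curated grouping.

```python
from collections import defaultdict


def cluster_by_threshold(nodes, scores, threshold):
    """Connected components of the network keeping edges with score >= threshold."""
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find root lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for (a, b), score in scores.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for n in nodes:
        groups[find(n)].add(n)
    return sorted(sorted(g) for g in groups.values())


def clusters_match_labels(clusters, labels):
    """True if the clusters coincide exactly with the groups defined by labels."""
    by_label = defaultdict(set)
    for node, label in labels.items():
        by_label[label].add(node)
    return {frozenset(c) for c in clusters} == {frozenset(g) for g in by_label.values()}


# Toy data (all values hypothetical).
NODES = ["p1", "p2", "p3", "p4"]
LABELS = {"p1": "A", "p2": "A", "p3": "B", "p4": "B"}  # assumed curated subgroups
ACTIVE_SITE = {("p1", "p2"): 0.9, ("p3", "p4"): 0.8, ("p2", "p3"): 0.3}
SEQUENCE = {("p1", "p2"): 0.6, ("p3", "p4"): 0.5, ("p2", "p3"): 0.55}

if __name__ == "__main__":
    for name, scores in [("active site", ACTIVE_SITE), ("sequence", SEQUENCE)]:
        clusters = cluster_by_threshold(NODES, scores, threshold=0.5)
        print(name, clusters, clusters_match_labels(clusters, LABELS))
```

In this toy setup, one threshold separates the two assumed subgroups under the active-site-like scores, while the sequence-like scores merge them at the same threshold. This only illustrates what "correlating with a functional hierarchy at a single edge threshold" means; it does not reproduce the authors' pipeline.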
format Online
Article
Text
id pubmed-4570537
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher John Wiley & Sons, Ltd
record_format MEDLINE/PubMed
spelling pubmed-4570537 2015-09-21 Protein Sci Articles John Wiley & Sons, Ltd 2015-09 2015-06-12 /pmc/articles/PMC4570537/ /pubmed/26073648 http://dx.doi.org/10.1002/pro.2724 Text en © 2015 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society http://creativecommons.org/licenses/by-nc/4.0/ This is an open access article under the terms of the Creative Commons Attribution-NonCommercial License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
title Comparison of topological clustering within protein networks using edge metrics that evaluate full sequence, full structure, and active site microenvironment similarity
topic Articles