Cargando…

Visualizing the superfamily of metallo-β-lactamases through sequence similarity network neighborhood connectivity analysis

Protein sequence similarity networks (SSNs) constitute a convenient approach to analyze large polypeptide sequence datasets, and have been successfully applied to study a number of protein families over the past decade. SSN analysis is herein combined with traditional cladistic and phenetic phylogen...

Descripción completa

Detalles Bibliográficos
Autor principal: González, Javier M.
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Elsevier 2021
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7785958/
https://www.ncbi.nlm.nih.gov/pubmed/33426353
http://dx.doi.org/10.1016/j.heliyon.2020.e05867
Descripción
Sumario:Protein sequence similarity networks (SSNs) constitute a convenient approach to analyze large polypeptide sequence datasets, and have been successfully applied to study a number of protein families over the past decade. SSN analysis is herein combined with traditional cladistic and phenetic phylogenetic analysis (respectively based on multiple sequence alignments and all-against-all three-dimensional protein structure comparisons) in order to assist the ancestral reconstruction and integrative revision of the superfamily of metallo-β-lactamases (MBLs). It is shown that only 198 out of 15,292 representative nodes contain at least one experimentally obtained protein structure in the Protein Data Bank or a manually annotated SwissProt entry, that is to say, only 1.3 % of the superfamily has been functionally and/or structurally characterized. Besides, neighborhood connectivity coloring, which measures local network interconnectivity, is introduced for detection of protein families within SSN clusters. This approach provides a clear picture of how many families remain unexplored in the superfamily, while most MBL research is heavily biased towards a few families. Further research is suggested in order to determine the SSN topological properties, which will be instrumental for the improvement of automated sequence annotation methods.