Automatic construction of molecular similarity networks for visual graph mining in chemical space of bioactive peptides: an unsupervised learning approach

The increasing interest in bioactive peptides with therapeutic potentials has been reflected in a large variety of biological databases published over the last years. However, the knowledge discovery process from these heterogeneous data sources is a nontrivial task, becoming the essence of our rese...

Bibliographic Details
Main Authors: Aguilera-Mendoza, Longendri, Marrero-Ponce, Yovani, García-Jacas, César R., Chavez, Edgar, Beltran, Jesus A., Guillen-Ramirez, Hugo A., Brizuela, Carlos A.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7583304/
https://www.ncbi.nlm.nih.gov/pubmed/33093586
http://dx.doi.org/10.1038/s41598-020-75029-1
_version_ 1783599374140440576
author Aguilera-Mendoza, Longendri
Marrero-Ponce, Yovani
García-Jacas, César R.
Chavez, Edgar
Beltran, Jesus A.
Guillen-Ramirez, Hugo A.
Brizuela, Carlos A.
author_sort Aguilera-Mendoza, Longendri
collection PubMed
description The increasing interest in bioactive peptides with therapeutic potentials has been reflected in a large variety of biological databases published over the last years. However, the knowledge discovery process from these heterogeneous data sources is a nontrivial task, becoming the essence of our research endeavor. Therefore, we devise a unified data model based on molecular similarity networks for representing a chemical reference space of bioactive peptides, having an implicit knowledge that is currently not explicitly accessed in existing biological databases. Indeed, our main contribution is a novel workflow for the automatic construction of such similarity networks, enabling visual graph mining techniques to uncover new insights from the “ocean” of known bioactive peptides. The workflow presented here relies on the following sequential steps: (i) calculation of molecular descriptors by applying statistical and aggregation operators on amino acid property vectors; (ii) a two-stage unsupervised feature selection method to identify an optimized subset of descriptors using the concepts of entropy and mutual information; (iii) generation of sparse networks where nodes represent bioactive peptides, and edges between two nodes denote their pairwise similarity/distance relationships in the defined descriptor space; and (iv) exploratory analysis using visual inspection in combination with clustering and network science techniques. For practical purposes, the proposed workflow has been implemented in our visual analytics software tool (http://mobiosd-hub.com/starpep/), to assist researchers in extracting useful information from an integrated collection of 45120 bioactive peptides, which is one of the largest and most diverse data in its field. Finally, we illustrate the applicability of the proposed workflow for discovering central nodes in molecular similarity networks that may represent a biologically relevant chemical space known to date.
format Online
Article
Text
id pubmed-7583304
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-7583304 2020-10-27 Sci Rep Article Nature Publishing Group UK 2020-10-22 /pmc/articles/PMC7583304/ /pubmed/33093586 http://dx.doi.org/10.1038/s41598-020-75029-1 Text en © The Author(s) 2020 Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Automatic construction of molecular similarity networks for visual graph mining in chemical space of bioactive peptides: an unsupervised learning approach
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7583304/
https://www.ncbi.nlm.nih.gov/pubmed/33093586
http://dx.doi.org/10.1038/s41598-020-75029-1