PyBioMed: a python library for various molecular representations of chemicals, proteins and DNAs and their interactions
BACKGROUND: With the rapid development of biotechnology and informatics, publicly available data in chemistry and biology are growing explosively. The wealth of information in these data needs to be extracted and transformed into useful knowledge by various data mining methods....
Main authors: | Dong, Jie; Yao, Zhi-Jiang; Zhang, Lin; Luo, Feijun; Lin, Qinlu; Lu, Ai-Ping; Chen, Alex F.; Cao, Dong-Sheng |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer International Publishing 2018 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5861255/ https://www.ncbi.nlm.nih.gov/pubmed/29556758 http://dx.doi.org/10.1186/s13321-018-0270-2 |
_version_ | 1783308061438377984 |
---|---|
author | Dong, Jie; Yao, Zhi-Jiang; Zhang, Lin; Luo, Feijun; Lin, Qinlu; Lu, Ai-Ping; Chen, Alex F.; Cao, Dong-Sheng |
author_facet | Dong, Jie; Yao, Zhi-Jiang; Zhang, Lin; Luo, Feijun; Lin, Qinlu; Lu, Ai-Ping; Chen, Alex F.; Cao, Dong-Sheng |
author_sort | Dong, Jie |
collection | PubMed |
description | BACKGROUND: With the rapid development of biotechnology and informatics, publicly available data in chemistry and biology are growing explosively. The wealth of information in these data needs to be extracted and transformed into useful knowledge by various data mining methods. Given the rate at which data accumulate in chemistry and biology, new tools that process and interpret large and complex interaction data are increasingly important. So far, no suitable toolkit effectively links chemical and biological space in terms of molecular representation. To further explore these complex data, an integrated toolkit for various molecular representations is urgently needed, one that can be easily combined with data mining algorithms to start a full data analysis pipeline. RESULTS: Herein, the Python library PyBioMed is presented, which comprises functionality for downloading various molecular objects online by ID, pretreating molecular structures, and computing various molecular descriptors for chemicals, proteins, DNAs and their interactions. PyBioMed is a feature-rich and highly customized Python library used for the characterization of various complex chemical and biological molecules and interaction samples. The current version of PyBioMed can calculate 775 chemical descriptors and 19 kinds of chemical fingerprints, 9920 protein descriptors based on protein sequences, more than 6000 DNA descriptors from nucleotide sequences, and interaction descriptors from pairwise samples using three different combining strategies. Several examples and five real-life applications are provided to show users how to use PyBioMed as an integral part of data analysis projects. With PyBioMed, users can conveniently run a full pipeline from obtaining molecular data, pretreating molecules, and computing molecular representations to constructing machine learning models. CONCLUSION: PyBioMed provides various user-friendly and highly customized APIs to conveniently calculate various features of biological molecules and complex interaction samples, with the aim of building integrated analysis pipelines from data acquisition, data checking, and descriptor calculation to modeling. PyBioMed is freely available at http://projects.scbdd.com/pybiomed.html. |
format | Online Article Text |
id | pubmed-5861255 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Springer International Publishing |
record_format | MEDLINE/PubMed |
spelling | pubmed-5861255 2018-03-23 PyBioMed: a python library for various molecular representations of chemicals, proteins and DNAs and their interactions Dong, Jie; Yao, Zhi-Jiang; Zhang, Lin; Luo, Feijun; Lin, Qinlu; Lu, Ai-Ping; Chen, Alex F.; Cao, Dong-Sheng J Cheminform Software Springer International Publishing 2018-03-20 /pmc/articles/PMC5861255/ /pubmed/29556758 http://dx.doi.org/10.1186/s13321-018-0270-2 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | PyBioMed: a python library for various molecular representations of chemicals, proteins and DNAs and their interactions |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5861255/ https://www.ncbi.nlm.nih.gov/pubmed/29556758 http://dx.doi.org/10.1186/s13321-018-0270-2 |