RMol: a toolset for transforming SD/Molfile structure information into R objects
BACKGROUND: The graph-theoretical analysis of molecular networks has a long tradition in chemoinformatics. As demonstrated frequently, a well-designed format to encode chemical structures and structure-related information of organic compounds is the Molfile format. But when it comes to using modern pr...
Main Authors: | Grabner, Martin; Varmuza, Kurt; Dehmer, Matthias |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2012 |
Subjects: | Brief Reports |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3599689/ https://www.ncbi.nlm.nih.gov/pubmed/23151338 http://dx.doi.org/10.1186/1751-0473-7-12 |
_version_ | 1782263027943342080 |
---|---|
author | Grabner, Martin Varmuza, Kurt Dehmer, Matthias |
author_facet | Grabner, Martin Varmuza, Kurt Dehmer, Matthias |
author_sort | Grabner, Martin |
collection | PubMed |
description | BACKGROUND: The graph-theoretical analysis of molecular networks has a long tradition in chemoinformatics. As demonstrated frequently, a well-designed format to encode chemical structures and structure-related information of organic compounds is the Molfile format. But when it comes to using modern programming languages for statistical data analysis in Bio- and Chemoinformatics, R, as one of the most powerful free languages, lacks tools to process Molfile data collections and import molecular network data into R. RESULTS: We design an R object which allows a lossless information mapping of structural information from Molfiles into R objects. This provides the basis to use the RMol object as an anchor for connecting Molfile data collections with R libraries for analyzing graphs. Associated with the RMol objects, a set of R functions completes the toolset to organize, describe and manipulate the converted data sets. Further, we bypass R-typical limits for manipulating large data sets by storing R objects in bz-compressed serialized files instead of employing RData files. CONCLUSIONS: By design, RMol is an R toolset without dependencies on other libraries or programming languages. It is useful to integrate into pipelines for serialized batch analysis by using network data and, therefore, helps to process sdf-data sets in R efficiently. It is freely available under the BSD licence. The script source can be downloaded from http://sourceforge.net/p/rmol-toolset. |
format | Online Article Text |
id | pubmed-3599689 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3599689 2013-03-17 RMol: a toolset for transforming SD/Molfile structure information into R objects Grabner, Martin Varmuza, Kurt Dehmer, Matthias Source Code Biol Med Brief Reports BACKGROUND: The graph-theoretical analysis of molecular networks has a long tradition in chemoinformatics. As demonstrated frequently, a well-designed format to encode chemical structures and structure-related information of organic compounds is the Molfile format. But when it comes to using modern programming languages for statistical data analysis in Bio- and Chemoinformatics, R, as one of the most powerful free languages, lacks tools to process Molfile data collections and import molecular network data into R. RESULTS: We design an R object which allows a lossless information mapping of structural information from Molfiles into R objects. This provides the basis to use the RMol object as an anchor for connecting Molfile data collections with R libraries for analyzing graphs. Associated with the RMol objects, a set of R functions completes the toolset to organize, describe and manipulate the converted data sets. Further, we bypass R-typical limits for manipulating large data sets by storing R objects in bz-compressed serialized files instead of employing RData files. CONCLUSIONS: By design, RMol is an R toolset without dependencies on other libraries or programming languages. It is useful to integrate into pipelines for serialized batch analysis by using network data and, therefore, helps to process sdf-data sets in R efficiently. It is freely available under the BSD licence. The script source can be downloaded from http://sourceforge.net/p/rmol-toolset. BioMed Central 2012-11-14 /pmc/articles/PMC3599689/ /pubmed/23151338 http://dx.doi.org/10.1186/1751-0473-7-12 Text en Copyright ©2012 Grabner et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Brief Reports Grabner, Martin Varmuza, Kurt Dehmer, Matthias RMol: a toolset for transforming SD/Molfile structure information into R objects |
title | RMol: a toolset for transforming SD/Molfile structure information into R objects |
title_full | RMol: a toolset for transforming SD/Molfile structure information into R objects |
title_fullStr | RMol: a toolset for transforming SD/Molfile structure information into R objects |
title_full_unstemmed | RMol: a toolset for transforming SD/Molfile structure information into R objects |
title_short | RMol: a toolset for transforming SD/Molfile structure information into R objects |
title_sort | rmol: a toolset for transforming sd/molfile structure information into r objects |
topic | Brief Reports |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3599689/ https://www.ncbi.nlm.nih.gov/pubmed/23151338 http://dx.doi.org/10.1186/1751-0473-7-12 |
work_keys_str_mv | AT grabnermartin rmolatoolsetfortransformingsdmolfilestructureinformationintorobjects AT varmuzakurt rmolatoolsetfortransformingsdmolfilestructureinformationintorobjects AT dehmermatthias rmolatoolsetfortransformingsdmolfilestructureinformationintorobjects |