
RMol: a toolset for transforming SD/Molfile structure information into R objects

BACKGROUND: The graph-theoretical analysis of molecular networks has a long tradition in chemoinformatics. As demonstrated frequently, a well designed format to encode chemical structures and structure-related information of organic compounds is the Molfile format. But when it comes to use modern pr...


Bibliographic Details
Main authors: Grabner, Martin, Varmuza, Kurt, Dehmer, Matthias
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3599689/
https://www.ncbi.nlm.nih.gov/pubmed/23151338
http://dx.doi.org/10.1186/1751-0473-7-12
author Grabner, Martin
Varmuza, Kurt
Dehmer, Matthias
collection PubMed
description BACKGROUND: The graph-theoretical analysis of molecular networks has a long tradition in chemoinformatics. As has been demonstrated frequently, the Molfile format is a well-designed format for encoding chemical structures and structure-related information of organic compounds. But when it comes to using modern programming languages for statistical data analysis in bio- and chemoinformatics, R, one of the most powerful free languages, lacks tools to process Molfile data collections and import molecular network data. RESULTS: We design an R object which allows a lossless mapping of structural information from Molfiles into R objects. This provides the basis for using the RMol object as an anchor for connecting Molfile data collections with R libraries for analyzing graphs. Associated with the RMol objects, a set of R functions completes the toolset to organize, describe and manipulate the converted data sets. Further, we bypass R-typical limits for manipulating large data sets by storing R objects in bz-compressed serialized files instead of employing RData files. CONCLUSIONS: By design, RMol is an R toolset without dependencies on other libraries or programming languages. It is useful to integrate into pipelines for serialized batch analysis using network data and, therefore, helps to process SDF data sets in R efficiently. It is freely available under the BSD licence. The script source can be downloaded from http://sourceforge.net/p/rmol-toolset.
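The two ideas in the abstract, mapping a Molfile's atom and bond blocks into an in-memory object and storing that object in a bz-compressed serialized file, can be illustrated with a minimal sketch. This is not RMol's code: RMol is written in R and this record does not describe its object layout. The sketch uses Python purely for illustration, assumes the fixed-width V2000 Molfile layout, and every name in it (`parse_molfile`, `save_mol`, `load_mol`, the dictionary keys) is hypothetical.

```python
import bz2
import os
import pickle
import tempfile


def parse_molfile(text):
    """Parse a V2000 Molfile into a plain dict (hypothetical layout, not RMol's)."""
    lines = text.splitlines()
    counts = lines[3]  # line 4 is the counts line: atom and bond counts, 3 chars each
    n_atoms, n_bonds = int(counts[0:3]), int(counts[3:6])
    # Atom block: x, y, z occupy 10 characters each; the symbol follows one space.
    atoms = [line[31:34].strip() for line in lines[4:4 + n_atoms]]
    # Bond block: first atom, second atom, bond type, 3 characters each.
    bonds = []
    for line in lines[4 + n_atoms:4 + n_atoms + n_bonds]:
        bonds.append((int(line[0:3]), int(line[3:6]), int(line[6:9])))
    return {"name": lines[0].strip(), "atoms": atoms, "bonds": bonds}


def save_mol(mol, path):
    """Serialize the object into a bzip2-compressed file."""
    with bz2.open(path, "wb") as fh:
        pickle.dump(mol, fh)


def load_mol(path):
    """Read the object back from a bzip2-compressed file."""
    with bz2.open(path, "rb") as fh:
        return pickle.load(fh)


def atom_line(symbol):
    # Build a fixed-width atom line with dummy coordinates.
    return f"{0.0:10.4f}{0.0:10.4f}{0.0:10.4f} {symbol:<3} 0  0  0  0  0"


# A small made-up example: the heavy atoms of ethanol (C-C-O).
sample = "\n".join([
    "ethanol (heavy atoms only)",
    "  sketch",
    "",
    f"{3:3d}{2:3d}  0  0  0  0  0  0  0  0999 V2000",
    atom_line("C"),
    atom_line("C"),
    atom_line("O"),
    f"{1:3d}{2:3d}{1:3d}  0",
    f"{2:3d}{3:3d}{1:3d}  0",
    "M  END",
])

mol = parse_molfile(sample)

# Round trip through a compressed file to show that nothing is lost.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ethanol.pkl.bz2")
    save_mol(mol, path)
    restored = load_mol(path)
```

The bond tuples form an edge list, the kind of structure graph-analysis libraries typically accept. Writing one compressed file per object, rather than one large workspace file, is one plausible reading of how the approach in the abstract keeps batch processing of large collections manageable.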
format Online
Article
Text
id pubmed-3599689
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3599689 2013-03-17 RMol: a toolset for transforming SD/Molfile structure information into R objects Grabner, Martin Varmuza, Kurt Dehmer, Matthias Source Code Biol Med Brief Reports BioMed Central 2012-11-14 /pmc/articles/PMC3599689/ /pubmed/23151338 http://dx.doi.org/10.1186/1751-0473-7-12 Text en Copyright ©2012 Grabner et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title RMol: a toolset for transforming SD/Molfile structure information into R objects
topic Brief Reports