
ChemScanner: extraction and re-use(ability) of chemical information from common scientific documents containing ChemDraw files

We developed ChemScanner, software for extracting chemical information from ChemDraw binary (CDX) or ChemDraw XML-based (CDXML) files and for retrieving ChemDraw schemes from DOC, DOCX or XML documents. This can facilitate the reuse of chemical information embedded into div...

Bibliographic Details
Main authors: Nguyen, An, Huang, Yu-Chieh, Tremouilhac, Pierre, Jung, Nicole, Bräse, Stefan
Format: Online Article Text
Language: English
Published: Springer International Publishing 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6907231/
https://www.ncbi.nlm.nih.gov/pubmed/33431008
http://dx.doi.org/10.1186/s13321-019-0400-5
_version_ 1783478508788383744
author Nguyen, An
Huang, Yu-Chieh
Tremouilhac, Pierre
Jung, Nicole
Bräse, Stefan
collection PubMed
description We developed ChemScanner, software for extracting chemical information from ChemDraw binary (CDX) or ChemDraw XML-based (CDXML) files and for retrieving ChemDraw schemes from DOC, DOCX or XML documents. This can facilitate the reuse of chemical information embedded in the diverse documents used as standard storage and communication instruments in the chemical sciences (e.g. student theses, PhD theses, or publications). The extracted information is processed into reactions, molecules, and additional text and values, and can be accessed via the ChemScanner UI. ChemScanner supports export to Excel and CML, direct import of the extracted data into the open-source ELN Chemotion, and use of selected information via “copy and paste”. The software was designed with a focus on processing documents with molecular structure information embedded as CDX or CDXML, as these are the most common file formats for chemical drawings. The project aims to support chemists in their efforts to re-use chemistry research data by providing them with the missing tools for automated assembly of reaction data.
format Online
Article
Text
id pubmed-6907231
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-6907231
2019-12-30
ChemScanner: extraction and re-use(ability) of chemical information from common scientific documents containing ChemDraw files
Nguyen, An
Huang, Yu-Chieh
Tremouilhac, Pierre
Jung, Nicole
Bräse, Stefan
J Cheminform
Software
Springer International Publishing
2019-12-11
/pmc/articles/PMC6907231/
/pubmed/33431008
http://dx.doi.org/10.1186/s13321-019-0400-5
Text
en
© The Author(s) 2019. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title ChemScanner: extraction and re-use(ability) of chemical information from common scientific documents containing ChemDraw files
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6907231/
https://www.ncbi.nlm.nih.gov/pubmed/33431008
http://dx.doi.org/10.1186/s13321-019-0400-5