Transcription between human-readable synthetic descriptions and machine-executable instructions: an application of the latest pre-training technology

Bibliographic Details
Main authors: Zeng, Zheni; Nie, Yi-Chen; Ding, Ning; Ding, Qian-Jun; Ye, Wei-Ting; Yang, Cheng; Sun, Maosong; E, Weinan; Zhu, Rong; Liu, Zhiyuan
Format: Online Article Text
Language: English
Published: The Royal Society of Chemistry 2023
Subjects: Chemistry
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10498500/
https://www.ncbi.nlm.nih.gov/pubmed/37712039
http://dx.doi.org/10.1039/d3sc02483k
_version_ 1785105532546187264
author Zeng, Zheni
Nie, Yi-Chen
Ding, Ning
Ding, Qian-Jun
Ye, Wei-Ting
Yang, Cheng
Sun, Maosong
E, Weinan
Zhu, Rong
Liu, Zhiyuan
author_sort Zeng, Zheni
collection PubMed
description AI has been widely applied in scientific scenarios, such as robots performing chemical synthetic actions to free researchers from monotonous experimental procedures. However, there exists a gap between human-readable natural language descriptions and machine-executable instructions, of which the former are typically in numerous chemical articles, and the latter are currently compiled manually by experts. We apply the latest technology of pre-trained models and achieve automatic transcription between descriptions and instructions. We design a concise and comprehensive schema of instructions and construct an open-source human-annotated dataset consisting of 3950 description–instruction pairs, with 9.2 operations in each instruction on average. We further propose knowledgeable pre-trained transcription models enhanced by multi-grained chemical knowledge. The performance of recent popular models and products showing great capability in automatic writing (e.g., ChatGPT) has also been explored. Experiments prove that our system improves the instruction compilation efficiency of researchers by at least 42%, and can generate fluent academic paragraphs of synthetic descriptions when given instructions, showing the great potential of pre-trained models in improving human productivity.
format Online
Article
Text
id pubmed-10498500
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher The Royal Society of Chemistry
record_format MEDLINE/PubMed
spelling pubmed-10498500 2023-09-14 Chem Sci Chemistry The Royal Society of Chemistry 2023-08-24 /pmc/articles/PMC10498500/ /pubmed/37712039 http://dx.doi.org/10.1039/d3sc02483k Text en This journal is © The Royal Society of Chemistry https://creativecommons.org/licenses/by-nc/3.0/
title Transcription between human-readable synthetic descriptions and machine-executable instructions: an application of the latest pre-training technology
topic Chemistry
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10498500/
https://www.ncbi.nlm.nih.gov/pubmed/37712039
http://dx.doi.org/10.1039/d3sc02483k