Transcription between human-readable synthetic descriptions and machine-executable instructions: an application of the latest pre-training technology

Bibliographic Details
Main authors: Zeng, Zheni; Nie, Yi-Chen; Ding, Ning; Ding, Qian-Jun; Ye, Wei-Ting; Yang, Cheng; Sun, Maosong; E, Weinan; Zhu, Rong; Liu, Zhiyuan
Format: Online Article Text
Language: English
Published: The Royal Society of Chemistry, 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10498500/
https://www.ncbi.nlm.nih.gov/pubmed/37712039
http://dx.doi.org/10.1039/d3sc02483k
Description
Summary: AI has been widely applied in scientific scenarios, such as robots that perform chemical synthesis actions to free researchers from monotonous experimental procedures. However, there is a gap between human-readable natural language descriptions and machine-executable instructions: the former appear in numerous chemical articles, while the latter are currently compiled manually by experts. We apply the latest pre-trained model technology to achieve automatic transcription between descriptions and instructions. We design a concise and comprehensive instruction schema and construct an open-source, human-annotated dataset of 3950 description–instruction pairs, with an average of 9.2 operations per instruction. We further propose knowledgeable pre-trained transcription models enhanced by multi-grained chemical knowledge. We also explore the performance of recent popular models and products that show great capability in automatic writing (e.g., ChatGPT). Experiments show that our system improves researchers' instruction compilation efficiency by at least 42% and can generate fluent academic paragraphs of synthetic descriptions when given instructions, demonstrating the great potential of pre-trained models for improving human productivity.