Opportunities and challenges of text mining in materials research

Research publications are the major repository of scientific knowledge. However, their unstructured and highly heterogeneous format creates a significant obstacle to large-scale analysis of the information contained within. Recent progress in natural language processing (NLP) has provided a variety o...

Bibliographic Details
Main Authors: Kononova, Olga, He, Tanjin, Huo, Haoyan, Trewartha, Amalie, Olivetti, Elsa A., Ceder, Gerbrand
Format: Online Article Text
Language: English
Published: Elsevier 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7905448/
https://www.ncbi.nlm.nih.gov/pubmed/33665573
http://dx.doi.org/10.1016/j.isci.2021.102155
_version_ 1783655113051602944
author Kononova, Olga
He, Tanjin
Huo, Haoyan
Trewartha, Amalie
Olivetti, Elsa A.
Ceder, Gerbrand
author_facet Kononova, Olga
He, Tanjin
Huo, Haoyan
Trewartha, Amalie
Olivetti, Elsa A.
Ceder, Gerbrand
author_sort Kononova, Olga
collection PubMed
description Research publications are the major repository of scientific knowledge. However, their unstructured and highly heterogeneous format creates a significant obstacle to large-scale analysis of the information contained within. Recent progress in natural language processing (NLP) has provided a variety of tools for high-quality information extraction from unstructured text. These tools are primarily trained on non-technical text and struggle to produce accurate results when applied to scientific text, which involves specific technical terminology. In recent years, significant efforts in information retrieval have been made for biomedical and biochemical publications. For materials science, text mining (TM) methodology is still at the dawn of its development. In this review, we survey the recent progress in creating and applying TM and NLP approaches to the materials science field. This review is directed at the broad class of researchers aiming to learn the fundamentals of TM as applied to materials science publications.
format Online
Article
Text
id pubmed-7905448
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-7905448 2021-03-03 Opportunities and challenges of text mining in materials research Kononova, Olga He, Tanjin Huo, Haoyan Trewartha, Amalie Olivetti, Elsa A. Ceder, Gerbrand iScience Review Research publications are the major repository of scientific knowledge. However, their unstructured and highly heterogeneous format creates a significant obstacle to large-scale analysis of the information contained within. Recent progress in natural language processing (NLP) has provided a variety of tools for high-quality information extraction from unstructured text. These tools are primarily trained on non-technical text and struggle to produce accurate results when applied to scientific text, which involves specific technical terminology. In recent years, significant efforts in information retrieval have been made for biomedical and biochemical publications. For materials science, text mining (TM) methodology is still at the dawn of its development. In this review, we survey the recent progress in creating and applying TM and NLP approaches to the materials science field. This review is directed at the broad class of researchers aiming to learn the fundamentals of TM as applied to materials science publications. Elsevier 2021-02-06 /pmc/articles/PMC7905448/ /pubmed/33665573 http://dx.doi.org/10.1016/j.isci.2021.102155 Text en © 2021 The Author(s) http://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
spellingShingle Review
Kononova, Olga
He, Tanjin
Huo, Haoyan
Trewartha, Amalie
Olivetti, Elsa A.
Ceder, Gerbrand
Opportunities and challenges of text mining in materials research
title Opportunities and challenges of text mining in materials research
title_full Opportunities and challenges of text mining in materials research
title_fullStr Opportunities and challenges of text mining in materials research
title_full_unstemmed Opportunities and challenges of text mining in materials research
title_short Opportunities and challenges of text mining in materials research
title_sort opportunities and challenges of text mining in materials research
topic Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7905448/
https://www.ncbi.nlm.nih.gov/pubmed/33665573
http://dx.doi.org/10.1016/j.isci.2021.102155
work_keys_str_mv AT kononovaolga opportunitiesandchallengesoftextmininginaterialsresearch
AT hetanjin opportunitiesandchallengesoftextmininginaterialsresearch
AT huohaoyan opportunitiesandchallengesoftextmininginaterialsresearch
AT trewarthaamalie opportunitiesandchallengesoftextmininginaterialsresearch
AT olivettielsaa opportunitiesandchallengesoftextmininginaterialsresearch
AT cedergerbrand opportunitiesandchallengesoftextmininginaterialsresearch