
Automated knowledge extraction from polymer literature using natural language processing

Materials science literature has grown exponentially in recent years, making it difficult for individuals to master all of this information. This limits the new hypotheses that scientists can formulate. In this work, we explore whether materials science knowledge can be automati...


Bibliographic Details
Main authors: Shetty, Pranav; Ramprasad, Rampi
Format: Online Article Text
Language: English
Published: Elsevier 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7797509/
https://www.ncbi.nlm.nih.gov/pubmed/33458607
http://dx.doi.org/10.1016/j.isci.2020.101922
_version_ 1783634882959769600
author Shetty, Pranav
Ramprasad, Rampi
author_facet Shetty, Pranav
Ramprasad, Rampi
author_sort Shetty, Pranav
collection PubMed
description Materials science literature has grown exponentially in recent years, making it difficult for individuals to master all of this information. This limits the new hypotheses that scientists can formulate. In this work, we explore whether materials science knowledge can be automatically inferred from textual information contained in journal papers. Using a data set of 0.5 million polymer papers and natural language processing methods, we show that vector representations trained for every word in our corpus can indeed capture this knowledge in a completely unsupervised manner. We perform time-based studies through which we track the popularity of various polymers for different applications and predict new polymers for novel applications based solely on the domain knowledge contained in our data set. Using correlations detected automatically from the literature in this manner thus opens up a new paradigm for materials discovery.
format Online
Article
Text
id pubmed-7797509
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-7797509 2021-01-15 Automated knowledge extraction from polymer literature using natural language processing Shetty, Pranav Ramprasad, Rampi iScience Article Materials science literature has grown exponentially in recent years, making it difficult for individuals to master all of this information. This limits the new hypotheses that scientists can formulate. In this work, we explore whether materials science knowledge can be automatically inferred from textual information contained in journal papers. Using a data set of 0.5 million polymer papers and natural language processing methods, we show that vector representations trained for every word in our corpus can indeed capture this knowledge in a completely unsupervised manner. We perform time-based studies through which we track the popularity of various polymers for different applications and predict new polymers for novel applications based solely on the domain knowledge contained in our data set. Using correlations detected automatically from the literature in this manner thus opens up a new paradigm for materials discovery. Elsevier 2020-12-10 /pmc/articles/PMC7797509/ /pubmed/33458607 http://dx.doi.org/10.1016/j.isci.2020.101922 Text en © 2020. http://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Automated knowledge extraction from polymer literature using natural language processing
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7797509/
https://www.ncbi.nlm.nih.gov/pubmed/33458607
http://dx.doi.org/10.1016/j.isci.2020.101922