Looking through glass: Knowledge discovery from materials science literature using natural language processing
Most of the knowledge in materials science literature is in the form of unstructured data such as text and images. Here, we present a framework employing natural language processing, which automates text and image comprehension and precision knowledge extraction from inorganic glasses’ literature. T...
Main authors: | Vineeth Venugopal, Sourav Sahoo, Mohd Zaki, Manish Agarwal, Nitya Nand Gosvami, N. M. Anoop Krishnan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8276010/ https://www.ncbi.nlm.nih.gov/pubmed/34286304 http://dx.doi.org/10.1016/j.patter.2021.100290 |
_version_ | 1783721828645076992 |
---|---|
author | Venugopal, Vineeth Sahoo, Sourav Zaki, Mohd Agarwal, Manish Gosvami, Nitya Nand Krishnan, N. M. Anoop |
author_sort | Venugopal, Vineeth |
collection | PubMed |
description | Most of the knowledge in materials science literature is in the form of unstructured data such as text and images. Here, we present a framework employing natural language processing, which automates text and image comprehension and precision knowledge extraction from inorganic glasses’ literature. The abstracts are automatically categorized using latent Dirichlet allocation (LDA) to classify and search semantically linked publications. Similarly, a comprehensive summary of images and plots is presented using the caption cluster plot (CCP), providing direct access to images buried in the papers. Finally, we combine the LDA and CCP with chemical elements to present an elemental map, a topical and image-wise distribution of elements occurring in the literature. Overall, the framework presented here can be a generic and powerful tool to extract and disseminate material-specific information on composition–structure–processing–property dataspaces, allowing insights into fundamental problems relevant to the materials science community and accelerated materials discovery. |
format | Online Article Text |
id | pubmed-8276010 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-8276010 2021-07-19 Looking through glass: Knowledge discovery from materials science literature using natural language processing Venugopal, Vineeth Sahoo, Sourav Zaki, Mohd Agarwal, Manish Gosvami, Nitya Nand Krishnan, N. M. Anoop Patterns (N Y) Article Most of the knowledge in materials science literature is in the form of unstructured data such as text and images. Here, we present a framework employing natural language processing, which automates text and image comprehension and precision knowledge extraction from inorganic glasses’ literature. The abstracts are automatically categorized using latent Dirichlet allocation (LDA) to classify and search semantically linked publications. Similarly, a comprehensive summary of images and plots is presented using the caption cluster plot (CCP), providing direct access to images buried in the papers. Finally, we combine the LDA and CCP with chemical elements to present an elemental map, a topical and image-wise distribution of elements occurring in the literature. Overall, the framework presented here can be a generic and powerful tool to extract and disseminate material-specific information on composition–structure–processing–property dataspaces, allowing insights into fundamental problems relevant to the materials science community and accelerated materials discovery. Elsevier 2021-06-24 /pmc/articles/PMC8276010/ /pubmed/34286304 http://dx.doi.org/10.1016/j.patter.2021.100290 Text en © 2021 The Authors https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). |
title | Looking through glass: Knowledge discovery from materials science literature using natural language processing |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8276010/ https://www.ncbi.nlm.nih.gov/pubmed/34286304 http://dx.doi.org/10.1016/j.patter.2021.100290 |