Data driven identification of international cutting edge science and technologies using SpaCy

Bibliographic Details
Main authors: Hu, Chunqi; Gong, Huaping; He, Yiqing
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9555621/
https://www.ncbi.nlm.nih.gov/pubmed/36223420
http://dx.doi.org/10.1371/journal.pone.0275872
author Hu, Chunqi
Gong, Huaping
He, Yiqing
collection PubMed
description Difficulties in collecting, processing, and identifying massive data have slowed research on cutting-edge science and technology hotspots. Promoting these technologies will not succeed without an effective data-driven method for identifying them. This paper proposes a data-driven model, based on SpaCy, for identifying global cutting-edge science and technologies. For this model, we used a Python web crawler to collect data released by 17 well-known American technology media websites from July 2019 to July 2020. We combined graph-based neural network learning with active learning as the research method. We then used ten-fold cross-validation to train the model over repeated experiments. The experimental results show that the model performed very well in entity recognition tasks, with an F value of 98.11%. The model provides an information source for cutting-edge technology identification. By identifying and tracking cutting-edge technologies effectively, it can promote innovation in them and support more efficient modes of scientific and technological research.
format Online
Article
Text
id pubmed-9555621
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-9555621 2022-10-13 Data driven identification of international cutting edge science and technologies using SpaCy Hu, Chunqi Gong, Huaping He, Yiqing PLoS One Research Article Public Library of Science 2022-10-12 /pmc/articles/PMC9555621/ /pubmed/36223420 http://dx.doi.org/10.1371/journal.pone.0275872 Text en © 2022 Hu et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Data driven identification of international cutting edge science and technologies using SpaCy
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9555621/
https://www.ncbi.nlm.nih.gov/pubmed/36223420
http://dx.doi.org/10.1371/journal.pone.0275872
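
The description in this record reports a ten-fold evaluation procedure and an F value for entity recognition, but the record itself gives no implementation detail. The following is only a minimal sketch of how such an evaluation could be computed, not the authors' code. It assumes that entities are compared as exact (start, end, label) matches and that folds are formed by simple striding over document indices. The function names and the labels "TECH" and "ORG" are hypothetical and do not come from the paper.

```python
from typing import List, Set, Tuple

# An entity is (start_char, end_char, label); exact-match scoring is an assumption.
Entity = Tuple[int, int, str]


def f_score(gold: List[Set[Entity]], pred: List[Set[Entity]]) -> Tuple[float, float, float]:
    """Return (precision, recall, F) over per-document gold and predicted entity sets."""
    true_pos = sum(len(g & p) for g, p in zip(gold, pred))
    n_pred = sum(len(p) for p in pred)
    n_gold = sum(len(g) for g in gold)
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    total = precision + recall
    f = 2 * precision * recall / total if total else 0.0
    return precision, recall, f


def k_fold_indices(n_items: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n_items-1 into k (train, held_out) pairs by striding."""
    folds = [list(range(i, n_items, k)) for i in range(k)]
    splits = []
    for i in range(k):
        held_out = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, held_out))
    return splits


# Hypothetical single-document example: one of two predictions matches the gold set.
gold_example = [{(0, 5, "TECH"), (10, 15, "ORG")}]
pred_example = [{(0, 5, "TECH"), (20, 25, "TECH")}]
precision, recall, f = f_score(gold_example, pred_example)

# Ten (train, held_out) splits over 25 hypothetical documents.
splits = k_fold_indices(25, k=10)
```

In a ten-fold setup, a model would be trained on each `train` list and scored on the matching `held_out` list, and the ten F values would then be averaged. How the authors aggregated their results is not stated in this record.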