Data driven identification of international cutting edge science and technologies using SpaCy

Bibliographic Details
Main authors: Hu, Chunqi; Gong, Huaping; He, Yiqing
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9555621/
https://www.ncbi.nlm.nih.gov/pubmed/36223420
http://dx.doi.org/10.1371/journal.pone.0275872
Description
Summary: Difficulties in collecting, processing, and identifying massive data have slowed research on cutting-edge science and technology hotspots. Without an effective data-driven method for identifying cutting-edge technologies, efforts to promote them are unlikely to succeed. This paper proposes a data-driven model, based on SpaCy, for identifying global cutting-edge science and technologies. For this model, we used a Python web crawler to collect data published by 17 well-known American technology media websites from July 2019 to July 2020. The research method combines graph-based neural network learning with active learning. We then trained the model with ten-fold cross-validation over repeated experiments. The experimental results show that the model performed very well on entity recognition tasks, with an F-score of 98.11%. The model provides an information source for cutting-edge technology identification. Through effective identification and tracking, it can promote innovation in cutting-edge technologies and help explore more efficient modes of scientific and technological research.
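
As an illustration only, and not the authors' code, the ten-fold evaluation and the F measure reported in the summary could be sketched roughly as follows. The function names `k_fold_indices` and `entity_f1`, the `(start, end, label)` entity representation, and the labels `TECH` and `ORG` are all hypothetical choices made for this sketch; the record does not say how the paper splits its data or matches entities.

```python
from typing import List, Set, Tuple

# Hypothetical entity representation: (start_char, end_char, label).
Entity = Tuple[int, int, str]


def k_fold_indices(n_items: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_items) into k (train, test) index pairs.

    Items are assigned to folds round-robin; each item lands in exactly
    one test fold and in the training set of the other k - 1 folds.
    """
    folds = [list(range(i, n_items, k)) for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


def entity_f1(gold: List[Set[Entity]], pred: List[Set[Entity]]) -> float:
    """Entity-level F1: a prediction counts only on an exact span and label match."""
    true_pos = sum(len(g & p) for g, p in zip(gold, pred))
    n_pred = sum(len(p) for p in pred)
    n_gold = sum(len(g) for g in gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / n_pred
    recall = true_pos / n_gold
    return 2 * precision * recall / (precision + recall)


# Usage: one document with two gold entities, one of which is predicted correctly.
gold_docs = [{(0, 5, "TECH"), (10, 15, "ORG")}]
pred_docs = [{(0, 5, "TECH"), (20, 25, "ORG")}]
score = entity_f1(gold_docs, pred_docs)  # precision 0.5, recall 0.5
```

In a full pipeline, each `(train, test)` pair would drive one round of training and scoring of the entity recognizer, and the per-fold scores would be averaged. Exact-match scoring is one common convention; a partial-overlap criterion would give different numbers.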