
pyResearchInsights—An open‐source Python package for scientific text analysis

1. With an increasing number of scientific articles published each year, there is a need to synthesize and obtain insights across ever‐growing volumes of literature. Here, we present pyResearchInsights, a novel open‐source automated content analysis package that can be used to analyze scientific abs...


Bibliographic Details
Main Authors: Shetty, Sarthak J., Ramesh, Vijay
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8525079/
https://www.ncbi.nlm.nih.gov/pubmed/34707828
http://dx.doi.org/10.1002/ece3.8098
_version_ 1784585617159487488
author Shetty, Sarthak J.
Ramesh, Vijay
author_facet Shetty, Sarthak J.
Ramesh, Vijay
author_sort Shetty, Sarthak J.
collection PubMed
description 1. With an increasing number of scientific articles published each year, there is a need to synthesize and obtain insights across ever‐growing volumes of literature. Here, we present pyResearchInsights, a novel open‐source automated content analysis package that can be used to analyze scientific abstracts within a natural language processing framework. 2. The package collects abstracts from scientific repositories, identifies topics of research discussed in these abstracts, and presents interactive concept maps to visualize these research topics. To showcase the utilities of this package, we present two examples, specific to the field of ecology and conservation biology. 3. First, we demonstrate the end‐to‐end functionality of the package by presenting topics of research discussed in 1,131 abstracts pertaining to birds of the Tropical Andes. Our results suggest that a large proportion of avian research in this biodiversity hotspot pertains to species distributions, climate change, and plant ecology. 4. Second, we retrieved and analyzed 22,561 abstracts across eight journals in the field of conservation biology to identify twelve global topics of conservation research. Our analysis shows that conservation policy and landscape ecology are focal topics of research. We further examined how these conservation‐associated research topics varied across five biodiversity hotspots. 5. Lastly, we compared the utilities of this package with existing tools that carry out automated content analysis, and we show that our open‐source package has wider functionality and provides end‐to‐end utilities that seldom exist across other tools.
format Online
Article
Text
id pubmed-8525079
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-8525079 2021-10-26 pyResearchInsights—An open‐source Python package for scientific text analysis Shetty, Sarthak J. Ramesh, Vijay Ecol Evol Original Research John Wiley and Sons Inc. 2021-09-17 /pmc/articles/PMC8525079/ /pubmed/34707828 http://dx.doi.org/10.1002/ece3.8098 Text en © 2021 The Authors.
Ecology and Evolution published by John Wiley & Sons Ltd. This is an open access article under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
spellingShingle Original Research
Shetty, Sarthak J.
Ramesh, Vijay
pyResearchInsights—An open‐source Python package for scientific text analysis
title pyResearchInsights—An open‐source Python package for scientific text analysis
topic Original Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8525079/
https://www.ncbi.nlm.nih.gov/pubmed/34707828
http://dx.doi.org/10.1002/ece3.8098
work_keys_str_mv AT shettysarthakj pyresearchinsightsanopensourcepythonpackageforscientifictextanalysis
AT rameshvijay pyresearchinsightsanopensourcepythonpackageforscientifictextanalysis
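
The abstract in this record describes a pipeline that collects abstracts and identifies the research topics discussed in them. As a rough illustration of the "identify topics" idea only, the sketch below ranks terms by how many abstracts they appear in. It is a deliberately simplified keyword count, not topic modeling, and it does not use or reproduce the pyResearchInsights API. The function names, the stop-word list, and the toy abstracts are all hypothetical and were invented for this example.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption; real pipelines use larger lists).
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "on", "the", "to", "with", "for"}


def tokenize(text):
    """Lowercase the text and return alphabetic tokens that are not stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def top_terms(abstracts, n=3):
    """Return the n terms that occur in the most abstracts (document frequency)."""
    doc_freq = Counter()
    for abstract in abstracts:
        # Use a set so each term is counted at most once per abstract.
        doc_freq.update(set(tokenize(abstract)))
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n]


# Hypothetical toy abstracts, invented purely for illustration.
toy_abstracts = [
    "Climate change shifts species distributions in the Andes.",
    "Species distributions of birds respond to climate.",
    "Plant ecology and climate interact in montane forests.",
]

print(top_terms(toy_abstracts, n=2))
```

Counting document frequency rather than raw occurrences keeps a single long abstract from dominating the ranking. A real analysis would replace this step with a proper topic model fitted to the full corpus.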