
Mining Proteome Research Reports: A Bird’s Eye View


Bibliographic Details
Main Author: Sahu, Jagajjit
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8293458/
https://www.ncbi.nlm.nih.gov/pubmed/34200663
http://dx.doi.org/10.3390/proteomes9020029
_version_ 1783725042278858752
author Sahu, Jagajjit
author_facet Sahu, Jagajjit
author_sort Sahu, Jagajjit
collection PubMed
description The complexity of data has burgeoned to such an extent that scientists of every realm are encountering the incessant challenge of data management. Modern-day analytical approaches with the help of free source tools and programming languages have facilitated access to the context of the various domains as well as specific works reported. Here, with this article, an attempt has been made to provide a systematic analysis of all the available reports at PubMed on Proteome using text mining. The work is comprised of scientometrics as well as information extraction to provide the publication trends as well as frequent keywords, bioconcepts and, most importantly, a gene–gene co-occurrence network. Out of 33,028 PMIDs collected initially, the segregation of 24,350 articles under 28 Medical Subject Headings (MeSH) was analyzed and plotted. Keyword link network and density visualizations were provided for the top 1000 frequent MeSH keywords. PubTator was used, and 322,026 bioconcepts were extracted under 10 classes (such as Gene, Disease, CellLine, etc.). Co-occurrence networks were constructed for PMID–bioconcept as well as bioconcept–bioconcept associations. Further, for creation of a subnetwork with respect to gene–gene co-occurrence, a total of 11,100 unique genes participated, with mTOR and AKT showing the highest (64) number of connections. The gene p53 was the most popular one in the network in accordance with both the degree and weighted degree centrality, which were 425 and 1414, respectively. The present piece of study is an amalgam of bibliometrics and scientific data mining methods looking deeper into the whole scale analysis of available literature on proteome.
format Online
Article
Text
id pubmed-8293458
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8293458 2021-07-22 Mining Proteome Research Reports: A Bird’s Eye View Sahu, Jagajjit Proteomes Article The complexity of data has burgeoned to such an extent that scientists of every realm are encountering the incessant challenge of data management. Modern-day analytical approaches with the help of free source tools and programming languages have facilitated access to the context of the various domains as well as specific works reported. Here, with this article, an attempt has been made to provide a systematic analysis of all the available reports at PubMed on Proteome using text mining. The work is comprised of scientometrics as well as information extraction to provide the publication trends as well as frequent keywords, bioconcepts and, most importantly, a gene–gene co-occurrence network. Out of 33,028 PMIDs collected initially, the segregation of 24,350 articles under 28 Medical Subject Headings (MeSH) was analyzed and plotted. Keyword link network and density visualizations were provided for the top 1000 frequent MeSH keywords. PubTator was used, and 322,026 bioconcepts were extracted under 10 classes (such as Gene, Disease, CellLine, etc.). Co-occurrence networks were constructed for PMID–bioconcept as well as bioconcept–bioconcept associations. Further, for creation of a subnetwork with respect to gene–gene co-occurrence, a total of 11,100 unique genes participated, with mTOR and AKT showing the highest (64) number of connections. The gene p53 was the most popular one in the network in accordance with both the degree and weighted degree centrality, which were 425 and 1414, respectively. The present piece of study is an amalgam of bibliometrics and scientific data mining methods looking deeper into the whole scale analysis of available literature on proteome. MDPI 2021-06-10 /pmc/articles/PMC8293458/ /pubmed/34200663 http://dx.doi.org/10.3390/proteomes9020029 Text en © 2021 by the author.
https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Sahu, Jagajjit
Mining Proteome Research Reports: A Bird’s Eye View
title Mining Proteome Research Reports: A Bird’s Eye View
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8293458/
https://www.ncbi.nlm.nih.gov/pubmed/34200663
http://dx.doi.org/10.3390/proteomes9020029
work_keys_str_mv AT sahujagajjit miningproteomeresearchreportsabirdseyeview