
The Text-mining based PubChem Bioassay neighboring analysis

BACKGROUND: In recent years, the number of High Throughput Screening (HTS) assays deposited in PubChem has grown quickly. As a result, the volume of both the structured information (i.e. molecular structure, bioactivities) and the unstructured information (such as descriptions of bioassay experiment...


Bibliographic Details
Main Authors:	Han, Lianyi; Suzek, Tugba O; Wang, Yanli; Bryant, Steve H
Format:	Text
Language:	English
Published:	BioMed Central 2010
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3098095/ https://www.ncbi.nlm.nih.gov/pubmed/21059237 http://dx.doi.org/10.1186/1471-2105-11-549

author	Han, Lianyi; Suzek, Tugba O; Wang, Yanli; Bryant, Steve H
collection	PubMed
description	BACKGROUND: In recent years, the number of High Throughput Screening (HTS) assays deposited in PubChem has grown quickly. As a result, the volume of both the structured information (i.e. molecular structure, bioactivities) and the unstructured information (such as descriptions of bioassay experiments) has been increasing exponentially. Consequently, it has become even more demanding and challenging to assemble the bioactivity data efficiently by mining this huge amount of information to identify and interpret the relationships among the diverse bioassay experiments. In this work, we propose a text-mining based approach for bioassay neighboring analysis from the unstructured text descriptions contained in the PubChem BioAssay database. RESULTS: The neighboring analysis is achieved by evaluating the cosine score of each bioassay pair and the fraction of overlaps among the human-curated neighbors. Our results from the cosine score distribution analysis and the assay neighbor clustering analysis on all PubChem bioassays suggest that strong correlations among the bioassays can be identified from their conceptual relevance. A comparison with existing assay neighboring methods suggests that the text-mining based bioassay neighboring approach provides meaningful linkages among the PubChem bioassays and complements the existing methods by identifying additional relationships among the bioassay entries. CONCLUSIONS: The text-mining based bioassay neighboring analysis is efficient for correlating bioassays and studying different aspects of a biological process, tasks that are otherwise difficult to achieve with existing neighboring procedures because specific annotations and structured information are lacking. It is suggested that the text-mining based bioassay neighboring analysis can be used as a standalone tool or as a complement to the PubChem bioassay neighboring process to enable efficient integration of assay results and to generate hypotheses for the discovery of bioactivities of the tested reagents.
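The cosine score mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the record does not say how descriptions were tokenized or weighted, so the example assumes lowercase alphanumeric tokens and raw term counts, and the function names, assay ids, sample descriptions, and threshold are all hypothetical.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Count lowercase alphanumeric tokens (an assumed tokenization)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_score(text_a, text_b):
    """Cosine of the angle between two raw term-count vectors."""
    a, b = term_counts(text_a), term_counts(text_b)
    # Only terms present in both texts contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty description has no neighbors
    return dot / (norm_a * norm_b)


def text_neighbors(descriptions, threshold):
    """Return (id_a, id_b, score) for every pair scoring at or above threshold."""
    ids = sorted(descriptions)
    pairs = []
    for i, id_a in enumerate(ids):
        for id_b in ids[i + 1:]:
            score = cosine_score(descriptions[id_a], descriptions[id_b])
            if score >= threshold:
                pairs.append((id_a, id_b, score))
    return pairs


# Hypothetical assay ids and descriptions, for illustration only.
sample = {
    1: "kinase inhibitor assay",
    2: "kinase inhibitor screen",
    3: "luciferase reporter",
}
print(text_neighbors(sample, threshold=0.5))
```

Scoring every pair is quadratic in the number of assays, so a sketch like this only suits small collections; a weighting scheme such as TF-IDF would likely change which pairs pass the threshold.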
format	Text
id	pubmed-3098095
institution	National Center for Biotechnology Information
language	English
publishDate	2010
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3098095 2011-07-08 The Text-mining based PubChem Bioassay neighboring analysis Han, Lianyi; Suzek, Tugba O; Wang, Yanli; Bryant, Steve H BMC Bioinformatics Research Article BioMed Central 2010-11-08 /pmc/articles/PMC3098095/ /pubmed/21059237 http://dx.doi.org/10.1186/1471-2105-11-549 Text en Copyright ©2010 Han et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	The Text-mining based PubChem Bioassay neighboring analysis
topic	Research Article
