
A bioinformatics knowledge discovery in text application for grid computing

BACKGROUND: A fundamental activity in biomedical research is Knowledge Discovery which has the ability to search through large amounts of biomedical information such as documents and data. High performance computational infrastructures, such as Grid technologies, are emerging as a possible infrastru...


Bibliographic Details
Main authors:	Castellano, Marcello; Mastronardi, Giuseppe; Bellotti, Roberto; Tarricone, Gianfranco
Format:	Text
Language:	English
Published:	BioMed Central 2009
Subjects:	Proceedings
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2697647/ https://www.ncbi.nlm.nih.gov/pubmed/19534749 http://dx.doi.org/10.1186/1471-2105-10-S6-S23

author	Castellano, Marcello; Mastronardi, Giuseppe; Bellotti, Roberto; Tarricone, Gianfranco
collection	PubMed
description	BACKGROUND: Knowledge Discovery, the search through large amounts of biomedical information such as documents and data, is a fundamental activity in biomedical research. High-performance computational infrastructures, such as Grid technologies, are emerging as a possible way to handle the intensive use of Information and Communication Technology (ICT) resources in life science. The goal of this work was to develop a software middleware solution that lets the many knowledge discovery applications run on scalable, distributed computing systems and make intensive use of ICT resources. METHODS: We present the development of a grid application for Knowledge Discovery in Text (KDT) using a middleware-based methodology. The system must be able to execute a user application model and process its jobs so as to create many parallel jobs for distribution across the computational nodes. It must also be aware of the available computational resources and their status, and be able to monitor the execution of the parallel jobs. These operational requirements led to the design of a middleware that is specialized through user application modules. It included a graphical user interface giving access to a node search system, a load balancing system, and a transfer optimizer that reduces communication costs. RESULTS: A prototype of the middleware solution is presented, together with an evaluation of its performance in terms of the speed-up factor. The prototype was written in Java on Globus Toolkit 4, with the grid infrastructure built from GNU/Linux nodes. A test was carried out, and results are shown for a named entity recognition search for symptoms and pathologies, applied to a collection of 5,000 scientific documents taken from PubMed. CONCLUSION: This paper discusses the development of a grid application based on a middleware solution. It has been tested on a knowledge discovery in text process that extracts new and useful information about symptoms and pathologies from a large collection of unstructured scientific documents. As an example, a Knowledge Discovery in Database computation was applied to the output of the KDT user module to extract new knowledge about symptom and pathology bio-entities.
format	Text
id	pubmed-2697647
institution	National Center for Biotechnology Information
language	English
publishDate	2009
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-2697647 2009-06-16 BMC Bioinformatics Proceedings BioMed Central 2009-06-16 /pmc/articles/PMC2697647/ /pubmed/19534749 http://dx.doi.org/10.1186/1471-2105-10-S6-S23 Text en Copyright © 2009 Castellano et al; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	A bioinformatics knowledge discovery in text application for grid computing
topic	Proceedings
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2697647/ https://www.ncbi.nlm.nih.gov/pubmed/19534749 http://dx.doi.org/10.1186/1471-2105-10-S6-S23
