
A Grid-based solution for management and analysis of microarrays in distributed experiments

Several systems have been presented in recent years to manage the complexity of large microarray experiments. Although good results have been achieved, most systems fall short in one or more areas. A Grid-based approach may provide a shared, standardized and reliable solution for stora...


Bibliographic Details
Main Authors: Porro, Ivan, Torterolo, Livia, Corradi, Luca, Fato, Marco, Papadimitropoulos, Adam, Scaglione, Silvia, Schenone, Andrea, Viti, Federica
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1885859/
https://www.ncbi.nlm.nih.gov/pubmed/17430574
http://dx.doi.org/10.1186/1471-2105-8-S1-S7
_version_ 1782133658004488192
author Porro, Ivan
Torterolo, Livia
Corradi, Luca
Fato, Marco
Papadimitropoulos, Adam
Scaglione, Silvia
Schenone, Andrea
Viti, Federica
author_sort Porro, Ivan
collection PubMed
description Several systems have been presented in recent years to manage the complexity of large microarray experiments. Although good results have been achieved, most systems fall short in one or more areas. A Grid-based approach may provide a shared, standardized and reliable solution for the storage and analysis of biological data, in order to maximize the results of experimental efforts. A Grid framework has therefore been adopted because of the need to access large amounts of distributed data remotely and to scale computational performance to terabyte datasets. Two different biological studies have been planned in order to highlight the benefits that can emerge from our Grid-based platform. The described environment relies on storage services and computational services provided by the gLite Grid middleware. The Grid environment is also able to exploit the added value of metadata to let users better classify and search experiments. A state-of-the-art Grid portal has been implemented to hide the complexity of the framework from end users and to give them easy access to the available services and data. The functional architecture of the portal is described. As a first test of system performance, a gene expression analysis has been performed on a dataset of Affymetrix GeneChip® Rat Expression Array RAE230A, from the ArrayExpress database. The analysis sequence includes three steps: (i) group opening and image set uploading, (ii) normalization, and (iii) model-based gene expression (based on the PM/MM difference model). Two different Linux versions (sequential and parallel) of the dChip software have been developed to implement the analysis and have been tested on a cluster. The results show that parallelizing the analysis process and executing parallel jobs on distributed computational resources improve performance.
Moreover, the Grid environment has been tested both for the possibility of uploading and accessing distributed datasets through the Grid middleware and for its ability to manage the execution of jobs on distributed computational resources. Results from the Grid test will be discussed in a further paper.
format Text
id pubmed-1885859
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1885859 2007-06-05 A Grid-based solution for management and analysis of microarrays in distributed experiments Porro, Ivan Torterolo, Livia Corradi, Luca Fato, Marco Papadimitropoulos, Adam Scaglione, Silvia Schenone, Andrea Viti, Federica BMC Bioinformatics Research Several systems have been presented in recent years to manage the complexity of large microarray experiments. Although good results have been achieved, most systems fall short in one or more areas. A Grid-based approach may provide a shared, standardized and reliable solution for the storage and analysis of biological data, in order to maximize the results of experimental efforts. A Grid framework has therefore been adopted because of the need to access large amounts of distributed data remotely and to scale computational performance to terabyte datasets. Two different biological studies have been planned in order to highlight the benefits that can emerge from our Grid-based platform. The described environment relies on storage services and computational services provided by the gLite Grid middleware. The Grid environment is also able to exploit the added value of metadata to let users better classify and search experiments. A state-of-the-art Grid portal has been implemented to hide the complexity of the framework from end users and to give them easy access to the available services and data. The functional architecture of the portal is described. As a first test of system performance, a gene expression analysis has been performed on a dataset of Affymetrix GeneChip® Rat Expression Array RAE230A, from the ArrayExpress database. The analysis sequence includes three steps: (i) group opening and image set uploading, (ii) normalization, and (iii) model-based gene expression (based on the PM/MM difference model).
Two different Linux versions (sequential and parallel) of the dChip software have been developed to implement the analysis and have been tested on a cluster. The results show that parallelizing the analysis process and executing parallel jobs on distributed computational resources improve performance. Moreover, the Grid environment has been tested both for the possibility of uploading and accessing distributed datasets through the Grid middleware and for its ability to manage the execution of jobs on distributed computational resources. Results from the Grid test will be discussed in a further paper. BioMed Central 2007-03-08 /pmc/articles/PMC1885859/ /pubmed/17430574 http://dx.doi.org/10.1186/1471-2105-8-S1-S7 Text en Copyright © 2007 Porro et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A Grid-based solution for management and analysis of microarrays in distributed experiments
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1885859/
https://www.ncbi.nlm.nih.gov/pubmed/17430574
http://dx.doi.org/10.1186/1471-2105-8-S1-S7