A Grid-based solution for management and analysis of microarrays in distributed experiments
Several systems have been presented in recent years to manage the complexity of large microarray experiments. Although good results have been achieved, most systems fall short in one or more areas. A Grid-based approach may provide a shared, standardized and reliable solution for stora...
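Main authors: | Porro, Ivan; Torterolo, Livia; Corradi, Luca; Fato, Marco; Papadimitropoulos, Adam; Scaglione, Silvia; Schenone, Andrea; Viti, Federica |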
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central, 2007 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1885859/ https://www.ncbi.nlm.nih.gov/pubmed/17430574 http://dx.doi.org/10.1186/1471-2105-8-S1-S7 |
_version_ | 1782133658004488192 |
---|---|
author | Porro, Ivan Torterolo, Livia Corradi, Luca Fato, Marco Papadimitropoulos, Adam Scaglione, Silvia Schenone, Andrea Viti, Federica |
author_facet | Porro, Ivan Torterolo, Livia Corradi, Luca Fato, Marco Papadimitropoulos, Adam Scaglione, Silvia Schenone, Andrea Viti, Federica |
author_sort | Porro, Ivan |
collection | PubMed |
description | Several systems have been presented in recent years to manage the complexity of large microarray experiments. Although good results have been achieved, most systems fall short in one or more areas. A Grid-based approach may provide a shared, standardized and reliable solution for storage and analysis of biological data, in order to maximize the results of experimental efforts. A Grid framework has therefore been adopted because of the need to remotely access large amounts of distributed data and to scale computational performance for terabyte datasets. Two different biological studies have been planned to highlight the benefits that can emerge from our Grid-based platform. The described environment relies on storage services and computational services provided by the gLite Grid middleware. The Grid environment can also exploit the added value of metadata to let users better classify and search experiments. A state-of-the-art Grid portal has been implemented to hide the complexity of the framework from end users and to let them easily access available services and data. The functional architecture of the portal is described. As a first test of system performance, a gene expression analysis has been performed on a dataset of Affymetrix GeneChip® Rat Expression Array RAE230A, from the ArrayExpress database. The analysis sequence includes three steps: (i) group opening and image set uploading, (ii) normalization, and (iii) model-based gene expression (based on the PM/MM difference model). Two different Linux versions (sequential and parallel) of the dChip software have been developed to implement the analysis and have been tested on a cluster. The results show that parallelizing the analysis process and executing parallel jobs on distributed computational resources improve performance. Moreover, the Grid environment has been tested both for the possibility of uploading and accessing distributed datasets through the Grid middleware and for its ability to manage the execution of jobs on distributed computational resources. Results from the Grid test will be discussed in a further paper. |
format | Text |
id | pubmed-1885859 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2007 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-1885859 2007-06-05 A Grid-based solution for management and analysis of microarrays in distributed experiments Porro, Ivan Torterolo, Livia Corradi, Luca Fato, Marco Papadimitropoulos, Adam Scaglione, Silvia Schenone, Andrea Viti, Federica BMC Bioinformatics Research Several systems have been presented in recent years to manage the complexity of large microarray experiments. Although good results have been achieved, most systems fall short in one or more areas. A Grid-based approach may provide a shared, standardized and reliable solution for storage and analysis of biological data, in order to maximize the results of experimental efforts. A Grid framework has therefore been adopted because of the need to remotely access large amounts of distributed data and to scale computational performance for terabyte datasets. Two different biological studies have been planned to highlight the benefits that can emerge from our Grid-based platform. The described environment relies on storage services and computational services provided by the gLite Grid middleware. The Grid environment can also exploit the added value of metadata to let users better classify and search experiments. A state-of-the-art Grid portal has been implemented to hide the complexity of the framework from end users and to let them easily access available services and data. The functional architecture of the portal is described. As a first test of system performance, a gene expression analysis has been performed on a dataset of Affymetrix GeneChip® Rat Expression Array RAE230A, from the ArrayExpress database. The analysis sequence includes three steps: (i) group opening and image set uploading, (ii) normalization, and (iii) model-based gene expression (based on the PM/MM difference model). Two different Linux versions (sequential and parallel) of the dChip software have been developed to implement the analysis and have been tested on a cluster. The results show that parallelizing the analysis process and executing parallel jobs on distributed computational resources improve performance. Moreover, the Grid environment has been tested both for the possibility of uploading and accessing distributed datasets through the Grid middleware and for its ability to manage the execution of jobs on distributed computational resources. Results from the Grid test will be discussed in a further paper. BioMed Central 2007-03-08 /pmc/articles/PMC1885859/ /pubmed/17430574 http://dx.doi.org/10.1186/1471-2105-8-S1-S7 Text en Copyright © 2007 Porro et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Porro, Ivan Torterolo, Livia Corradi, Luca Fato, Marco Papadimitropoulos, Adam Scaglione, Silvia Schenone, Andrea Viti, Federica A Grid-based solution for management and analysis of microarrays in distributed experiments |
title | A Grid-based solution for management and analysis of microarrays in distributed experiments |
title_full | A Grid-based solution for management and analysis of microarrays in distributed experiments |
title_fullStr | A Grid-based solution for management and analysis of microarrays in distributed experiments |
title_full_unstemmed | A Grid-based solution for management and analysis of microarrays in distributed experiments |
title_short | A Grid-based solution for management and analysis of microarrays in distributed experiments |
title_sort | grid-based solution for management and analysis of microarrays in distributed experiments |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1885859/ https://www.ncbi.nlm.nih.gov/pubmed/17430574 http://dx.doi.org/10.1186/1471-2105-8-S1-S7 |
work_keys_str_mv | AT porroivan agridbasedsolutionformanagementandanalysisofmicroarraysindistributedexperiments AT torterololivia agridbasedsolutionformanagementandanalysisofmicroarraysindistributedexperiments AT corradiluca agridbasedsolutionformanagementandanalysisofmicroarraysindistributedexperiments AT fatomarco agridbasedsolutionformanagementandanalysisofmicroarraysindistributedexperiments AT papadimitropoulosadam agridbasedsolutionformanagementandanalysisofmicroarraysindistributedexperiments AT scaglionesilvia agridbasedsolutionformanagementandanalysisofmicroarraysindistributedexperiments AT schenoneandrea agridbasedsolutionformanagementandanalysisofmicroarraysindistributedexperiments AT vitifederica agridbasedsolutionformanagementandanalysisofmicroarraysindistributedexperiments AT porroivan gridbasedsolutionformanagementandanalysisofmicroarraysindistributedexperiments AT torterololivia gridbasedsolutionformanagementandanalysisofmicroarraysindistributedexperiments AT corradiluca gridbasedsolutionformanagementandanalysisofmicroarraysindistributedexperiments AT fatomarco gridbasedsolutionformanagementandanalysisofmicroarraysindistributedexperiments AT papadimitropoulosadam gridbasedsolutionformanagementandanalysisofmicroarraysindistributedexperiments AT scaglionesilvia gridbasedsolutionformanagementandanalysisofmicroarraysindistributedexperiments AT schenoneandrea gridbasedsolutionformanagementandanalysisofmicroarraysindistributedexperiments AT vitifederica gridbasedsolutionformanagementandanalysisofmicroarraysindistributedexperiments |
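
The description field in this record mentions a model-based gene expression step built on a PM/MM difference model, run in sequential and parallel versions. As a rough illustration only, the sketch below assumes this refers to a multiplicative model of the form PM - MM = theta_i * phi_j (one expression value theta_i per array, one sensitivity phi_j per probe, with the phi_j scaled so their squares sum to the number of probes) fitted by alternating least squares. The function names `fit_difference_model` and `fit_all`, the iteration count, and the use of a thread pool are hypothetical choices made for this example; none of it is taken from the dChip code or from the record itself.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def fit_difference_model(pm, mm, iterations=20):
    """Fit (PM - MM)[i][j] ~ theta[i] * phi[j] by alternating least squares.

    pm, mm: lists of rows (one row per array, one column per probe)
    for a single probe set. Assumed model, not taken from the record.
    Returns (theta, phi) with sum(phi_j ** 2) == number of probes.
    """
    y = [[p - m for p, m in zip(pm_row, mm_row)]
         for pm_row, mm_row in zip(pm, mm)]
    n_arrays, n_probes = len(y), len(y[0])
    phi = [1.0] * n_probes  # start point already satisfies the constraint

    def expression_given(phi):
        # Least-squares theta for each array with phi held fixed.
        denom = sum(f * f for f in phi)
        return [sum(row[j] * phi[j] for j in range(n_probes)) / denom
                for row in y]

    theta = expression_given(phi)
    for _ in range(iterations):
        denom = sum(t * t for t in theta)
        if denom == 0.0:
            break  # all differences are zero; nothing to fit
        # Least-squares phi for each probe with theta held fixed.
        raw = [sum(y[i][j] * theta[i] for i in range(n_arrays)) / denom
               for j in range(n_probes)]
        norm = sum(f * f for f in raw)
        if norm == 0.0:
            break
        scale = math.sqrt(n_probes / norm)  # re-impose sum(phi^2) == n_probes
        phi = [f * scale for f in raw]
        theta = expression_given(phi)
    return theta, phi


def fit_all(probe_sets, workers=4):
    """Fit every probe set independently.

    probe_sets maps a probe-set name to a (pm, mm) pair. The thread pool
    only illustrates that the fits are independent tasks; real speedup
    would need separate processes or distributed jobs.
    """
    names = list(probe_sets)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        fits = pool.map(lambda name: fit_difference_model(*probe_sets[name]),
                        names)
        return dict(zip(names, fits))
```

Because each probe set is fitted independently of the others, the work splits naturally into separate jobs, which is the property a parallel or Grid execution would rely on.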