Assembling proteomics data as a prerequisite for the analysis of large scale experiments

BACKGROUND: Despite the complete determination of the genome sequence of a huge number of bacteria, their proteomes remain relatively poorly defined. Besides new methods to increase the number of identified proteins, new database applications are necessary to store and present results of large-scale...

Bibliographic Details
Main authors: Schmidt, Frank, Schmid, Monika, Thiede, Bernd, Pleißner, Klaus-Peter, Böhme, Martina, Jungblut, Peter R
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2653022/
https://www.ncbi.nlm.nih.gov/pubmed/19166578
http://dx.doi.org/10.1186/1752-153X-3-2
_version_ 1782165258828251136
author Schmidt, Frank
Schmid, Monika
Thiede, Bernd
Pleißner, Klaus-Peter
Böhme, Martina
Jungblut, Peter R
author_sort Schmidt, Frank
collection PubMed
description BACKGROUND: Despite the complete determination of the genome sequence of a huge number of bacteria, their proteomes remain relatively poorly defined. Besides new methods to increase the number of identified proteins, new database applications are necessary to store and present the results of large-scale proteomics experiments. RESULTS: In the present study, a database concept has been developed to address these issues and to offer complete information via a web interface. In our concept, the Oracle-based data repository system SQL-LIMS plays the central role in the proteomics workflow and was applied to the proteomes of Mycobacterium tuberculosis, Helicobacter pylori, Salmonella typhimurium and protein complexes such as the 20S proteasome. Technical operations of our proteomics labs were used as the standard for SQL-LIMS template creation. By means of a Java-based data parser, post-processed data of different approaches, such as LC/ESI-MS, MALDI-MS and 2-D gel electrophoresis (2-DE), were stored in SQL-LIMS. A minimum set of the proteomics data was transferred to our public 2D-PAGE database using a Java-based interface (Data Transfer Tool, DTT) in accordance with the requirements of the PEDRo standardization. Furthermore, the stored proteomics data were extractable from SQL-LIMS via XML. CONCLUSION: The Oracle-based data repository system SQL-LIMS played the central role in the proteomics workflow concept. Technical operations of our proteomics labs were used as standards for SQL-LIMS templates. Using a Java-based parser, post-processed data of different approaches such as LC/ESI-MS, MALDI-MS, 1-DE and 2-DE were stored in SQL-LIMS. Thus, the distinct data formats of different instruments were unified and stored in SQL-LIMS tables. In addition, a unique submission identifier allowed fast access to all experimental data. This was the main advantage over multi-software solutions, especially when personnel turnover is high. Large-scale and high-throughput experiments must also be managed in a comprehensive repository system such as SQL-LIMS so that results can be queried in a systematic manner. On the other hand, these database systems are expensive and require at least one full-time administrator and a specialized lab manager. Furthermore, the rapid pace of technical change in proteomics may make it difficult to accommodate new data formats. To summarize, SQL-LIMS met the requirements of proteomics data handling, especially in skilled processes such as gel electrophoresis or mass spectrometry, and fulfilled the PSI standardization criteria. The data transfer into the public domain via DTT facilitated validation of proteomics data. Additionally, evaluation of mass spectra by post-processing with MS-Screener improved the reliability of mass analysis and prevented the storage of junk data.
format Text
id pubmed-2653022
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2653022 2009-03-10 Assembling proteomics data as a prerequisite for the analysis of large scale experiments Schmidt, Frank Schmid, Monika Thiede, Bernd Pleißner, Klaus-Peter Böhme, Martina Jungblut, Peter R Chem Cent J Research Article BioMed Central 2009-01-23 /pmc/articles/PMC2653022/ /pubmed/19166578 http://dx.doi.org/10.1186/1752-153X-3-2 Text en Copyright © 2009 Schmidt et al
title Assembling proteomics data as a prerequisite for the analysis of large scale experiments
topic Research Article