
MapReduce implementation of a hybrid spectral library-database search method for large-scale peptide identification

Summary: A MapReduce-based implementation called MR-MSPolygraph for parallelizing peptide identification from mass spectrometry data is presented. The underlying serial method, MSPolygraph, uses a novel hybrid approach to match an experimental spectrum against a combination of a protein sequence dat...


Bibliographic Details
Main authors: Kalyanaraman, Ananth, Cannon, William R., Latt, Benjamin, Baxter, Douglas J.
Format: Online Article Text
Language: English
Published: Oxford University Press 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3198583/
https://www.ncbi.nlm.nih.gov/pubmed/21926122
http://dx.doi.org/10.1093/bioinformatics/btr523
_version_ 1782214454062088192
author Kalyanaraman, Ananth
Cannon, William R.
Latt, Benjamin
Baxter, Douglas J.
collection PubMed
description Summary: A MapReduce-based implementation called MR-MSPolygraph for parallelizing peptide identification from mass spectrometry data is presented. The underlying serial method, MSPolygraph, uses a novel hybrid approach to match an experimental spectrum against a combination of a protein sequence database and a spectral library. Our MapReduce implementation can run on any Hadoop cluster environment. Experimental results demonstrate that, relative to the serial version, MR-MSPolygraph reduces the time to solution from weeks to hours, for processing tens of thousands of experimental spectra. Speedup and other related performance studies are also reported on a 400-core Hadoop cluster using spectral datasets from environmental microbial communities as inputs. Availability: The source code along with user documentation are available on http://compbio.eecs.wsu.edu/MR-MSPolygraph. Contact: ananth@eecs.wsu.edu; william.cannon@pnnl.gov Supplementary Information: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-3198583
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3198583 2011-10-23 Bioinformatics Applications Note Oxford University Press 2011-11-01 2011-09-16 /pmc/articles/PMC3198583/ /pubmed/21926122 http://dx.doi.org/10.1093/bioinformatics/btr523 Text en © The Author(s) 2011. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/2.5 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title MapReduce implementation of a hybrid spectral library-database search method for large-scale peptide identification
topic Applications Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3198583/
https://www.ncbi.nlm.nih.gov/pubmed/21926122
http://dx.doi.org/10.1093/bioinformatics/btr523