
Flexible and Accessible Workflows for Improved Proteogenomic Analysis Using the Galaxy Framework

Proteogenomics combines large-scale genomic and transcriptomic data with mass-spectrometry-based proteomic data to discover novel protein sequence variants and improve genome annotation. In contrast with conventional proteomic applications, proteogenomic analysis requires a number...


Bibliographic Details
Main Authors: Jagtap, Pratik D., Johnson, James E., Onsongo, Getiria, Sadler, Fredrik W., Murray, Kevin, Wang, Yuanbo, Shenykman, Gloria M., Bandhakavi, Sricharan, Smith, Lloyd M., Griffin, Timothy J.
Format: Online Article Text
Language: English
Published: American Chemical Society 2014
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4261978/
https://www.ncbi.nlm.nih.gov/pubmed/25301683
http://dx.doi.org/10.1021/pr500812t
collection PubMed
description Proteogenomics combines large-scale genomic and transcriptomic data with mass-spectrometry-based proteomic data to discover novel protein sequence variants and improve genome annotation. In contrast with conventional proteomic applications, proteogenomic analysis requires a number of additional data processing steps. Ideally, these required steps would be integrated and automated via a single software platform offering accessibility for wet-bench researchers as well as flexibility for user-specific customization and integration of new software tools as they emerge. Toward this end, we have extended the Galaxy bioinformatics framework to facilitate proteogenomic analysis. Using analysis of whole human saliva as an example, we demonstrate Galaxy’s flexibility through the creation of a modular workflow incorporating both established and customized software tools that improve depth and quality of proteogenomic results. Our customized Galaxy-based software includes automated, batch-mode BLASTP searching and a Peptide Sequence Match Evaluator tool, both useful for evaluating the veracity of putative novel peptide identifications. Our complex workflow (approximately 140 steps) can be easily shared using built-in Galaxy functions, enabling its use and customization by others. Our results provide a blueprint for the establishment of the Galaxy framework as an ideal solution for the emerging field of proteogenomics.
format Online
Article
Text
id pubmed-4261978
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-4261978 2015-10-10 J Proteome Res American Chemical Society 2014-10-10 2014-12-05 /pmc/articles/PMC4261978/ /pubmed/25301683 http://dx.doi.org/10.1021/pr500812t Text en Copyright © 2014 American Chemical Society. This is an open access article published under an ACS AuthorChoice License (http://pubs.acs.org/page/policy/authorchoice_termsofuse.html), which permits copying and redistribution of the article or any adaptations for non-commercial purposes.