MUSI: an integrated system for identifying multiple specificity from very large peptide or nucleic acid data sets

Bibliographic Details
Main Authors: Kim, TaeHyung, Tyndel, Marc S., Huang, Haiming, Sidhu, Sachdev S., Bader, Gary D., Gfeller, David, Kim, Philip M.
Format: Online Article Text
Language: English
Published: Oxford University Press 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3315295/
https://www.ncbi.nlm.nih.gov/pubmed/22210894
http://dx.doi.org/10.1093/nar/gkr1294
author Kim, TaeHyung
Tyndel, Marc S.
Huang, Haiming
Sidhu, Sachdev S.
Bader, Gary D.
Gfeller, David
Kim, Philip M.
collection PubMed
description Peptide recognition domains and transcription factors play crucial roles in cellular signaling. They bind linear stretches of amino acids or nucleotides, respectively, with high specificity. Experimental techniques that assess the binding specificity of these domains, such as microarrays or phage display, can retrieve thousands of distinct ligands, providing detailed insight into binding specificity. In particular, the advent of next-generation sequencing has recently increased the throughput of such methods by several orders of magnitude. These advances have helped reveal the presence of distinct binding specificity classes that co-exist within a set of ligands interacting with the same target. Here, we introduce a software system called MUSI that can rapidly analyze very large data sets of binding sequences to determine the relevant binding specificity patterns. Our pipeline provides two major advances. First, it can detect previously unrecognized multiple specificity patterns in any data set. Second, it offers integrated processing of very large data sets from next-generation sequencing machines. The results are visualized as multiple sequence logos describing the different binding preferences of the protein under investigation. We demonstrate the performance of MUSI by analyzing recent phage display data for human SH3 domains as well as microarray data for mouse transcription factors.
format Online
Article
Text
id pubmed-3315295
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3315295 2012-03-30
Nucleic Acids Res
Methods Online
Oxford University Press 2012-03 2011-12-31
/pmc/articles/PMC3315295/
/pubmed/22210894
http://dx.doi.org/10.1093/nar/gkr1294
Text en
© The Author(s) 2011. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title MUSI: an integrated system for identifying multiple specificity from very large peptide or nucleic acid data sets
topic Methods Online