
Development of a novel clustering tool for linear peptide sequences

Epitopes identified in large‐scale screens of overlapping peptides often share significant levels of sequence identity, complicating the analysis of epitope‐related data. Clustering algorithms are often used to facilitate these analyses, but available methods are generally insufficient in their capa...


Bibliographic Details
Main authors: Dhanda, Sandeep K., Vaughan, Kerrie, Schulten, Veronique, Grifoni, Alba, Weiskopf, Daniela, Sidney, John, Peters, Bjoern, Sette, Alessandro
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2018
Subjects: Original Articles
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6187223/
https://www.ncbi.nlm.nih.gov/pubmed/30014462
http://dx.doi.org/10.1111/imm.12984
collection PubMed
description Epitopes identified in large‐scale screens of overlapping peptides often share significant levels of sequence identity, complicating the analysis of epitope‐related data. Clustering algorithms are often used to facilitate these analyses, but available methods are generally insufficient in their capacity to define biologically meaningful epitope clusters in the context of the immune response. To fulfil this need we developed an algorithm that generates epitope clusters based on representative or consensus sequences. This tool allows the user to cluster peptide sequences on the basis of a specified level of identity by selecting among three different method options. These include the ‘clique method’, in which all members of the cluster must share the same minimal level of identity with each other, and the ‘connected graph method’, in which all members of a cluster must share a defined level of identity with at least one other member of the cluster. In cases where it is not possible to define a clear consensus sequence with the connected graph method, a third option provides a novel ‘cluster‐breaking algorithm’ for consensus sequence driven sub‐clustering. Herein we demonstrate the tool's clustering performance and applicability using (i) a selection of dengue virus epitopes for the ‘clique method’, (ii) sets of allergen‐derived peptides from related species for the ‘connected graph method’ and (iii) large data sets of eluted ligand, major histocompatibility complex binding and T‐cell recognition data captured within the Immune Epitope Database (IEDB) with the newly developed ‘cluster‐breaking algorithm’. This novel clustering tool is accessible at http://tools.iedb.org/cluster2/.
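The difference between the 'clique method' and the 'connected graph method' described above can be illustrated with a minimal Python sketch. This is not the IEDB tool's implementation. The function names, the example peptides and the position-by-position `identity` measure (no gaps or offsets, normalised by the longer peptide) are assumptions made for illustration only, and the clique step uses a simple greedy assignment rather than whatever strategy the tool applies. The 'cluster-breaking algorithm' is not sketched because the description does not specify how it works.

```python
from itertools import combinations


def identity(a: str, b: str) -> float:
    """Assumed identity measure: fraction of matching positions,
    compared position by position and normalised by the longer peptide."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def build_graph(peptides, threshold):
    """Link every pair of peptides whose identity meets the threshold.
    Assumes the peptide list contains no duplicates."""
    adjacency = {p: set() for p in peptides}
    for a, b in combinations(peptides, 2):
        if identity(a, b) >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def connected_graph_clusters(peptides, threshold):
    """Connected graph method: each member meets the threshold with at
    least one other member, so clusters are the connected components."""
    adjacency = build_graph(peptides, threshold)
    seen, clusters = set(), []
    for start in peptides:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            current = stack.pop()
            component.append(current)
            for neighbour in adjacency[current]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        clusters.append(sorted(component))
    return clusters


def clique_clusters(peptides, threshold):
    """Clique method (greedy sketch): a peptide joins the first cluster
    in which it meets the threshold with every existing member."""
    clusters = []
    for p in peptides:
        for cluster in clusters:
            if all(identity(p, member) >= threshold for member in cluster):
                cluster.append(p)
                break
        else:
            clusters.append([p])
    return clusters


# Hypothetical 10-mer peptides chosen to make the contrast visible.
PEPTIDES = ["AAAAAAAAAA", "AAAAAAAACC", "AAAAAACCCC", "GGGGGGGGGG"]

if __name__ == "__main__":
    print(connected_graph_clusters(PEPTIDES, 0.8))
    print(clique_clusters(PEPTIDES, 0.8))
```

With an 80% threshold, the first and third example peptides share only 60% identity, yet each is 80% identical to the second. The connected graph sketch therefore chains all three into one cluster, whereas the clique sketch keeps the third peptide separate, because not every pair within the cluster would meet the threshold.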
format Online
Article
Text
id pubmed-6187223
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-6187223 2018-10-22 Immunology Original Articles John Wiley and Sons Inc. 2018-08-06 2018-11 /pmc/articles/PMC6187223/ /pubmed/30014462 http://dx.doi.org/10.1111/imm.12984 Text en © 2018 The Authors. Immunology Published by John Wiley & Sons Ltd. This is an open access article under the terms of the http://creativecommons.org/licenses/by-nc/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
title Development of a novel clustering tool for linear peptide sequences
topic Original Articles