Pairwise sequence similarity mapping with PaSiMap: Reclassification of immunoglobulin domains from titin as case study

Sequence comparison is critical for the functional assignment of newly identified protein genes. As uncharacterized protein sequences accumulate, there is an increasing need for sensitive tools for their classification. Here, we present a novel multidimensional scaling pipeline, PaSiMap, which creat...

Bibliographic Details
Main authors: Su, Kathy, Mayans, Olga, Diederichs, Kay, Fleming, Jennifer R.
Format: Online Article Text
Language: English
Published: Research Network of Computational and Structural Biotechnology 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9529554/
https://www.ncbi.nlm.nih.gov/pubmed/36212532
http://dx.doi.org/10.1016/j.csbj.2022.09.034
_version_ 1784801520793944064
author Su, Kathy
Mayans, Olga
Diederichs, Kay
Fleming, Jennifer R.
author_sort Su, Kathy
collection PubMed
description Sequence comparison is critical for the functional assignment of newly identified protein genes. As uncharacterized protein sequences accumulate, there is an increasing need for sensitive tools for their classification. Here, we present a novel multidimensional scaling pipeline, PaSiMap, which creates a map of pairwise sequence similarities. Uniquely, PaSiMap distinguishes between unique and shared features, allowing for a distinct view of protein-sequence relationships. We demonstrate PaSiMap’s efficiency in detecting sequence groups and outliers using titin’s 169 immunoglobulin (Ig) domains. We show that Ig domain similarity is hierarchical, being firstly determined by chain location, then by the loop features of the Ig fold and, finally, by super-repeat position. The existence of a previously unidentified domain repeat in the distal, constitutive I-band is revealed. Prototypic Igs, plus notable outliers, are identified and thereby domain classification improved. This re-classification can now guide future molecular research. In summary, we demonstrate that PaSiMap is a sensitive tool for the classification of protein sequences, which adds a new perspective in the understanding of inter-protein relationships. PaSiMap is applicable to any biological system defined by a linear sequence, including polynucleotide chains.
format Online
Article
Text
id pubmed-9529554
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Research Network of Computational and Structural Biotechnology
record_format MEDLINE/PubMed
spelling pubmed-9529554 2022-10-06 Comput Struct Biotechnol J Research Article Research Network of Computational and Structural Biotechnology 2022-09-26 /pmc/articles/PMC9529554/ /pubmed/36212532 http://dx.doi.org/10.1016/j.csbj.2022.09.034 Text en © 2022 The Authors. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Pairwise sequence similarity mapping with PaSiMap: Reclassification of immunoglobulin domains from titin as case study
topic Research Article