SeqCP: A sequence-based algorithm for searching circularly permuted proteins

Circular permutation (CP) is a protein sequence rearrangement in which the amino- and carboxyl-termini of a protein can be created in different positions along the imaginary circularized sequence. Circularly permutated proteins usually exhibit conserved three-dimensional structures and functions. By...

Bibliographic Details
Main Authors: Chen, Chi-Chun; Huang, Yu-Wei; Huang, Hsuan-Cheng; Lo, Wei-Cheng; Lyu, Ping-Chiang
Format: Online Article Text
Language: English
Published: Research Network of Computational and Structural Biotechnology 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9763678/
https://www.ncbi.nlm.nih.gov/pubmed/36582435
http://dx.doi.org/10.1016/j.csbj.2022.11.024
author Chen, Chi-Chun
Huang, Yu-Wei
Huang, Hsuan-Cheng
Lo, Wei-Cheng
Lyu, Ping-Chiang
collection PubMed
description Circular permutation (CP) is a protein sequence rearrangement in which the amino- and carboxyl-termini of a protein can be created in different positions along the imaginary circularized sequence. Circularly permuted proteins usually exhibit conserved three-dimensional structures and functions. By comparing the structures of circular permutants (CPMs), protein research and bioengineering applications can be approached in ways that are difficult to achieve by traditional mutagenesis. Most current CP detection algorithms depend on structural information. Because there is a vast number of proteins with unknown structures, many CP pairs may remain unidentified. An efficient sequence-based CP detector will help identify more CP pairs and advance many protein studies. For instance, some hypothetical proteins may have CPMs with known functions and structures that are informative for functional annotation, but existing structure-based CP search methods cannot be applied when those hypothetical proteins lack structural information. Despite the considerable potential for applications, sequence-based CP search methods have not been well developed. We present a sequence-based method, SeqCP, which analyzes normal and duplicated sequence alignments to identify CPMs and determine candidate CP sites for proteins. SeqCP was trained on data obtained from the Circular Permutation Database and tested with nonredundant datasets from the Protein Data Bank. It shows high reliability in CP identification and achieves an AUC of 0.9. SeqCP has been implemented as a web server available at http://pcnas.life.nthu.edu.tw/SeqCP/.
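The description mentions comparing a sequence against a duplicated sequence to locate candidate CP sites. The sketch below illustrates only that general idea and is not the SeqCP algorithm: it concatenates the target with itself so that every circular rotation appears as a contiguous window, then scores each window against the query by ungapped identity. The function names `find_cp_site` and `best_rotation` and the toy sequences are hypothetical, and real CP detection would use gapped alignment with a substitution matrix rather than exact position matching.

```python
from typing import Optional, Tuple


def find_cp_site(query: str, target: str) -> Optional[int]:
    """Return the 0-based index in `target` where `query` begins when
    `target` is read circularly, or None if `query` is not an exact
    rotation of `target`. (Hypothetical helper, exact matches only.)"""
    if len(query) != len(target):
        return None
    doubled = target + target  # every rotation of target is a substring
    pos = doubled.find(query)
    return pos if pos != -1 else None


def best_rotation(query: str, target: str) -> Tuple[int, float]:
    """Score every rotation of `target` against `query` by the fraction
    of identical positions (ungapped) and return (shift, identity) for
    the best one. Ties keep the smallest shift."""
    n = len(target)
    if n == 0 or len(query) != n:
        raise ValueError("sequences must be non-empty and of equal length")
    doubled = target + target
    best_shift, best_identity = 0, -1.0
    for shift in range(n):
        window = doubled[shift:shift + n]
        matches = sum(q == t for q, t in zip(query, window))
        identity = matches / n
        if identity > best_identity:
            best_shift, best_identity = shift, identity
    return best_shift, best_identity


if __name__ == "__main__":
    target = "MKTAYIAKQR"                # toy sequence, not real data
    permuted = target[4:] + target[:4]   # new termini created at position 4
    print(find_cp_site(permuted, target))
    print(best_rotation(permuted[:-1] + "G", target))
```

A shift of 0 with high identity indicates an ordinary (non-permuted) match, while a non-zero shift with high identity suggests a candidate CP site at that position in the target.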
format Online
Article
Text
id pubmed-9763678
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Research Network of Computational and Structural Biotechnology
record_format MEDLINE/PubMed
spelling pubmed-9763678 2022-12-28 SeqCP: A sequence-based algorithm for searching circularly permuted proteins Chen, Chi-Chun Huang, Yu-Wei Huang, Hsuan-Cheng Lo, Wei-Cheng Lyu, Ping-Chiang Comput Struct Biotechnol J Research Article
Research Network of Computational and Structural Biotechnology 2022-11-14 /pmc/articles/PMC9763678/ /pubmed/36582435 http://dx.doi.org/10.1016/j.csbj.2022.11.024 Text en © 2022 The Authors. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title SeqCP: A sequence-based algorithm for searching circularly permuted proteins
topic Research Article