
A novel model for protein sequence similarity analysis based on spectral radius

Advances in sequencing technologies led to rapid increase in the number and diversity of biological sequences, which facilitated development in the sequence research. In this paper, we present a new method for analyzing protein sequence similarity. We calculated the spectral radii of 20 amino acids...


Bibliographic Details
Main authors: Wu, Chuanyan, Gao, Rui, De Marinis, Yang, Zhang, Yusen
Format: Online Article Text
Language: English
Published: Elsevier Ltd. 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7094169/
https://www.ncbi.nlm.nih.gov/pubmed/29524440
http://dx.doi.org/10.1016/j.jtbi.2018.03.001
_version_ 1783510413318553600
author Wu, Chuanyan
Gao, Rui
De Marinis, Yang
Zhang, Yusen
author_facet Wu, Chuanyan
Gao, Rui
De Marinis, Yang
Zhang, Yusen
author_sort Wu, Chuanyan
collection PubMed
description Advances in sequencing technologies led to rapid increase in the number and diversity of biological sequences, which facilitated development in the sequence research. In this paper, we present a new method for analyzing protein sequence similarity. We calculated the spectral radii of 20 amino acids (AAs) and put forward a novel 2-D graphical representation of protein sequences. To characterize protein sequences numerically, three groups of features were extracted and related to statistical, dynamics measurements and fluctuation complexity of the sequences. With the obtained feature vector, two models utilizing Gaussian Kernel similarity and Cosine similarity were built to measure the similarity between sequences. We applied our method to analyze the similarities/dissimilarities of four data sets. Both proposed models received consistent results with improvements when compared to that obtained by the ClustalW analysis. The novel approach we present in this study may therefore benefit protein research in medical and scientific fields.
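The description above names two similarity measures applied to the extracted feature vectors: Gaussian kernel similarity and cosine similarity. The following is a minimal sketch of both in their standard textbook forms, not the authors' implementation. The function names, the bandwidth parameter `sigma`, and the example vectors are assumptions made for illustration; the record does not state how the features are scaled or which bandwidth the paper uses.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (range -1 to 1)."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def gaussian_kernel_similarity(
    u: Sequence[float], v: Sequence[float], sigma: float = 1.0
) -> float:
    """Gaussian (RBF) kernel exp(-||u - v||^2 / (2 * sigma^2)), range (0, 1].

    sigma is a hypothetical bandwidth; its value here is an assumption.
    """
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


if __name__ == "__main__":
    # Hypothetical feature vectors for two sequences (illustrative values only).
    seq_a = [0.42, 1.30, 0.07]
    seq_b = [0.40, 1.25, 0.10]
    print(cosine_similarity(seq_a, seq_b))
    print(gaussian_kernel_similarity(seq_a, seq_b, sigma=0.5))
```

Both functions return higher values for more similar vectors. Cosine similarity ignores vector magnitude, whereas the Gaussian kernel depends on Euclidean distance and is therefore sensitive to how the features are scaled.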
format Online
Article
Text
id pubmed-7094169
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Elsevier Ltd.
record_format MEDLINE/PubMed
spelling pubmed-70941692020-03-25 A novel model for protein sequence similarity analysis based on spectral radius Wu, Chuanyan Gao, Rui De Marinis, Yang Zhang, Yusen J Theor Biol Article Advances in sequencing technologies led to rapid increase in the number and diversity of biological sequences, which facilitated development in the sequence research. In this paper, we present a new method for analyzing protein sequence similarity. We calculated the spectral radii of 20 amino acids (AAs) and put forward a novel 2-D graphical representation of protein sequences. To characterize protein sequences numerically, three groups of features were extracted and related to statistical, dynamics measurements and fluctuation complexity of the sequences. With the obtained feature vector, two models utilizing Gaussian Kernel similarity and Cosine similarity were built to measure the similarity between sequences. We applied our method to analyze the similarities/dissimilarities of four data sets. Both proposed models received consistent results with improvements when compared to that obtained by the ClustalW analysis. The novel approach we present in this study may therefore benefit protein research in medical and scientific fields. Elsevier Ltd. 2018-06-07 2018-03-07 /pmc/articles/PMC7094169/ /pubmed/29524440 http://dx.doi.org/10.1016/j.jtbi.2018.03.001 Text en © 2018 Elsevier Ltd. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. 
Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
spellingShingle Article
Wu, Chuanyan
Gao, Rui
De Marinis, Yang
Zhang, Yusen
A novel model for protein sequence similarity analysis based on spectral radius
title A novel model for protein sequence similarity analysis based on spectral radius
title_full A novel model for protein sequence similarity analysis based on spectral radius
title_fullStr A novel model for protein sequence similarity analysis based on spectral radius
title_full_unstemmed A novel model for protein sequence similarity analysis based on spectral radius
title_short A novel model for protein sequence similarity analysis based on spectral radius
title_sort novel model for protein sequence similarity analysis based on spectral radius
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7094169/
https://www.ncbi.nlm.nih.gov/pubmed/29524440
http://dx.doi.org/10.1016/j.jtbi.2018.03.001
work_keys_str_mv AT wuchuanyan anovelmodelforproteinsequencesimilarityanalysisbasedonspectralradius
AT gaorui anovelmodelforproteinsequencesimilarityanalysisbasedonspectralradius
AT demarinisyang anovelmodelforproteinsequencesimilarityanalysisbasedonspectralradius
AT zhangyusen anovelmodelforproteinsequencesimilarityanalysisbasedonspectralradius
AT wuchuanyan novelmodelforproteinsequencesimilarityanalysisbasedonspectralradius
AT gaorui novelmodelforproteinsequencesimilarityanalysisbasedonspectralradius
AT demarinisyang novelmodelforproteinsequencesimilarityanalysisbasedonspectralradius
AT zhangyusen novelmodelforproteinsequencesimilarityanalysisbasedonspectralradius