
Measuring Similarity among Protein Sequences Using a New Descriptor

The comparison of protein sequences by similarity is a fundamental part of today's biomedical research. With advances in sequencing technologies, the number of protein sequences in public databases is increasing exponentially. The best-known sequence comparison methods are...


Bibliographic Details
Main Authors: Abo-Elkhier, Mervat M., Abd Elwahaab, Marwa A., Abo El Maaty, Moheb I.
Format: Online Article Text
Language: English
Published: Hindawi 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6893242/
https://www.ncbi.nlm.nih.gov/pubmed/31886192
http://dx.doi.org/10.1155/2019/2796971
_version_ 1783476170382114816
author Abo-Elkhier, Mervat M.
Abd Elwahaab, Marwa A.
Abo El Maaty, Moheb I.
collection PubMed
description The comparison of protein sequences by similarity is a fundamental part of today's biomedical research. With advances in sequencing technologies, the number of protein sequences in public databases is increasing exponentially. The best-known sequence comparison methods are alignment based. They generally give excellent results when the sequences under study are closely related, but they are time consuming. Herein, a new alignment-free method is introduced. Our technique depends on a new graphical representation and descriptor. The graphical representation is a simple way to visualize protein sequences. The descriptor compresses the primary sequence into a single vector composed of only two values. Our approach gives good results with both short and long sequences in little computation time. It is applied to nine beta globin, nine ND5 (NADH dehydrogenase subunit 5), and 24 spike protein sequences. Correlation and significance analyses are also introduced to compare our similarity/dissimilarity results with the results of other approaches and with sequence homology.
format Online
Article
Text
id pubmed-6893242
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-6893242 2019-12-29 Measuring Similarity among Protein Sequences Using a New Descriptor Abo-Elkhier, Mervat M. Abd Elwahaab, Marwa A. Abo El Maaty, Moheb I. Biomed Res Int Research Article Hindawi 2019-11-22 /pmc/articles/PMC6893242/ /pubmed/31886192 http://dx.doi.org/10.1155/2019/2796971 Text en Copyright © 2019 Mervat M. Abo-Elkhier et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Measuring Similarity among Protein Sequences Using a New Descriptor
topic Research Article