A Statistical Similarity/Dissimilarity Analysis of Protein Sequences Based on a Novel Group Representative Vector
Similarity/dissimilarity analysis is a key way of understanding the biology of an organism by knowing the origin of the new genes/sequences. Sequence data are grouped in terms of biological relationships. The number of sequences related to any group is susceptible to be increased every day. All the...
Main authors: | Abd Elwahaab, Marwa A.; Abo-Elkhier, Mervat M.; Abo el Maaty, Moheb I. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2019 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6530227/ https://www.ncbi.nlm.nih.gov/pubmed/31205946 http://dx.doi.org/10.1155/2019/8702968 |
_version_ | 1783420588167004160 |
---|---|
author | Abd Elwahaab, Marwa A. Abo-Elkhier, Mervat M. Abo el Maaty, Moheb I. |
author_facet | Abd Elwahaab, Marwa A. Abo-Elkhier, Mervat M. Abo el Maaty, Moheb I. |
author_sort | Abd Elwahaab, Marwa A. |
collection | PubMed |
description | Similarity/dissimilarity analysis is a key way of understanding the biology of an organism by knowing the origin of the new genes/sequences. Sequence data are grouped in terms of biological relationships. The number of sequences related to any group is susceptible to be increased every day. All the present alignment-free methods approve the utility of their approaches by producing a similarity/dissimilarity matrix. Although this matrix is clear, it measures the degree of similarity among sequences individually. In our work, a representative of each of three groups of protein sequences is introduced. A similarity/dissimilarity vector is evaluated instead of the ordinary similarity/dissimilarity matrix based on the group representative. The approach is applied on three selected groups of protein sequences: beta globin, NADH dehydrogenase subunit 5 (ND5), and spike protein sequences. A cross-grouping comparison is produced to ensure the singularity of each group. A qualitative comparison between our approach, previous articles, and the phylogenetic tree of these protein sequences proved the utility of our approach. |
format | Online Article Text |
id | pubmed-6530227 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-65302272019-06-16 A Statistical Similarity/Dissimilarity Analysis of Protein Sequences Based on a Novel Group Representative Vector Abd Elwahaab, Marwa A. Abo-Elkhier, Mervat M. Abo el Maaty, Moheb I. Biomed Res Int Research Article Similarity/dissimilarity analysis is a key way of understanding the biology of an organism by knowing the origin of the new genes/sequences. Sequence data are grouped in terms of biological relationships. The number of sequences related to any group is susceptible to be increased every day. All the present alignment-free methods approve the utility of their approaches by producing a similarity/dissimilarity matrix. Although this matrix is clear, it measures the degree of similarity among sequences individually. In our work, a representative of each of three groups of protein sequences is introduced. A similarity/dissimilarity vector is evaluated instead of the ordinary similarity/dissimilarity matrix based on the group representative. The approach is applied on three selected groups of protein sequences: beta globin, NADH dehydrogenase subunit 5 (ND5), and spike protein sequences. A cross-grouping comparison is produced to ensure the singularity of each group. A qualitative comparison between our approach, previous articles, and the phylogenetic tree of these protein sequences proved the utility of our approach. Hindawi 2019-05-08 /pmc/articles/PMC6530227/ /pubmed/31205946 http://dx.doi.org/10.1155/2019/8702968 Text en Copyright © 2019 Marwa A. Abd Elwahaab et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Abd Elwahaab, Marwa A. Abo-Elkhier, Mervat M. Abo el Maaty, Moheb I. A Statistical Similarity/Dissimilarity Analysis of Protein Sequences Based on a Novel Group Representative Vector |
title | A Statistical Similarity/Dissimilarity Analysis of Protein Sequences Based on a Novel Group Representative Vector |
title_sort | statistical similarity/dissimilarity analysis of protein sequences based on a novel group representative vector |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6530227/ https://www.ncbi.nlm.nih.gov/pubmed/31205946 http://dx.doi.org/10.1155/2019/8702968 |