
An alignment-free domain architecture similarity search (ADASS) algorithm for inferring homology between multi-domain proteins

Annotation of genes and their products is largely guided by inferred homology. Sequence similarity is the primary measure used for annotation, whereas domain content and order have received less attention, even though domain insertions, deletions, and positional changes can bring...


Bibliographic Details
Main Authors:	Syamaladevi, Divya P; Joshi, Adwait; Sowdhamini, Ramanathan
Format:	Online Article Text
Language:	English
Published:	Biomedical Informatics 2013
Subjects:	Hypothesis
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3705623/ https://www.ncbi.nlm.nih.gov/pubmed/23861564 http://dx.doi.org/10.6026/97320630009491

_version_	1782476468672004096
author	Syamaladevi, Divya P; Joshi, Adwait; Sowdhamini, Ramanathan
author_sort	Syamaladevi, Divya P
collection	PubMed
description	Annotation of genes and their products is largely guided by inferred homology. Sequence similarity is the primary measure used for annotation, whereas domain content and order have received less attention, even though domain insertions, deletions, and positional changes can introduce functional variety. Several recently developed methods quantify domain architecture similarity, but they depend on sequence alignments and focus only on homologous proteins. We present an alignment-free domain architecture similarity search (ADASS) algorithm that identifies proteins that share very poor sequence similarity yet have similar domain architectures. We introduce a “singlet matching-triplet comparison” method in ADASS, in which each triplet of domains is compared with other triplets in a pairwise comparison of two domain architectures. Different events in the triplet comparison are scored according to a scoring scheme, and an average pairwise distance score (Domain Architecture Distance score, or DAD score) is calculated between protein domain architectures. We take the domain architectures that contain a selected domain, termed the centric domain, and cluster them based on DAD score. The algorithm has a high positive predictive value (PPV) with respect to the clustering of the sequences of selected domain architectures. A comparison of domain-architecture-based dendrograms built with ADASS and with an existing method revealed that ADASS can classify proteins according to the extent of domain-architecture-level similarity. ADASS is more relevant for proteins with tiny domains that contribute little to overall sequence similarity but significantly to overall function.
format	Online Article Text
id	pubmed-3705623
institution	National Center for Biotechnology Information
language	English
publishDate	2013
publisher	Biomedical Informatics
record_format	MEDLINE/PubMed
spelling	pubmed-3705623 2013-07-16 Bioinformation Hypothesis Biomedical Informatics 2013-06-08 /pmc/articles/PMC3705623/ /pubmed/23861564 http://dx.doi.org/10.6026/97320630009491 Text en © 2013 Biomedical Informatics This is an open-access article, which permits unrestricted use, distribution, and reproduction in any medium, for non-commercial purposes, provided the original author and source are credited.
title	An alignment-free domain architecture similarity search (ADASS) algorithm for inferring homology between multi-domain proteins
topic	Hypothesis
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3705623/ https://www.ncbi.nlm.nih.gov/pubmed/23861564 http://dx.doi.org/10.6026/97320630009491
