EvDTree: structure-dependent substitution profiles based on decision tree classification of 3D environments

BACKGROUND: Structure-dependent substitution matrices increase the accuracy of sequence alignments when the 3D structure of one sequence is known, and are successful e.g. in fold recognition. We propose a new automated method, EvDTree, based on a decision tree algorithm, for automatic derivation of...

Bibliographic Details
Main Authors: Gelly, Jean-Christophe, Chiche, Laurent, Gracy, Jérôme
Format: Text
Language: English
Published: BioMed Central 2005
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC545998/
https://www.ncbi.nlm.nih.gov/pubmed/15638949
http://dx.doi.org/10.1186/1471-2105-6-4
author Gelly, Jean-Christophe
Chiche, Laurent
Gracy, Jérôme
collection PubMed
description BACKGROUND: Structure-dependent substitution matrices increase the accuracy of sequence alignments when the 3D structure of one sequence is known, and are successful e.g. in fold recognition. We propose a new automated method, EvDTree, based on a decision tree algorithm, for automatic derivation of amino acid substitution probabilities from a set of sequence-structure alignments. The main advantage over other approaches is an unbiased automatic selection of the most informative structural descriptors and associated values or thresholds. This feature allows automatic derivation of structure-dependent substitution scores for any specific set of structures, without the need to empirically determine the best descriptors and parameters. RESULTS: Decision trees for residue substitutions were constructed for each residue type from sequence-structure alignments extracted from the HOMSTRAD database. For each tree cluster, environment-dependent substitution profiles were derived. The resulting structure-dependent substitution scores were assessed using a criterion based on the mean ranking of observed substitutions among all possible substitutions and in sequence-structure alignments. The automatically built EvDTree substitution scores provide significantly better results than conventional matrices and similar or slightly better results than other structure-dependent matrices. EvDTree has been applied to small disulfide-rich proteins as a test case to automatically derive specific substitution scores providing better results than non-specific substitution scores. Analyses of the decision tree classifications provide useful information on the relative importance of different structural descriptors. CONCLUSIONS: We propose a fully automatic method for the classification of structural environments and inference of structure-dependent substitution profiles. We show that this approach is more accurate than existing methods for various applications. The easy adaptation of EvDTree to any specific data set opens the way for class-specific structure-dependent substitution scores which can be used in threading-based remote homology searches.
format Text
id pubmed-545998
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-545998 2005-01-29 EvDTree: structure-dependent substitution profiles based on decision tree classification of 3D environments Gelly, Jean-Christophe Chiche, Laurent Gracy, Jérôme BMC Bioinformatics Methodology Article BioMed Central 2005-01-10 /pmc/articles/PMC545998/ /pubmed/15638949 http://dx.doi.org/10.1186/1471-2105-6-4 Text en Copyright © 2005 Gelly et al; licensee BioMed Central Ltd.
title EvDTree: structure-dependent substitution profiles based on decision tree classification of 3D environments
topic Methodology Article