EvDTree: structure-dependent substitution profiles based on decision tree classification of 3D environments
BACKGROUND: Structure-dependent substitution matrices increase the accuracy of sequence alignments when the 3D structure of one sequence is known, and are successful e.g. in fold recognition. We propose a new automated method, EvDTree, based on a decision tree algorithm, for automatic derivation of...
Main authors: | Gelly, Jean-Christophe; Chiche, Laurent; Gracy, Jérôme |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2005 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC545998/ https://www.ncbi.nlm.nih.gov/pubmed/15638949 http://dx.doi.org/10.1186/1471-2105-6-4 |
_version_ | 1782122228504068096 |
---|---|
author | Gelly, Jean-Christophe Chiche, Laurent Gracy, Jérôme |
author_facet | Gelly, Jean-Christophe Chiche, Laurent Gracy, Jérôme |
author_sort | Gelly, Jean-Christophe |
collection | PubMed |
description | BACKGROUND: Structure-dependent substitution matrices increase the accuracy of sequence alignments when the 3D structure of one sequence is known, and are successful e.g. in fold recognition. We propose a new automated method, EvDTree, based on a decision tree algorithm, for automatic derivation of amino acid substitution probabilities from a set of sequence-structure alignments. The main advantage over other approaches is an unbiased automatic selection of the most informative structural descriptors and associated values or thresholds. This feature allows automatic derivation of structure-dependent substitution scores for any specific set of structures, without the need to empirically determine best descriptors and parameters. RESULTS: Decision trees for residue substitutions were constructed for each residue type from sequence-structure alignments extracted from the HOMSTRAD database. For each tree cluster, environment-dependent substitution profiles were derived. The resulting structure-dependent substitution scores were assessed using a criterion based on the mean ranking of observed substitutions among all possible substitutions and in sequence-structure alignments. The automatically built EvDTree substitution scores provide significantly better results than conventional matrices and similar or slightly better results than other structure-dependent matrices. EvDTree has been applied to small disulfide-rich proteins as a test case to automatically derive specific substitution scores providing better results than non-specific substitution scores. Analyses of the decision tree classifications provide useful information on the relative importance of different structural descriptors. CONCLUSIONS: We propose a fully automatic method for the classification of structural environments and inference of structure-dependent substitution profiles. We show that this approach is more accurate than existing methods for various applications. The easy adaptation of EvDTree to any specific data set opens the way for class-specific structure-dependent substitution scores which can be used in threading-based remote homology searches. |
format | Text |
id | pubmed-545998 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2005 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-545998 2005-01-29 EvDTree: structure-dependent substitution profiles based on decision tree classification of 3D environments Gelly, Jean-Christophe Chiche, Laurent Gracy, Jérôme BMC Bioinformatics Methodology Article BioMed Central 2005-01-10 /pmc/articles/PMC545998/ /pubmed/15638949 http://dx.doi.org/10.1186/1471-2105-6-4 Text en Copyright © 2005 Gelly et al; licensee BioMed Central Ltd. |
spellingShingle | Methodology Article Gelly, Jean-Christophe Chiche, Laurent Gracy, Jérôme EvDTree: structure-dependent substitution profiles based on decision tree classification of 3D environments |
title | EvDTree: structure-dependent substitution profiles based on decision tree classification of 3D environments |
title_sort | evdtree: structure-dependent substitution profiles based on decision tree classification of 3d environments |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC545998/ https://www.ncbi.nlm.nih.gov/pubmed/15638949 http://dx.doi.org/10.1186/1471-2105-6-4 |
work_keys_str_mv | AT gellyjeanchristophe evdtreestructuredependentsubstitutionprofilesbasedondecisiontreeclassificationof3denvironments AT chichelaurent evdtreestructuredependentsubstitutionprofilesbasedondecisiontreeclassificationof3denvironments AT gracyjerome evdtreestructuredependentsubstitutionprofilesbasedondecisiontreeclassificationof3denvironments |
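
The description mentions two steps that a small example may help make concrete: deriving an environment-dependent substitution profile for each tree cluster, and assessing scores by the mean ranking of observed substitutions among all possible substitutions. The Python sketch below is an illustration under stated assumptions, not the authors' implementation. It assumes each observation has already been assigned to a tree leaf (the hypothetical `environment` label), uses log-odds scores with a pseudocount as one plausible scoring choice, and defines the rank of an observed residue as one plus the number of residues with a strictly higher score. None of these details are specified in the record above, and the function names are hypothetical.

```python
from collections import Counter, defaultdict
import math

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def substitution_profiles(observations, pseudocount=1.0):
    """Build one log-odds substitution profile per environment.

    observations: list of (environment, substituted_residue) pairs.
    'environment' stands in for a decision tree leaf (assumption).
    Returns {environment: {residue: log-odds score}}.
    """
    counts = defaultdict(Counter)
    background = Counter()
    for environment, residue in observations:
        counts[environment][residue] += 1
        background[residue] += 1

    n = len(AMINO_ACIDS)
    total_background = sum(background.values()) + pseudocount * n
    profiles = {}
    for environment, env_counts in counts.items():
        total_env = sum(env_counts.values()) + pseudocount * n
        profiles[environment] = {
            aa: math.log(
                ((env_counts[aa] + pseudocount) / total_env)
                / ((background[aa] + pseudocount) / total_background)
            )
            for aa in AMINO_ACIDS
        }
    return profiles


def mean_rank(profiles, observations):
    """Average rank of the observed residue among all 20 candidates.

    Rank 1 means the observed residue had the highest score in its
    environment; lower mean rank indicates better scores.
    Ties are resolved optimistically (assumption).
    """
    ranks = []
    for environment, residue in observations:
        scores = profiles[environment]
        higher = sum(1 for aa in AMINO_ACIDS if scores[aa] > scores[residue])
        ranks.append(1 + higher)
    return sum(ranks) / len(ranks)
```

In practice the profiles would be derived from one set of alignments and the mean rank computed on a separate set; evaluating on the training observations, as a quick check might do, gives an optimistic estimate.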