How to find simple and accurate rules for viral protease cleavage specificities

BACKGROUND: Proteases of human pathogens are becoming increasingly important drug targets, hence it is necessary to understand their substrate specificity and to interpret this knowledge in practically useful ways. New methods are being developed that produce large amounts of cleavage information fo...

Bibliographic Details
Main authors: Rögnvaldsson, Thorsteinn; Etchells, Terence A; You, Liwen; Garwicz, Daniel; Jarman, Ian; Lisboa, Paulo JG
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2698905/
https://www.ncbi.nlm.nih.gov/pubmed/19445713
http://dx.doi.org/10.1186/1471-2105-10-149
author Rögnvaldsson, Thorsteinn
Etchells, Terence A
You, Liwen
Garwicz, Daniel
Jarman, Ian
Lisboa, Paulo JG
collection PubMed
description BACKGROUND: Proteases of human pathogens are becoming increasingly important drug targets, so it is necessary to understand their substrate specificity and to interpret this knowledge in practically useful ways. New methods are being developed that produce large amounts of cleavage information for individual proteases, and some have been applied to extract cleavage rules from data. However, the methods proposed so far for extracting rules have been neither easy to understand nor very accurate. To be practically useful, cleavage rules should be accurate, compact, and expressed in an easily understandable way. RESULTS: A new method is presented for producing cleavage rules for viral proteases with seemingly complex cleavage profiles. The method is based on orthogonal search-based rule extraction (OSRE) combined with spectral clustering. It is demonstrated on substrate data sets for human immunodeficiency virus type 1 (HIV-1) protease and hepatitis C virus (HCV) NS3/4A protease, showing excellent prediction performance for both HIV-1 cleavage and HCV NS3/4A cleavage, agreeing with observed HCV genotype differences. New cleavage rules (consensus sequences) are suggested for HIV-1 and HCV NS3/4A cleavages. The practical usability of the method is also demonstrated by using it to predict the location of an internal cleavage site in the HCV NS3 protease and to correct the location of a previously reported internal cleavage site in the HCV NS3 protease. The method is fast to converge and yields accurate rules, on par with previous results for HIV-1 protease and better than the previous state of the art for HCV NS3/4A protease. Moreover, the rules are fewer and simpler than those previously obtained with rule extraction methods. CONCLUSION: A rule extraction methodology based on searching for multivariate low-order predicates yields results that significantly outperform existing rule bases on out-of-sample data while also being more transparent to expert users. The approach yields rules that are easy to use and useful for interpreting experimental data.
format Text
id pubmed-2698905
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2698905 2009-06-19 How to find simple and accurate rules for viral protease cleavage specificities Rögnvaldsson, Thorsteinn Etchells, Terence A You, Liwen Garwicz, Daniel Jarman, Ian Lisboa, Paulo JG BMC Bioinformatics Methodology Article BioMed Central 2009-05-16 /pmc/articles/PMC2698905/ /pubmed/19445713 http://dx.doi.org/10.1186/1471-2105-10-149 Text en Copyright © 2009 Rögnvaldsson et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title How to find simple and accurate rules for viral protease cleavage specificities
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2698905/
https://www.ncbi.nlm.nih.gov/pubmed/19445713
http://dx.doi.org/10.1186/1471-2105-10-149