Automated prediction of HIV drug resistance from genotype data

BACKGROUND: HIV/AIDS is a serious threat to public health. The emergence of drug resistance mutations diminishes the effectiveness of drug therapy for HIV/AIDS. Developing a computational prediction of drug resistance phenotype will enable efficient and timely selection of the best treatment regimen...

Bibliographic Details
Main Authors: Shen, ChenHsiang, Yu, Xiaxia, Harrison, Robert W., Weber, Irene T.
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5009519/
https://www.ncbi.nlm.nih.gov/pubmed/27586700
http://dx.doi.org/10.1186/s12859-016-1114-6
author Shen, ChenHsiang
Yu, Xiaxia
Harrison, Robert W.
Weber, Irene T.
collection PubMed
description BACKGROUND: HIV/AIDS is a serious threat to public health. The emergence of drug resistance mutations diminishes the effectiveness of drug therapy for HIV/AIDS. Developing a computational prediction of drug resistance phenotype will enable efficient and timely selection of the best treatment regimens. RESULTS: A unified encoding of protein sequence and structure was used as the feature vector for predicting phenotypic resistance from genotype data. Two machine learning algorithms, Random Forest and K-nearest neighbor, were used. The prediction accuracies were examined by five-fold cross-validation on the genotype-phenotype datasets. A supervised machine learning approach for automatic prediction of drug resistance was developed to handle genotype-phenotype datasets of HIV protease (PR) and reverse transcriptase (RT). It predicts the drug resistance phenotype and its relative severity from a query sequence. The accuracy of the classification was higher than 0.973 for eight PR inhibitors and 0.986 for ten RT inhibitors, respectively. The overall cross-validated regression R(2)-values for the severity of drug resistance were 0.772–0.953 for 8 PR inhibitors and 0.773–0.995 for 10 RT inhibitors. CONCLUSIONS: Machine learning using a unified encoding of sequence and protein structure as a feature vector provides an accurate prediction of drug resistance from genotype data. A practical webserver for clinicians has been implemented.
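
The evaluation protocol named in the abstract (five-fold cross-validation of a K-nearest neighbor classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: a plain one-hot encoding stands in for the paper's unified sequence and structure encoding, the data are synthetic, and every name below (`one_hot`, `make_toy_data`, `knn_predict`, `five_fold_accuracy`, `WILD_TYPE`, `MARKERS`) is a hypothetical introduced only for this example. Random Forest and the regression step are omitted.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
WILD_TYPE = "ACDEFGHIKL"   # toy reference string, NOT a real HIV sequence
MARKERS = (2, 5, 8)        # hypothetical "resistance" positions (assumption)


def one_hot(seq):
    """Encode a sequence as a flat 0/1 vector, 20 entries per residue."""
    vec = []
    for aa in seq:
        vec.extend(1.0 if aa == ref else 0.0 for ref in AMINO_ACIDS)
    return vec


def make_toy_data(n_per_class=30, seed=0):
    """Synthetic genotype/phenotype pairs: label 1 carries all marker mutations."""
    rng = random.Random(seed)
    free = [i for i in range(len(WILD_TYPE)) if i not in MARKERS]
    xs, ys = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            seq = list(WILD_TYPE)
            if label == 1:
                for pos in MARKERS:
                    seq[pos] = "W"
            # One random background mutation outside the marker positions.
            seq[rng.choice(free)] = rng.choice(AMINO_ACIDS)
            xs.append(one_hot(seq))
            ys.append(label)
    return xs, ys


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors closest to the query."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train_x, train_y)
    )
    votes = [y for _, y in dists[:k]]
    return max(set(votes), key=votes.count)


def five_fold_accuracy(xs, ys, k=3, folds=5, seed=0):
    """Hold out each fold once, train on the rest, and pool the accuracy."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for f in range(folds):
        test = idx[f::folds]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        for i in test:
            correct += knn_predict(train_x, train_y, xs[i], k) == ys[i]
    return correct / len(xs)


if __name__ == "__main__":
    xs, ys = make_toy_data()
    print("five-fold accuracy:", five_fold_accuracy(xs, ys))
```

Each sample is held out exactly once, so the pooled accuracy reflects predictions on sequences the classifier did not see during training. The toy classes are separated by construction, so the number printed here says nothing about the accuracies reported in the abstract.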
format Online
Article
Text
id pubmed-5009519
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5009519 2016-09-08 Automated prediction of HIV drug resistance from genotype data Shen, ChenHsiang Yu, Xiaxia Harrison, Robert W. Weber, Irene T. BMC Bioinformatics Research BioMed Central 2016-08-31 /pmc/articles/PMC5009519/ /pubmed/27586700 http://dx.doi.org/10.1186/s12859-016-1114-6 Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Automated prediction of HIV drug resistance from genotype data
topic Research