Automated prediction of HIV drug resistance from genotype data
BACKGROUND: HIV/AIDS is a serious threat to public health. The emergence of drug resistance mutations diminishes the effectiveness of drug therapy for HIV/AIDS. Developing a computational prediction of drug resistance phenotype will enable efficient and timely selection of the best treatment regimen...
Main authors: | Shen, ChenHsiang; Yu, Xiaxia; Harrison, Robert W.; Weber, Irene T. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5009519/ https://www.ncbi.nlm.nih.gov/pubmed/27586700 http://dx.doi.org/10.1186/s12859-016-1114-6 |
_version_ | 1782451527148896256 |
---|---|
author | Shen, ChenHsiang Yu, Xiaxia Harrison, Robert W. Weber, Irene T. |
author_facet | Shen, ChenHsiang Yu, Xiaxia Harrison, Robert W. Weber, Irene T. |
author_sort | Shen, ChenHsiang |
collection | PubMed |
description | BACKGROUND: HIV/AIDS is a serious threat to public health. The emergence of drug resistance mutations diminishes the effectiveness of drug therapy for HIV/AIDS. Developing a computational prediction of drug resistance phenotype will enable efficient and timely selection of the best treatment regimens. RESULTS: A unified encoding of protein sequence and structure was used as the feature vector for predicting phenotypic resistance from genotype data. Two machine learning algorithms, Random Forest and K-nearest neighbor, were used. The prediction accuracies were examined by five-fold cross-validation on the genotype-phenotype datasets. A supervised machine learning approach for automatic prediction of drug resistance was developed to handle genotype-phenotype datasets of HIV protease (PR) and reverse transcriptase (RT). It predicts the drug resistance phenotype and its relative severity from a query sequence. The accuracy of the classification was higher than 0.973 for eight PR inhibitors and 0.986 for ten RT inhibitors, respectively. The overall cross-validated regression R² values for the severity of drug resistance were 0.772–0.953 for 8 PR inhibitors and 0.773–0.995 for 10 RT inhibitors. CONCLUSIONS: Machine learning using a unified encoding of sequence and protein structure as a feature vector provides an accurate prediction of drug resistance from genotype data. A practical webserver for clinicians has been implemented. |
format | Online Article Text |
id | pubmed-5009519 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5009519 2016-09-08 Automated prediction of HIV drug resistance from genotype data Shen, ChenHsiang Yu, Xiaxia Harrison, Robert W. Weber, Irene T. BMC Bioinformatics Research BioMed Central 2016-08-31 /pmc/articles/PMC5009519/ /pubmed/27586700 http://dx.doi.org/10.1186/s12859-016-1114-6 Text en © The Author(s). 2016 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Shen, ChenHsiang Yu, Xiaxia Harrison, Robert W. Weber, Irene T. Automated prediction of HIV drug resistance from genotype data |
title | Automated prediction of HIV drug resistance from genotype data |
title_full | Automated prediction of HIV drug resistance from genotype data |
title_fullStr | Automated prediction of HIV drug resistance from genotype data |
title_full_unstemmed | Automated prediction of HIV drug resistance from genotype data |
title_short | Automated prediction of HIV drug resistance from genotype data |
title_sort | automated prediction of hiv drug resistance from genotype data |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5009519/ https://www.ncbi.nlm.nih.gov/pubmed/27586700 http://dx.doi.org/10.1186/s12859-016-1114-6 |
work_keys_str_mv | AT shenchenhsiang automatedpredictionofhivdrugresistancefromgenotypedata AT yuxiaxia automatedpredictionofhivdrugresistancefromgenotypedata AT harrisonrobertw automatedpredictionofhivdrugresistancefromgenotypedata AT weberirenet automatedpredictionofhivdrugresistancefromgenotypedata |