Classification of HIV-1 Sequences Using Profile Hidden Markov Models

Accurate classification of HIV-1 subtypes is essential for studying the dynamic spatial distribution pattern of HIV-1 subtypes and also for developing effective methods of treatment that can be targeted to attack specific subtypes. We propose a classification method based on profile Hidden Markov Mo...

Bibliographic Details
Main authors: Dwivedi, Sanjiv K., Sengupta, Supratim
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3356369/
https://www.ncbi.nlm.nih.gov/pubmed/22623958
http://dx.doi.org/10.1371/journal.pone.0036566
_version_ 1782233553250025472
author Dwivedi, Sanjiv K.
Sengupta, Supratim
author_facet Dwivedi, Sanjiv K.
Sengupta, Supratim
author_sort Dwivedi, Sanjiv K.
collection PubMed
description Accurate classification of HIV-1 subtypes is essential for studying the dynamic spatial distribution pattern of HIV-1 subtypes and also for developing effective methods of treatment that can be targeted to attack specific subtypes. We propose a classification method based on profile Hidden Markov Model that can accurately identify an unknown strain. We show that a standard method that relies on the construction of a positive training set only, to capture unique features associated with a particular subtype, can accurately classify sequences belonging to all subtypes except B and D. We point out the drawbacks of the standard method; namely, an arbitrary choice of threshold to distinguish between true positives and true negatives, and the inability to discriminate between closely related subtypes. We then propose an improved classification method based on construction of a positive as well as a negative training set to improve discriminating ability between closely related subtypes like B and D. Finally, we show how the improved method can be used to accurately determine the subtype composition of Common Recombinant Forms of the virus that are made up of two or more subtypes. Our method provides a simple and highly accurate alternative to other classification methods and will be useful in accurately annotating newly sequenced HIV-1 strains.
format Online
Article
Text
id pubmed-3356369
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-33563692012-05-23 Classification of HIV-1 Sequences Using Profile Hidden Markov Models Dwivedi, Sanjiv K. Sengupta, Supratim PLoS One Research Article Accurate classification of HIV-1 subtypes is essential for studying the dynamic spatial distribution pattern of HIV-1 subtypes and also for developing effective methods of treatment that can be targeted to attack specific subtypes. We propose a classification method based on profile Hidden Markov Model that can accurately identify an unknown strain. We show that a standard method that relies on the construction of a positive training set only, to capture unique features associated with a particular subtype, can accurately classify sequences belonging to all subtypes except B and D. We point out the drawbacks of the standard method; namely, an arbitrary choice of threshold to distinguish between true positives and true negatives, and the inability to discriminate between closely related subtypes. We then propose an improved classification method based on construction of a positive as well as a negative training set to improve discriminating ability between closely related subtypes like B and D. Finally, we show how the improved method can be used to accurately determine the subtype composition of Common Recombinant Forms of the virus that are made up of two or more subtypes. Our method provides a simple and highly accurate alternative to other classification methods and will be useful in accurately annotating newly sequenced HIV-1 strains. Public Library of Science 2012-05-18 /pmc/articles/PMC3356369/ /pubmed/22623958 http://dx.doi.org/10.1371/journal.pone.0036566 Text en Dwivedi, Sengupta. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Dwivedi, Sanjiv K.
Sengupta, Supratim
Classification of HIV-1 Sequences Using Profile Hidden Markov Models
title Classification of HIV-1 Sequences Using Profile Hidden Markov Models
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3356369/
https://www.ncbi.nlm.nih.gov/pubmed/22623958
http://dx.doi.org/10.1371/journal.pone.0036566
work_keys_str_mv AT dwivedisanjivk classificationofhiv1sequencesusingprofilehiddenmarkovmodels
AT senguptasupratim classificationofhiv1sequencesusingprofilehiddenmarkovmodels