Cargando…

Using genomic signatures for HIV-1 sub-typing

BACKGROUND: Human Immunodeficiency Virus type 1 (HIV-1), the causative agent of Acquired Immune Deficiency Syndrome (AIDS), exhibits very high genetic diversity with different variants or subtypes prevalent in different parts of the world. Proper classification of the HIV-1 subtypes, displaying diff...

Descripción completa

Detalles Bibliográficos
Autores principales: Pandit, Aridaman, Sinha, Somdatta
Formato: Texto
Lenguaje:English
Publicado: BioMed Central 2010
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3009497/
https://www.ncbi.nlm.nih.gov/pubmed/20122198
http://dx.doi.org/10.1186/1471-2105-11-S1-S26
_version_ 1782194692111204352
author Pandit, Aridaman
Sinha, Somdatta
author_facet Pandit, Aridaman
Sinha, Somdatta
author_sort Pandit, Aridaman
collection PubMed
description BACKGROUND: Human Immunodeficiency Virus type 1 (HIV-1), the causative agent of Acquired Immune Deficiency Syndrome (AIDS), exhibits very high genetic diversity with different variants or subtypes prevalent in different parts of the world. Proper classification of the HIV-1 subtypes, displaying differential infectivity, plays a major role in monitoring the epidemic and is also a critical component for effective treatment strategy. The existing methods to classify HIV-1 sequence subtypes, based on phylogenetic analysis focusing only on specific genes/regions, have shown inconsistencies as they lack the capability to analyse whole genome variations. Several isolates are left unclassified due to unresolved sub-typing. It is apparent that classification of subtypes based on complete genome sequences, rather than sub-genomic regions, is a more robust and comprehensive approach to address genome-wide heterogeneity. However, no simple methodology exists that directly computes HIV-1 subtype from the complete genome sequence. RESULTS: We use Chaos Game Representation (CGR) as an approach to identify the distinctive genomic signature associated with the DNA sequence organisation in different HIV-1 subtypes. We first analysed the effect of nucleotide word lengths (k = 2 to 8) on whole genomes of the HIV-1 M group sequences, and found the optimum word length of k = 6, that could classify HIV-1 subtypes based on a Test sequence set. Using the optimised word length, we then showed accurate classification of the HIV-1 subtypes from both the Reference Set sequences and from all available sequences in the database. Finally, we applied the approach to cluster the five unclassified HIV-1 sequences from Africa and Europe, and predict their possible subtypes. CONCLUSION: We propose a genomic signature-based approach, using CGR with suitable word length optimisation, which can be applied to classify intra-species variations, and apply it to the complex problem of HIV-1 subtype classification. We demonstrate that CGR is a simple and computationally less intensive method that not only accurately segregates the HIV-1 subtype and sub-subtypes, but also aid in the classification of the unclassified sequences. We hope that it will be useful in subtype annotation of the newly sequenced HIV-1 genomes.
format Text
id pubmed-3009497
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-30094972010-12-23 Using genomic signatures for HIV-1 sub-typing Pandit, Aridaman Sinha, Somdatta BMC Bioinformatics Research BACKGROUND: Human Immunodeficiency Virus type 1 (HIV-1), the causative agent of Acquired Immune Deficiency Syndrome (AIDS), exhibits very high genetic diversity with different variants or subtypes prevalent in different parts of the world. Proper classification of the HIV-1 subtypes, displaying differential infectivity, plays a major role in monitoring the epidemic and is also a critical component for effective treatment strategy. The existing methods to classify HIV-1 sequence subtypes, based on phylogenetic analysis focusing only on specific genes/regions, have shown inconsistencies as they lack the capability to analyse whole genome variations. Several isolates are left unclassified due to unresolved sub-typing. It is apparent that classification of subtypes based on complete genome sequences, rather than sub-genomic regions, is a more robust and comprehensive approach to address genome-wide heterogeneity. However, no simple methodology exists that directly computes HIV-1 subtype from the complete genome sequence. RESULTS: We use Chaos Game Representation (CGR) as an approach to identify the distinctive genomic signature associated with the DNA sequence organisation in different HIV-1 subtypes. We first analysed the effect of nucleotide word lengths (k = 2 to 8) on whole genomes of the HIV-1 M group sequences, and found the optimum word length of k = 6, that could classify HIV-1 subtypes based on a Test sequence set. Using the optimised word length, we then showed accurate classification of the HIV-1 subtypes from both the Reference Set sequences and from all available sequences in the database. Finally, we applied the approach to cluster the five unclassified HIV-1 sequences from Africa and Europe, and predict their possible subtypes. CONCLUSION: We propose a genomic signature-based approach, using CGR with suitable word length optimisation, which can be applied to classify intra-species variations, and apply it to the complex problem of HIV-1 subtype classification. We demonstrate that CGR is a simple and computationally less intensive method that not only accurately segregates the HIV-1 subtype and sub-subtypes, but also aid in the classification of the unclassified sequences. We hope that it will be useful in subtype annotation of the newly sequenced HIV-1 genomes. BioMed Central 2010-01-18 /pmc/articles/PMC3009497/ /pubmed/20122198 http://dx.doi.org/10.1186/1471-2105-11-S1-S26 Text en Copyright ©2010 Pandit and Sinha; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Pandit, Aridaman
Sinha, Somdatta
Using genomic signatures for HIV-1 sub-typing
title Using genomic signatures for HIV-1 sub-typing
title_full Using genomic signatures for HIV-1 sub-typing
title_fullStr Using genomic signatures for HIV-1 sub-typing
title_full_unstemmed Using genomic signatures for HIV-1 sub-typing
title_short Using genomic signatures for HIV-1 sub-typing
title_sort using genomic signatures for hiv-1 sub-typing
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3009497/
https://www.ncbi.nlm.nih.gov/pubmed/20122198
http://dx.doi.org/10.1186/1471-2105-11-S1-S26
work_keys_str_mv AT panditaridaman usinggenomicsignaturesforhiv1subtyping
AT sinhasomdatta usinggenomicsignaturesforhiv1subtyping