Data-intensive analysis of HIV mutations

BACKGROUND: In this study, clustering was performed using a bitmap representation of HIV reverse transcriptase and protease sequences, to produce an unsupervised classification of HIV sequences. The classification will aid our understanding of the interactions between mutations and drug resistance....

Bibliographic Details
Main Authors: Ozahata, Mina Cintho, Sabino, Ester Cerdeira, Diaz, Ricardo Sobhie, M Cesar-, Roberto, Ferreira, João Eduardo
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4344997/
https://www.ncbi.nlm.nih.gov/pubmed/25652056
http://dx.doi.org/10.1186/s12859-015-0452-0
author Ozahata, Mina Cintho
Sabino, Ester Cerdeira
Diaz, Ricardo Sobhie
M Cesar-, Roberto
Ferreira, João Eduardo
collection PubMed
description BACKGROUND: In this study, clustering was performed using a bitmap representation of HIV reverse transcriptase and protease sequences to produce an unsupervised classification of HIV sequences. The classification will aid our understanding of the interactions between mutations and drug resistance. We analyzed 10,229 HIV genomic sequences from the protease and reverse transcriptase regions of the pol gene, with antiretroviral resistance-related mutations represented in an 82-dimensional binary vector space. RESULTS: A new cluster representation was proposed using an image inspired by microarray data, such that the rows in the image represented the protein sequences from the genotype data and the columns represented the presence or absence of mutations at each protein position. The visualization of the clusters showed that some mutations frequently occur together and are probably related to an epistatic phenomenon. CONCLUSION: We described a methodology based on the application of a pattern recognition algorithm to binary data to suggest clusters of mutations that can easily be discriminated by cluster viewing schemes.
format Online
Article
Text
id pubmed-4344997
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4344997 2015-03-02 Data-intensive analysis of HIV mutations Ozahata, Mina Cintho Sabino, Ester Cerdeira Diaz, Ricardo Sobhie M Cesar-, Roberto Ferreira, João Eduardo BMC Bioinformatics Research Article BACKGROUND: In this study, clustering was performed using a bitmap representation of HIV reverse transcriptase and protease sequences to produce an unsupervised classification of HIV sequences. The classification will aid our understanding of the interactions between mutations and drug resistance. We analyzed 10,229 HIV genomic sequences from the protease and reverse transcriptase regions of the pol gene, with antiretroviral resistance-related mutations represented in an 82-dimensional binary vector space. RESULTS: A new cluster representation was proposed using an image inspired by microarray data, such that the rows in the image represented the protein sequences from the genotype data and the columns represented the presence or absence of mutations at each protein position. The visualization of the clusters showed that some mutations frequently occur together and are probably related to an epistatic phenomenon. CONCLUSION: We described a methodology based on the application of a pattern recognition algorithm to binary data to suggest clusters of mutations that can easily be discriminated by cluster viewing schemes. BioMed Central 2015-02-05 /pmc/articles/PMC4344997/ /pubmed/25652056 http://dx.doi.org/10.1186/s12859-015-0452-0 Text en © Ozahata et al.; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Data-intensive analysis of HIV mutations
topic Research Article
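
The bitmap representation described in the abstract can be illustrated with a small sketch. The record does not name the clustering algorithm, so a simple k-modes-style procedure with Hamming distance stands in for it here as an assumption. The six-mutation panel (the study used 82 dimensions), the genotypes in `SAMPLES`, and all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical panel: the study used 82 binary dimensions; 6 are shown here.
PANEL = ["M41L", "K103N", "M184V", "T215Y", "L90M", "V82A"]

# Toy genotypes invented for this sketch; not data from the study.
SAMPLES = {
    "seq1": {"M41L", "T215Y", "M184V"},
    "seq2": {"M41L", "T215Y"},
    "seq3": {"K103N"},
    "seq4": {"K103N", "M184V"},
    "seq5": {"L90M", "V82A"},
    "seq6": {"L90M", "V82A", "M184V"},
}


def encode(mutations, panel=PANEL):
    """Map a set of mutations to a binary vector over the panel."""
    return tuple(1 if m in mutations else 0 for m in panel)


def hamming(a, b):
    """Number of positions at which two binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def farthest_first_seeds(vectors, k):
    """Deterministic seeding: repeatedly pick the vector farthest from all seeds."""
    seeds = [vectors[0]]
    while len(seeds) < k:
        seeds.append(max(vectors, key=lambda v: min(hamming(v, s) for s in seeds)))
    return seeds


def k_modes(vectors, k, max_iter=20):
    """Assumed stand-in clustering: assign to nearest mode, then update each
    mode by a strict per-column majority vote, until the modes stop changing."""
    modes = farthest_first_seeds(vectors, k)
    labels = [0] * len(vectors)
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda c: hamming(v, modes[c])) for v in vectors]
        new_modes = []
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if not members:  # keep the old mode if a cluster becomes empty
                new_modes.append(modes[c])
                continue
            new_modes.append(
                tuple(1 if 2 * sum(col) > len(members) else 0 for col in zip(*members))
            )
        if new_modes == modes:
            break
        modes = new_modes
    return labels, modes


def cooccurrence(samples, panel=PANEL):
    """Count how often each pair of panel mutations appears in the same sequence."""
    counts = Counter()
    for muts in samples.values():
        present = [m for m in panel if m in muts]
        counts.update(combinations(present, 2))
    return counts


def render(names, vectors, labels):
    """Text 'bitmap': one row per sequence, grouped by cluster; '#' = mutation present."""
    order = sorted(range(len(names)), key=lambda i: (labels[i], i))
    return [
        f"{names[i]} c{labels[i]} " + "".join("#" if b else "." for b in vectors[i])
        for i in order
    ]


names = list(SAMPLES)
vectors = [encode(SAMPLES[n]) for n in names]
labels, modes = k_modes(vectors, k=3)
rows = render(names, vectors, labels)
cooc = cooccurrence(SAMPLES)

print("\n".join(rows))
print(cooc.most_common(2))
```

Grouping the rows by cluster label is what makes co-occurring mutations show up as vertical bands in the image. The pair counts from `cooccurrence` give a numeric view of the same pattern.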