A Polyglot Approach to Bioinformatics Data Integration: A Phylogenetic Analysis of HIV-1

As sequencing technologies continue to drop in price and increase in throughput, new challenges emerge for the management and accessibility of genomic sequence data. We have developed a pipeline for facilitating the storage, retrieval, and subsequent analysis of molecular data, integrating both sequ...

Bibliographic Details
Main Authors: Reisman, Steven, Hatzopoulos, Thomas, Läufer, Konstantin, Thiruvathukal, George K., Putonti, Catherine
Format: Online Article Text
Language: English
Published: Libertas Academica 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4718148/
https://www.ncbi.nlm.nih.gov/pubmed/26819543
http://dx.doi.org/10.4137/EBO.S32757
author Reisman, Steven
Hatzopoulos, Thomas
Läufer, Konstantin
Thiruvathukal, George K.
Putonti, Catherine
collection PubMed
description As sequencing technologies continue to drop in price and increase in throughput, new challenges emerge for the management and accessibility of genomic sequence data. We have developed a pipeline for facilitating the storage, retrieval, and subsequent analysis of molecular data, integrating both sequence and metadata. Taking a polyglot approach involving multiple languages, libraries, and persistence mechanisms, the pipeline aggregates sequence data from publicly available and local repositories. Data are exposed through a RESTful web service, formatted for easy querying, and retrieved for downstream analyses. As a proof of concept, we have developed a resource for annotated HIV-1 sequences. Phylogenetic analyses of >6,000 HIV-1 sequences revealed that spatial and temporal factors influence the evolution of the individual genes uniquely. Nevertheless, signatures of origin can be extrapolated despite increased globalization. The approach developed here can easily be customized for any species of interest.
format Online
Article
Text
id pubmed-4718148
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Libertas Academica
record_format MEDLINE/PubMed
spelling pubmed-4718148 2016-01-27
A Polyglot Approach to Bioinformatics Data Integration: A Phylogenetic Analysis of HIV-1
Reisman, Steven
Hatzopoulos, Thomas
Läufer, Konstantin
Thiruvathukal, George K.
Putonti, Catherine
Evol Bioinform Online
Original Research
Libertas Academica 2016-01-18
/pmc/articles/PMC4718148/
/pubmed/26819543
http://dx.doi.org/10.4137/EBO.S32757
Text
en
© 2016 the author(s), publisher and licensee Libertas Academica Ltd. This is an open-access article distributed under the terms of the Creative Commons CC-BY-NC 3.0 License.
title A Polyglot Approach to Bioinformatics Data Integration: A Phylogenetic Analysis of HIV-1
topic Original Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4718148/
https://www.ncbi.nlm.nih.gov/pubmed/26819543
http://dx.doi.org/10.4137/EBO.S32757