Geographically-stratified HIV-1 group M pol subtype and circulating recombinant form sequences
Accurate classification of HIV-1 group M lineages, henceforth referred to as subtyping, is essential for understanding global HIV-1 molecular epidemiology. Because most HIV-1 sequencing is done for genotypic resistance testing of the pol gene, we sought to develop a set of geographically-stratified pol seq...
Main Authors: | Rhee, Soo-Yon, Shafer, Robert W. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group 2018 |
Subjects: | Data Descriptor |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6067049/ https://www.ncbi.nlm.nih.gov/pubmed/30063225 http://dx.doi.org/10.1038/sdata.2018.148 |
_version_ | 1783343082519920640 |
---|---|
author | Rhee, Soo-Yon Shafer, Robert W. |
author_facet | Rhee, Soo-Yon Shafer, Robert W. |
author_sort | Rhee, Soo-Yon |
collection | PubMed |
description | Accurate classification of HIV-1 group M lineages, henceforth referred to as subtyping, is essential for understanding global HIV-1 molecular epidemiology. Because most HIV-1 sequencing is done for genotypic resistance testing of the pol gene, we sought to develop a set of geographically-stratified pol sequences that represent HIV-1 group M sequence diversity. Representative pol sequences differ from representative complete genome sequences because not all CRFs have pol recombination points and because complete genome sequences may not faithfully reflect HIV-1 pol diversity. We developed a software pipeline that compiled 6,034 one-per-person complete HIV-1 pol sequences annotated by country and year belonging to 11 pure subtypes and 70 CRFs and selected a set of sequences whose average distance to the remaining sequences is minimized for each subtype/CRF and country to generate a Geographically-Stratified set of 716 Pol Subtype/CRF (GSPS) reference sequences. We provide extensive data on pol diversity within each subtype/CRF and country combination. The GSPS reference set will also be useful for HIV-1 pol subtyping. |
format | Online Article Text |
id | pubmed-6067049 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Nature Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-6067049 2018-08-10 Geographically-stratified HIV-1 group M pol subtype and circulating recombinant form sequences Rhee, Soo-Yon Shafer, Robert W. Sci Data Data Descriptor Accurate classification of HIV-1 group M lineages, henceforth referred to as subtyping, is essential for understanding global HIV-1 molecular epidemiology. Because most HIV-1 sequencing is done for genotypic resistance testing of the pol gene, we sought to develop a set of geographically-stratified pol sequences that represent HIV-1 group M sequence diversity. Representative pol sequences differ from representative complete genome sequences because not all CRFs have pol recombination points and because complete genome sequences may not faithfully reflect HIV-1 pol diversity. We developed a software pipeline that compiled 6,034 one-per-person complete HIV-1 pol sequences annotated by country and year belonging to 11 pure subtypes and 70 CRFs and selected a set of sequences whose average distance to the remaining sequences is minimized for each subtype/CRF and country to generate a Geographically-Stratified set of 716 Pol Subtype/CRF (GSPS) reference sequences. We provide extensive data on pol diversity within each subtype/CRF and country combination. The GSPS reference set will also be useful for HIV-1 pol subtyping. Nature Publishing Group 2018-07-31 /pmc/articles/PMC6067049/ /pubmed/30063225 http://dx.doi.org/10.1038/sdata.2018.148 Text en Copyright © 2018, The Author(s) http://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. 
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files made available in this article. |
spellingShingle | Data Descriptor Rhee, Soo-Yon Shafer, Robert W. Geographically-stratified HIV-1 group M pol subtype and circulating recombinant form sequences |
title | Geographically-stratified HIV-1 group M pol subtype and circulating recombinant form sequences |
title_full | Geographically-stratified HIV-1 group M pol subtype and circulating recombinant form sequences |
title_fullStr | Geographically-stratified HIV-1 group M pol subtype and circulating recombinant form sequences |
title_full_unstemmed | Geographically-stratified HIV-1 group M pol subtype and circulating recombinant form sequences |
title_short | Geographically-stratified HIV-1 group M pol subtype and circulating recombinant form sequences |
title_sort | geographically-stratified hiv-1 group m pol subtype and circulating recombinant form sequences |
topic | Data Descriptor |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6067049/ https://www.ncbi.nlm.nih.gov/pubmed/30063225 http://dx.doi.org/10.1038/sdata.2018.148 |
work_keys_str_mv | AT rheesooyon geographicallystratifiedhiv1groupmpolsubtypeandcirculatingrecombinantformsequences AT shaferrobertw geographicallystratifiedhiv1groupmpolsubtypeandcirculatingrecombinantformsequences |
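
The description above says the pipeline selects, for each subtype/CRF and country, sequences whose average distance to the remaining sequences is minimized. The following is a minimal sketch of that idea, not the authors' pipeline: it assumes pre-aligned sequences of equal length, uses an uncorrected p-distance (the record does not state which distance measure was used), and approximates set selection by taking the `k` sequences with the lowest mean distance. The names `p_distance`, `select_representatives`, and the record tuple layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical record layout: (sequence id, subtype/CRF, country, aligned sequence)
Record = Tuple[str, str, str, str]


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') or an 'N' are skipped.
    This distance measure is an assumption made for illustration.
    """
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x != y:
            diffs += 1
    return diffs / compared if compared else 0.0


def select_representatives(
    records: List[Record], k: int = 1
) -> Dict[Tuple[str, str], List[str]]:
    """Pick up to k sequence ids per (subtype, country) group.

    Within each group, sequences are ranked by their mean distance to all
    other sequences in the group, and the k lowest-ranked ids are returned.
    """
    groups: Dict[Tuple[str, str], List[Record]] = defaultdict(list)
    for rec in records:
        groups[(rec[1], rec[2])].append(rec)

    chosen: Dict[Tuple[str, str], List[str]] = {}
    for key, members in groups.items():
        if len(members) == 1:
            # A lone sequence is trivially its group's representative.
            chosen[key] = [members[0][0]]
            continue
        scored = []
        for i, rec in enumerate(members):
            total = sum(
                p_distance(rec[3], other[3])
                for j, other in enumerate(members)
                if j != i
            )
            scored.append((total / (len(members) - 1), rec[0]))
        scored.sort()  # lowest mean distance first; ties broken by id
        chosen[key] = [seq_id for _, seq_id in scored[:k]]
    return chosen
```

Calling `select_representatives(records)` with a list of such tuples returns a dictionary keyed by `(subtype, country)`. With `k=1` this is the group medoid; larger `k` is only a rough stand-in for choosing a jointly optimal set.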