Bioinformatic curation and alignment of genotyped hepatitis B virus (HBV) sequence data from the GenBank public database
BACKGROUND: Hepatitis B virus (HBV) DNA sequence data from thousands of samples are present in the public sequence databases. No publicly available, up-to-date, multiple sequence alignments, containing full-length and subgenomic fragments per genotype, are available. Such alignments are useful in ma...
Main Authors: | Bell, Trevor G.; Yousif, Mukhlid; Kramvis, Anna |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer International Publishing 2016 |
Subjects: | Methodology |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5084120/ https://www.ncbi.nlm.nih.gov/pubmed/27843753 http://dx.doi.org/10.1186/s40064-016-3312-0 |
_version_ | 1782463344563716096 |
---|---|
author | Bell, Trevor G. Yousif, Mukhlid Kramvis, Anna |
author_facet | Bell, Trevor G. Yousif, Mukhlid Kramvis, Anna |
author_sort | Bell, Trevor G. |
collection | PubMed |
description | BACKGROUND: Hepatitis B virus (HBV) DNA sequence data from thousands of samples are present in the public sequence databases. No publicly available, up-to-date, multiple sequence alignments, containing full-length and subgenomic fragments per genotype, are available. Such alignments are useful in many analysis applications, including data-mining and phylogenetic analyses. RESULTS: By issuing a query, all HBV sequence data from the GenBank public database was downloaded (67,893 sequences). Full-length and subgenomic sequences, which were genotyped by the submitters (30,852 sequences), were placed into a multiple sequence alignment, for each genotype (genotype A: 5868 sequences, B: 4630, C: 7820, D: 8300, E: 2043, F: 985, G: 189, H: 108, I: 23), according to the results of offline BLAST searches against a custom reference library of full-length sequences. Further curation was performed to improve the alignment. CONCLUSIONS: The algorithm described in this paper generates, for each of the nine HBV genotypes, multiple sequence alignments, which contain full-length and subgenomic fragments. The alignments can be updated as new sequences become available in the online public sequence databases. The alignments are available at http://hvdr.bioinf.wits.ac.za/alignments. |
format | Online Article Text |
id | pubmed-5084120 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Springer International Publishing |
record_format | MEDLINE/PubMed |
spelling | pubmed-50841202016-11-14 Bioinformatic curation and alignment of genotyped hepatitis B virus (HBV) sequence data from the GenBank public database Bell, Trevor G. Yousif, Mukhlid Kramvis, Anna Springerplus Methodology BACKGROUND: Hepatitis B virus (HBV) DNA sequence data from thousands of samples are present in the public sequence databases. No publicly available, up-to-date, multiple sequence alignments, containing full-length and subgenomic fragments per genotype, are available. Such alignments are useful in many analysis applications, including data-mining and phylogenetic analyses. RESULTS: By issuing a query, all HBV sequence data from the GenBank public database was downloaded (67,893 sequences). Full-length and subgenomic sequences, which were genotyped by the submitters (30,852 sequences), were placed into a multiple sequence alignment, for each genotype (genotype A: 5868 sequences, B: 4630, C: 7820, D: 8300, E: 2043, F: 985, G: 189, H: 108, I: 23), according to the results of offline BLAST searches against a custom reference library of full-length sequences. Further curation was performed to improve the alignment. CONCLUSIONS: The algorithm described in this paper generates, for each of the nine HBV genotypes, multiple sequence alignments, which contain full-length and subgenomic fragments. The alignments can be updated as new sequences become available in the online public sequence databases. The alignments are available at http://hvdr.bioinf.wits.ac.za/alignments. 
Springer International Publishing 2016-10-28 /pmc/articles/PMC5084120/ /pubmed/27843753 http://dx.doi.org/10.1186/s40064-016-3312-0 Text en © The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. |
spellingShingle | Methodology Bell, Trevor G. Yousif, Mukhlid Kramvis, Anna Bioinformatic curation and alignment of genotyped hepatitis B virus (HBV) sequence data from the GenBank public database |
title | Bioinformatic curation and alignment of genotyped hepatitis B virus (HBV) sequence data from the GenBank public database |
title_full | Bioinformatic curation and alignment of genotyped hepatitis B virus (HBV) sequence data from the GenBank public database |
title_fullStr | Bioinformatic curation and alignment of genotyped hepatitis B virus (HBV) sequence data from the GenBank public database |
title_full_unstemmed | Bioinformatic curation and alignment of genotyped hepatitis B virus (HBV) sequence data from the GenBank public database |
title_short | Bioinformatic curation and alignment of genotyped hepatitis B virus (HBV) sequence data from the GenBank public database |
title_sort | bioinformatic curation and alignment of genotyped hepatitis b virus (hbv) sequence data from the genbank public database |
topic | Methodology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5084120/ https://www.ncbi.nlm.nih.gov/pubmed/27843753 http://dx.doi.org/10.1186/s40064-016-3312-0 |
work_keys_str_mv | AT belltrevorg bioinformaticcurationandalignmentofgenotypedhepatitisbvirushbvsequencedatafromthegenbankpublicdatabase AT yousifmukhlid bioinformaticcurationandalignmentofgenotypedhepatitisbvirushbvsequencedatafromthegenbankpublicdatabase AT kramvisanna bioinformaticcurationandalignmentofgenotypedhepatitisbvirushbvsequencedatafromthegenbankpublicdatabase |
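
The sequence counts quoted in the description field can be tallied with a short script. The following is a minimal sketch in Python, not part of the authors' pipeline: the numbers are copied from the abstract, while the names `GENOTYPE_COUNTS`, `GENOTYPED_TOTAL`, and `summarize` are hypothetical and introduced only for illustration. It sums the per-genotype counts, compares the sum with the stated number of submitter-genotyped sequences, and reports each genotype's share of the listed sequences.

```python
# Per-genotype sequence counts, copied from the abstract in this record.
GENOTYPE_COUNTS = {
    "A": 5868, "B": 4630, "C": 7820, "D": 8300, "E": 2043,
    "F": 985, "G": 189, "H": 108, "I": 23,
}
# Number of sequences genotyped by their submitters, as stated in the abstract.
GENOTYPED_TOTAL = 30852


def summarize(counts, genotyped_total):
    """Tally per-genotype counts and compare them with the stated total.

    Hypothetical helper for illustration only; it does not reproduce
    any step of the published algorithm.
    """
    listed = sum(counts.values())
    shares = {genotype: n / listed for genotype, n in counts.items()}
    return {
        "listed": listed,                         # sum of per-genotype counts
        "difference": genotyped_total - listed,   # genotyped but not listed
        "shares": shares,                         # fraction of listed sequences
    }


summary = summarize(GENOTYPE_COUNTS, GENOTYPED_TOTAL)
print(f"Listed per genotype: {summary['listed']}")
print(f"Difference from stated total: {summary['difference']}")
for genotype, share in sorted(summary["shares"].items()):
    print(f"Genotype {genotype}: {share:.1%}")
```

The per-genotype counts sum to 29,966, which is 886 fewer than the 30,852 genotyped sequences mentioned in the abstract; this record does not state how the remaining sequences were handled.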