
Dataset of complete genome assembly and analysis of mycobacterium tuberculosis strain SIT745/EAI1-MYS

In this dataset, we report the genome assembly and data analysis of Mycobacterium tuberculosis strain SIT745/EAI1-MYS. This strain was previously isolated from a Malaysian patient with extra-pulmonary tuberculosis and was identified by spoligotype patterns with fifteen known...


Bibliographic Details
Main Authors: Abdullah, Mohammad; Suraiya, Siti; Mohamad, Suharni; Harun, Azian
Format: Online Article Text
Language: English
Published: Elsevier 2020
Subjects: Genetics, Genomics and Molecular Biology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7339031/
https://www.ncbi.nlm.nih.gov/pubmed/32671154
http://dx.doi.org/10.1016/j.dib.2020.105949
_version_ 1783554809174949888
author Abdullah, Mohammad
Suraiya, Siti
Mohamad, Suharni
Harun, Azian
author_facet Abdullah, Mohammad
Suraiya, Siti
Mohamad, Suharni
Harun, Azian
author_sort Abdullah, Mohammad
collection PubMed
description In this dataset, we report the genome assembly and data analysis of Mycobacterium tuberculosis strain SIT745/EAI1-MYS. This strain was previously isolated from a Malaysian patient with extra-pulmonary tuberculosis and was identified by spoligotype patterns with fifteen known Shared International Types (SITs). Further analysis showed that this strain has a remarkable phylogeographical specificity for Malaysia. According to the National Center for Biotechnology Information (NCBI) nucleotide database, the genome previously consisted of 150 unassembled contigs of various lengths. In this assembly, those contigs, with reference sequences from Mycobacterium tuberculosis strain H37Rv and Mycobacterium bovis strain AF2122/97 used for gap closure, were assembled into a single circular chromosome of approximately 4.42 megabases (Mb) with an average GC content of 65.6%. The chromosome was shown to contain 4,009 protein-coding sequences, 3 ribosomal RNAs, 45 transfer RNAs, and 12 superclasses comprising 277 subsystems, which together account for nearly 1,900 genes. The genome information will provide fundamental knowledge of this organism as well as insight into its genomic and proteomic profile and phylogenetic relationships.
format Online
Article
Text
id pubmed-7339031
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-7339031 2020-07-14 Dataset of complete genome assembly and analysis of mycobacterium tuberculosis strain SIT745/EAI1-MYS Abdullah, Mohammad Suraiya, Siti Mohamad, Suharni Harun, Azian Data Brief Genetics, Genomics and Molecular Biology In this dataset, we report the genome assembly and data analysis of Mycobacterium tuberculosis strain SIT745/EAI1-MYS. This strain was previously isolated from a Malaysian patient with extra-pulmonary tuberculosis and was identified by spoligotype patterns with fifteen known Shared International Types (SITs). Further analysis showed that this strain has a remarkable phylogeographical specificity for Malaysia. According to the National Center for Biotechnology Information (NCBI) nucleotide database, the genome previously consisted of 150 unassembled contigs of various lengths. In this assembly, those contigs, with reference sequences from Mycobacterium tuberculosis strain H37Rv and Mycobacterium bovis strain AF2122/97 used for gap closure, were assembled into a single circular chromosome of approximately 4.42 megabases (Mb) with an average GC content of 65.6%. The chromosome was shown to contain 4,009 protein-coding sequences, 3 ribosomal RNAs, 45 transfer RNAs, and 12 superclasses comprising 277 subsystems, which together account for nearly 1,900 genes. The genome information will provide fundamental knowledge of this organism as well as insight into its genomic and proteomic profile and phylogenetic relationships. Elsevier 2020-06-30 /pmc/articles/PMC7339031/ /pubmed/32671154 http://dx.doi.org/10.1016/j.dib.2020.105949 Text en © 2020 The Authors http://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Genetics, Genomics and Molecular Biology
Abdullah, Mohammad
Suraiya, Siti
Mohamad, Suharni
Harun, Azian
Dataset of complete genome assembly and analysis of mycobacterium tuberculosis strain SIT745/EAI1-MYS
title Dataset of complete genome assembly and analysis of mycobacterium tuberculosis strain SIT745/EAI1-MYS
title_full Dataset of complete genome assembly and analysis of mycobacterium tuberculosis strain SIT745/EAI1-MYS
title_fullStr Dataset of complete genome assembly and analysis of mycobacterium tuberculosis strain SIT745/EAI1-MYS
title_full_unstemmed Dataset of complete genome assembly and analysis of mycobacterium tuberculosis strain SIT745/EAI1-MYS
title_short Dataset of complete genome assembly and analysis of mycobacterium tuberculosis strain SIT745/EAI1-MYS
title_sort dataset of complete genome assembly and analysis of mycobacterium tuberculosis strain sit745/eai1-mys
topic Genetics, Genomics and Molecular Biology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7339031/
https://www.ncbi.nlm.nih.gov/pubmed/32671154
http://dx.doi.org/10.1016/j.dib.2020.105949
work_keys_str_mv AT abdullahmohammad datasetofcompletegenomeassemblyandanalysisofmycobacteriumtuberculosisstrainsit745eai1mys
AT suraiyasiti datasetofcompletegenomeassemblyandanalysisofmycobacteriumtuberculosisstrainsit745eai1mys
AT mohamadsuharni datasetofcompletegenomeassemblyandanalysisofmycobacteriumtuberculosisstrainsit745eai1mys
AT harunazian datasetofcompletegenomeassemblyandanalysisofmycobacteriumtuberculosisstrainsit745eai1mys