
Complete genome assembly data of Paenibacillus sp. RUD330, a hypothetical symbiont of Euglena gracilis

An unknown bacterial strain was detected in the cytostome and on the cell surface of Euglena gracilis using transmission electron microscopy. To identify the unknown bacterium and its function, we performed isolation experiments. Here we present the genome sequence of the isolate...


Bibliographic Details
Main authors: Shtratnikova, Victoria Yu., Rudenskaya, Yulia A., Gerasimov, Evgeny S., Schelkunov, Mikhail I., Logacheva, Maria D., Kolesnikov, Alexander A.
Format: Online Article Text
Language: English
Published: Elsevier 2020
Subjects: Biochemistry, Genetics and Molecular Biology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7408339/
https://www.ncbi.nlm.nih.gov/pubmed/32793774
http://dx.doi.org/10.1016/j.dib.2020.106070
_version_ 1783567810496036864
author Shtratnikova, Victoria Yu.
Rudenskaya, Yulia A.
Gerasimov, Evgeny S.
Schelkunov, Mikhail I.
Logacheva, Maria D.
Kolesnikov, Alexander A.
author_facet Shtratnikova, Victoria Yu.
Rudenskaya, Yulia A.
Gerasimov, Evgeny S.
Schelkunov, Mikhail I.
Logacheva, Maria D.
Kolesnikov, Alexander A.
author_sort Shtratnikova, Victoria Yu.
collection PubMed
description An unknown bacterial strain was detected in the cytostome and on the cell surface of Euglena gracilis using transmission electron microscopy. To identify the unknown bacterium and its function, we performed isolation experiments. Here we present the genome sequence of the isolate, which was determined to be Paenibacillus sp. The genome of the bacterium was sequenced four times using Illumina technology with paired-end reads, Illumina technology with mate-pair reads (inserts of 3–4 and 6–8 kb), and Nanopore technology with long reads (tens of thousands of nucleotides). Assemblies based on Illumina reads, including mate-pair reads, could not resolve the gaps caused by long tandem copies of rRNA genes, other tandem repeats, and extremely GC-rich regions (90–100%). Only the long Nanopore reads resolved those gaps and made it possible to complete the entire genome; moreover, we found one plasmid. The length of the genome is 5.56 Mbp, and the average GC content is 59%. The genome of Paenibacillus sp. RUD330 contains 8 copies of each of the rRNA genes (23S, 16S, 5S); the plasmid is 8.3 kb long. We hope that our genome assembly and the methods used can help other investigators in the assembly of complex genomes. Our reliable assembly could be a good basis for further physiological and genetic engineering studies of similar strains.
format Online
Article
Text
id pubmed-7408339
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-7408339 2020-08-12 Complete genome assembly data of Paenibacillus sp. RUD330, a hypothetical symbiont of Euglena gracilis Shtratnikova, Victoria Yu. Rudenskaya, Yulia A. Gerasimov, Evgeny S. Schelkunov, Mikhail I. Logacheva, Maria D. Kolesnikov, Alexander A. Data Brief Biochemistry, Genetics and Molecular Biology An unknown bacterial strain was detected in the cytostome and on the cell surface of Euglena gracilis using transmission electron microscopy. To identify the unknown bacterium and its function, we performed isolation experiments. Here we present the genome sequence of the isolate, which was determined to be Paenibacillus sp. The genome of the bacterium was sequenced four times using Illumina technology with paired-end reads, Illumina technology with mate-pair reads (inserts of 3–4 and 6–8 kb), and Nanopore technology with long reads (tens of thousands of nucleotides). Assemblies based on Illumina reads, including mate-pair reads, could not resolve the gaps caused by long tandem copies of rRNA genes, other tandem repeats, and extremely GC-rich regions (90–100%). Only the long Nanopore reads resolved those gaps and made it possible to complete the entire genome; moreover, we found one plasmid. The length of the genome is 5.56 Mbp, and the average GC content is 59%. The genome of Paenibacillus sp. RUD330 contains 8 copies of each of the rRNA genes (23S, 16S, 5S); the plasmid is 8.3 kb long. We hope that our genome assembly and the methods used can help other investigators in the assembly of complex genomes. Our reliable assembly could be a good basis for further physiological and genetic engineering studies of similar strains. Elsevier 2020-07-25 /pmc/articles/PMC7408339/ /pubmed/32793774 http://dx.doi.org/10.1016/j.dib.2020.106070 Text en © 2020 The Authors http://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Biochemistry, Genetics and Molecular Biology
Shtratnikova, Victoria Yu.
Rudenskaya, Yulia A.
Gerasimov, Evgeny S.
Schelkunov, Mikhail I.
Logacheva, Maria D.
Kolesnikov, Alexander A.
Complete genome assembly data of Paenibacillus sp. RUD330, a hypothetical symbiont of Euglena gracilis
title Complete genome assembly data of Paenibacillus sp. RUD330, a hypothetical symbiont of Euglena gracilis
title_full Complete genome assembly data of Paenibacillus sp. RUD330, a hypothetical symbiont of Euglena gracilis
title_fullStr Complete genome assembly data of Paenibacillus sp. RUD330, a hypothetical symbiont of Euglena gracilis
title_full_unstemmed Complete genome assembly data of Paenibacillus sp. RUD330, a hypothetical symbiont of Euglena gracilis
title_short Complete genome assembly data of Paenibacillus sp. RUD330, a hypothetical symbiont of Euglena gracilis
title_sort complete genome assembly data of paenibacillus sp. rud330, a hypothetical symbiont of euglena gracilis
topic Biochemistry, Genetics and Molecular Biology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7408339/
https://www.ncbi.nlm.nih.gov/pubmed/32793774
http://dx.doi.org/10.1016/j.dib.2020.106070
work_keys_str_mv AT shtratnikovavictoriayu completegenomeassemblydataofpaenibacillussprud330ahypotheticalsymbiontofeuglenagracilis
AT rudenskayayuliaa completegenomeassemblydataofpaenibacillussprud330ahypotheticalsymbiontofeuglenagracilis
AT gerasimovevgenys completegenomeassemblydataofpaenibacillussprud330ahypotheticalsymbiontofeuglenagracilis
AT schelkunovmikhaili completegenomeassemblydataofpaenibacillussprud330ahypotheticalsymbiontofeuglenagracilis
AT logachevamariad completegenomeassemblydataofpaenibacillussprud330ahypotheticalsymbiontofeuglenagracilis
AT kolesnikovalexandera completegenomeassemblydataofpaenibacillussprud330ahypotheticalsymbiontofeuglenagracilis