tRNADB-CE: tRNA gene database well-timed in the era of big sequence data

tRNADB-CE (http://trna.ie.niigata-u.ac.jp), a tRNA gene database curated by experts, was constructed by analyzing 1,966 complete and 5,272 draft genomes of prokaryotes, the genomes of 171 viruses, 121 chloroplasts, and 12 eukaryotes, and fragment sequences obtained by metagenome studies of environmen...

Bibliographic Details
Main Authors: Abe, Takashi; Inokuchi, Hachiro; Yamada, Yuko; Muto, Akira; Iwasaki, Yuki; Ikemura, Toshimichi
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2014
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4013482/
https://www.ncbi.nlm.nih.gov/pubmed/24822057
http://dx.doi.org/10.3389/fgene.2014.00114
_version_ 1782315061365178368
author Abe, Takashi
Inokuchi, Hachiro
Yamada, Yuko
Muto, Akira
Iwasaki, Yuki
Ikemura, Toshimichi
author_facet Abe, Takashi
Inokuchi, Hachiro
Yamada, Yuko
Muto, Akira
Iwasaki, Yuki
Ikemura, Toshimichi
author_sort Abe, Takashi
collection PubMed
description tRNADB-CE (http://trna.ie.niigata-u.ac.jp), a tRNA gene database curated by experts, was constructed by analyzing 1,966 complete and 5,272 draft genomes of prokaryotes, the genomes of 171 viruses, 121 chloroplasts, and 12 eukaryotes, and fragment sequences obtained by metagenome studies of environmental samples. In total, 595,115 tRNA genes have been registered, twice the number compiled previously; for each, the sequence, clover-leaf structure, and results of sequence-similarity and oligonucleotide-pattern searches can be browsed. To gather collective knowledge with help from experts in tRNA research, we added a column for registering comments on each tRNA. By grouping bacterial tRNAs with an identical sequence, we found high phylogenetic preservation of tRNA sequences, especially at the phylum level. Since many tRNAs of unknown species origin from metagenomic sequences are identical in sequence to those found in prokaryotes of known species, the identical sequence group (ISG) can provide phylogenetic markers for investigating the microbial community in an environmental ecosystem. This strategy can be applied to the huge numbers of short sequences obtained from next-generation sequencers, showing that tRNADB-CE is a well-timed database in the era of big sequence data. We also discuss how a batch-learning self-organizing map with oligonucleotide composition is useful for efficient knowledge discovery from big sequence data.
format Online
Article
Text
id pubmed-4013482
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-40134822014-05-12 tRNADB-CE: tRNA gene database well-timed in the era of big sequence data Abe, Takashi Inokuchi, Hachiro Yamada, Yuko Muto, Akira Iwasaki, Yuki Ikemura, Toshimichi Front Genet Genetics The tRNA gene data base curated by experts “tRNADB-CE” (http://trna.ie.niigata-u.ac.jp) was constructed by analyzing 1,966 complete and 5,272 draft genomes of prokaryotes, 171 viruses’, 121 chloroplasts’, and 12 eukaryotes’ genomes plus fragment sequences obtained by metagenome studies of environmental samples. 595,115 tRNA genes in total, and thus two times of genes compiled previously, have been registered, for which sequence, clover-leaf structure, and results of sequence-similarity and oligonucleotide-pattern searches can be browsed. To provide collective knowledge with help from experts in tRNA researches, we added a column for enregistering comments to each tRNA. By grouping bacterial tRNAs with an identical sequence, we have found high phylogenetic preservation of tRNA sequences, especially at the phylum level. Since many species-unknown tRNAs from metagenomic sequences have sequences identical to those found in species-known prokaryotes, the identical sequence group (ISG) can provide phylogenetic markers to investigate the microbial community in an environmental ecosystem. This strategy can be applied to a huge amount of short sequences obtained from next-generation sequencers, as showing that tRNADB-CE is a well-timed database in the era of big sequence data. It is also discussed that batch-learning self-organizing-map with oligonucleotide composition is useful for efficient knowledge discovery from big sequence data. Frontiers Media S.A. 2014-05-01 /pmc/articles/PMC4013482/ /pubmed/24822057 http://dx.doi.org/10.3389/fgene.2014.00114 Text en Copyright © 2014 Abe, Inokuchi, Yamada, Muto, Iwasaki and Ikemura. 
http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Abe, Takashi
Inokuchi, Hachiro
Yamada, Yuko
Muto, Akira
Iwasaki, Yuki
Ikemura, Toshimichi
tRNADB-CE: tRNA gene database well-timed in the era of big sequence data
title tRNADB-CE: tRNA gene database well-timed in the era of big sequence data
title_full tRNADB-CE: tRNA gene database well-timed in the era of big sequence data
title_fullStr tRNADB-CE: tRNA gene database well-timed in the era of big sequence data
title_full_unstemmed tRNADB-CE: tRNA gene database well-timed in the era of big sequence data
title_short tRNADB-CE: tRNA gene database well-timed in the era of big sequence data
title_sort trnadb-ce: trna gene database well-timed in the era of big sequence data
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4013482/
https://www.ncbi.nlm.nih.gov/pubmed/24822057
http://dx.doi.org/10.3389/fgene.2014.00114
work_keys_str_mv AT abetakashi trnadbcetrnagenedatabasewelltimedintheeraofbigsequencedata
AT inokuchihachiro trnadbcetrnagenedatabasewelltimedintheeraofbigsequencedata
AT yamadayuko trnadbcetrnagenedatabasewelltimedintheeraofbigsequencedata
AT mutoakira trnadbcetrnagenedatabasewelltimedintheeraofbigsequencedata
AT iwasakiyuki trnadbcetrnagenedatabasewelltimedintheeraofbigsequencedata
AT ikemuratoshimichi trnadbcetrnagenedatabasewelltimedintheeraofbigsequencedata