MetaGene: prokaryotic gene finding from environmental genome shotgun sequences

Exhaustive gene identification is a fundamental goal in all metagenomics projects. However, most metagenomic sequences are unassembled anonymous fragments, and conventional gene-finding methods cannot be applied. We have developed a prokaryotic gene-finding program, MetaGene, which utilizes di-codon...

Bibliographic Details
Main Authors: Noguchi, Hideki; Park, Jungho; Takagi, Toshihisa
Format: Text
Language: English
Published: Oxford University Press 2006
Subjects: Computational Biology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1636498/
https://www.ncbi.nlm.nih.gov/pubmed/17028096
http://dx.doi.org/10.1093/nar/gkl723
_version_ 1782130761289170944
author Noguchi, Hideki
Park, Jungho
Takagi, Toshihisa
collection PubMed
description Exhaustive gene identification is a fundamental goal in all metagenomics projects. However, most metagenomic sequences are unassembled anonymous fragments, to which conventional gene-finding methods cannot be applied. We have developed a prokaryotic gene-finding program, MetaGene, which uses di-codon frequencies estimated from the GC content of a given sequence, together with various other measures. MetaGene can predict a whole range of prokaryotic genes from anonymous genomic sequences of a few hundred bases, with a sensitivity of 95% and a specificity of 90% for artificial shotgun sequences (700 bp fragments from 12 species). MetaGene has two sets of codon frequency interpolations, one for bacteria and one for archaea, and automatically selects the proper set for a given sequence using the domain classification method we propose. The domain classification correctly assigns domain information to more than 90% of the artificial shotgun sequences. Applied to the Sargasso Sea dataset, MetaGene predicted almost all of the annotated genes and a notable number of novel genes. MetaGene can be applied to a wide variety of metagenomic projects and expands the utility of metagenomics.
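The two quantities the description names, the GC content of a fragment and its di-codon frequencies, can be illustrated with a minimal Python sketch. This is not MetaGene's implementation: the function names `gc_content`, `dicodon_counts`, and `dicodon_frequencies` are hypothetical, the reading frame is assumed to start at the first base, and the step that estimates expected di-codon frequencies from GC content is not reproduced here.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)


def dicodon_counts(orf: str) -> Counter:
    """Count pairs of adjacent codons, reading in frame from position 0.

    Assumption for illustration: the caller supplies an in-frame sequence.
    Trailing bases that do not fill a codon are ignored.
    """
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]
    return Counter(a + b for a, b in zip(codons, codons[1:]))


def dicodon_frequencies(orf: str) -> dict:
    """Observed di-codon counts normalised to relative frequencies."""
    counts = dicodon_counts(orf)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


if __name__ == "__main__":
    fragment = "ATGAAAATGAAA"  # toy in-frame sequence, not real data
    print(gc_content(fragment))
    print(dicodon_frequencies(fragment))
```

A gene finder of this kind would compare such observed frequencies against a model chosen for the fragment's GC content; how MetaGene builds and scores that model is described in the article itself.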
format Text
id pubmed-1636498
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-1636498 2006-11-29 MetaGene: prokaryotic gene finding from environmental genome shotgun sequences Noguchi, Hideki; Park, Jungho; Takagi, Toshihisa Nucleic Acids Res Computational Biology Oxford University Press 2006-11 2006-10-05 /pmc/articles/PMC1636498/ /pubmed/17028096 http://dx.doi.org/10.1093/nar/gkl723 Text en © 2006 The Author(s)
title MetaGene: prokaryotic gene finding from environmental genome shotgun sequences
topic Computational Biology