Creating a honey bee consensus gene set

BACKGROUND: We wished to produce a single reference gene set for honey bee (Apis mellifera). Our motivation was twofold. First, we wished to obtain an improved set of gene models with increased coverage of known genes, while maintaining gene model quality. Second, we wished to provide a single offic...

Bibliographic Details
Main Authors: Elsik, Christine G, Mackey, Aaron J, Reese, Justin T, Milshina, Natalia V, Roos, David S, Weinstock, George M
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1839126/
https://www.ncbi.nlm.nih.gov/pubmed/17241472
http://dx.doi.org/10.1186/gb-2007-8-1-r13
author Elsik, Christine G
Mackey, Aaron J
Reese, Justin T
Milshina, Natalia V
Roos, David S
Weinstock, George M
collection PubMed
description BACKGROUND: We wished to produce a single reference gene set for honey bee (Apis mellifera). Our motivation was twofold. First, we wished to obtain an improved set of gene models with increased coverage of known genes, while maintaining gene model quality. Second, we wished to provide a single official gene list that the research community could further utilize for consistent and comparable analyses and functional annotation.
RESULTS: We created a consensus gene set for honey bee (Apis mellifera) using GLEAN, a new algorithm that uses latent class analysis to automatically combine disparate gene prediction evidence in the absence of known genes. The consensus gene models had increased representation of honey bee genes without sacrificing quality compared with any one of the input gene predictions. When compared with manually annotated gold standards, the consensus set of gene models was similar or superior in quality to each of the input sets.
CONCLUSION: Most eukaryotic genome projects produce multiple gene sets because of the variety of gene prediction programs. Each of the gene prediction programs has strengths and weaknesses, and so the multiplicity of gene sets offers users a more comprehensive collection of genes to use than is available from a single program. On the other hand, the availability of multiple gene sets is also a cause for uncertainty among users as regards which set they should use. GLEAN proved to be an effective method to combine gene lists into a single reference set.
format Text
id pubmed-1839126
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1839126 2007-04-04 Creating a honey bee consensus gene set Elsik, Christine G Mackey, Aaron J Reese, Justin T Milshina, Natalia V Roos, David S Weinstock, George M Genome Biol Research BioMed Central 2007 2007-01-22 /pmc/articles/PMC1839126/ /pubmed/17241472 http://dx.doi.org/10.1186/gb-2007-8-1-r13 Text en Copyright © 2007 Elsik et al.; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Creating a honey bee consensus gene set
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1839126/
https://www.ncbi.nlm.nih.gov/pubmed/17241472
http://dx.doi.org/10.1186/gb-2007-8-1-r13