Anopheles gambiae genome reannotation through synthesis of ab initio and comparative gene prediction algorithms

BACKGROUND: Complete genome annotation is a necessary tool as Anopheles gambiae researchers probe the biology of this potent malaria vector. RESULTS: We reannotate the A. gambiae genome by synthesizing comparative and ab initio sets of predicted coding sequences (CDSs) into a single set using an exo...

Bibliographic Details
Main Authors: Li, Jun, Riehle, Michelle M, Zhang, Yan, Xu, Jiannong, Oduol, Frederick, Gomez, Shawn M, Eiglmeier, Karin, Ueberheide, Beatrix M, Shabanowitz, Jeffrey, Hunt, Donald F, Ribeiro, José MC, Vernick, Kenneth D
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1557760/
https://www.ncbi.nlm.nih.gov/pubmed/16569258
http://dx.doi.org/10.1186/gb-2006-7-3-r24
author Li, Jun
Riehle, Michelle M
Zhang, Yan
Xu, Jiannong
Oduol, Frederick
Gomez, Shawn M
Eiglmeier, Karin
Ueberheide, Beatrix M
Shabanowitz, Jeffrey
Hunt, Donald F
Ribeiro, José MC
Vernick, Kenneth D
author_sort Li, Jun
collection PubMed
description BACKGROUND: Complete genome annotation is a necessary tool as Anopheles gambiae researchers probe the biology of this potent malaria vector. RESULTS: We reannotate the A. gambiae genome by synthesizing comparative and ab initio sets of predicted coding sequences (CDSs) into a single set using an exon-gene-union algorithm followed by an open-reading-frame-selection algorithm. The reannotation predicts 20,970 CDSs supported by at least two lines of evidence, and it lowers the proportion of CDSs lacking start and/or stop codons to only approximately 4%. The reannotated CDS set includes a set of 4,681 novel CDSs not represented in the Ensembl annotation but with EST support, and another set of 4,031 Ensembl-supported genes that undergo major structural and, therefore, probably functional changes in the reannotated set. The quality and accuracy of the reannotation was assessed by comparison with end sequences from 20,249 full-length cDNA clones, and evaluation of mass spectrometry peptide hit rates from an A. gambiae shotgun proteomic dataset confirms that the reannotated CDSs offer a high quality protein database for proteomics. We provide a functional proteomics annotation, ReAnoXcel, obtained by analysis of the new CDSs through the AnoXcel pipeline, which allows functional comparisons of the CDS sets within the same bioinformatic platform. CDS data are available for download. CONCLUSION: Comprehensive A. gambiae genome reannotation is achieved through a combination of comparative and ab initio gene prediction algorithms.
format Text
id pubmed-1557760
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1557760 2006-09-01 Genome Biol Research
BioMed Central 2006 2006-03-27 /pmc/articles/PMC1557760/ /pubmed/16569258 http://dx.doi.org/10.1186/gb-2006-7-3-r24 Text en Copyright © 2006 Li et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Anopheles gambiae genome reannotation through synthesis of ab initio and comparative gene prediction algorithms
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1557760/
https://www.ncbi.nlm.nih.gov/pubmed/16569258
http://dx.doi.org/10.1186/gb-2006-7-3-r24