New Assembly, Reannotation and Analysis of the Entamoeba histolytica Genome Reveal New Genomic Features and Protein Content Information

BACKGROUND: In order to maintain genome information accurately and relevantly, original genome annotations need to be updated and evaluated regularly. Manual reannotation of genomes is important as it can significantly reduce the propagation of errors and consequently diminishes the time spent on mi...

Bibliographic Details
Main Authors: Lorenzi, Hernan A., Puiu, Daniela, Miller, Jason R., Brinkac, Lauren M., Amedeo, Paolo, Hall, Neil, Caler, Elisabet V.
Format: Text
Language: English
Published: Public Library of Science 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2886108/
https://www.ncbi.nlm.nih.gov/pubmed/20559563
http://dx.doi.org/10.1371/journal.pntd.0000716
_version_ 1782182443626790912
author Lorenzi, Hernan A.
Puiu, Daniela
Miller, Jason R.
Brinkac, Lauren M.
Amedeo, Paolo
Hall, Neil
Caler, Elisabet V.
author_facet Lorenzi, Hernan A.
Puiu, Daniela
Miller, Jason R.
Brinkac, Lauren M.
Amedeo, Paolo
Hall, Neil
Caler, Elisabet V.
author_sort Lorenzi, Hernan A.
collection PubMed
description BACKGROUND: In order to maintain genome information accurately and relevantly, original genome annotations need to be updated and evaluated regularly. Manual reannotation of genomes is important as it can significantly reduce the propagation of errors and consequently diminishes the time spent on mistaken research. For this reason, after five years from the initial submission of the Entamoeba histolytica draft genome publication, we have re-examined the original 23 Mb assembly and the annotation of the predicted genes. PRINCIPAL FINDINGS: The evaluation of the genomic sequence led to the identification of more than one hundred artifactual tandem duplications that were eliminated by re-assembling the genome. The reannotation was done using a combination of manual and automated genome analysis. The new 20 Mb assembly contains 1,496 scaffolds and 8,201 predicted genes, of which 60% are identical to the initial annotation and the remaining 40% underwent structural changes. Functional classification of 60% of the genes was modified based on recent sequence comparisons and new experimental data. We have assigned putative function to 3,788 proteins (46% of the predicted proteome) based on the annotation of predicted gene families, and have identified 58 protein families of five or more members that share no homology with known proteins and thus could be entamoeba specific. Genome analysis also revealed new features such as the presence of segmental duplications of up to 16 kb flanked by inverted repeats, and the tight association of some gene families with transposable elements. SIGNIFICANCE: This new genome annotation and analysis represents a more refined and accurate blueprint of the pathogen genome, and provides an upgraded tool as reference for the study of many important aspects of E. histolytica biology, such as genome evolution and pathogenesis.
format Text
id pubmed-2886108
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2886108 2010-06-17 New Assembly, Reannotation and Analysis of the Entamoeba histolytica Genome Reveal New Genomic Features and Protein Content Information Lorenzi, Hernan A. Puiu, Daniela Miller, Jason R. Brinkac, Lauren M. Amedeo, Paolo Hall, Neil Caler, Elisabet V. PLoS Negl Trop Dis Research Article BACKGROUND: In order to maintain genome information accurately and relevantly, original genome annotations need to be updated and evaluated regularly. Manual reannotation of genomes is important as it can significantly reduce the propagation of errors and consequently diminishes the time spent on mistaken research. For this reason, after five years from the initial submission of the Entamoeba histolytica draft genome publication, we have re-examined the original 23 Mb assembly and the annotation of the predicted genes. PRINCIPAL FINDINGS: The evaluation of the genomic sequence led to the identification of more than one hundred artifactual tandem duplications that were eliminated by re-assembling the genome. The reannotation was done using a combination of manual and automated genome analysis. The new 20 Mb assembly contains 1,496 scaffolds and 8,201 predicted genes, of which 60% are identical to the initial annotation and the remaining 40% underwent structural changes. Functional classification of 60% of the genes was modified based on recent sequence comparisons and new experimental data. We have assigned putative function to 3,788 proteins (46% of the predicted proteome) based on the annotation of predicted gene families, and have identified 58 protein families of five or more members that share no homology with known proteins and thus could be entamoeba specific. Genome analysis also revealed new features such as the presence of segmental duplications of up to 16 kb flanked by inverted repeats, and the tight association of some gene families with transposable elements. SIGNIFICANCE: This new genome annotation and analysis represents a more refined and accurate blueprint of the pathogen genome, and provides an upgraded tool as reference for the study of many important aspects of E. histolytica biology, such as genome evolution and pathogenesis. Public Library of Science 2010-06-15 /pmc/articles/PMC2886108/ /pubmed/20559563 http://dx.doi.org/10.1371/journal.pntd.0000716 Text en Lorenzi et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Lorenzi, Hernan A.
Puiu, Daniela
Miller, Jason R.
Brinkac, Lauren M.
Amedeo, Paolo
Hall, Neil
Caler, Elisabet V.
New Assembly, Reannotation and Analysis of the Entamoeba histolytica Genome Reveal New Genomic Features and Protein Content Information
title New Assembly, Reannotation and Analysis of the Entamoeba histolytica Genome Reveal New Genomic Features and Protein Content Information
title_full New Assembly, Reannotation and Analysis of the Entamoeba histolytica Genome Reveal New Genomic Features and Protein Content Information
title_fullStr New Assembly, Reannotation and Analysis of the Entamoeba histolytica Genome Reveal New Genomic Features and Protein Content Information
title_full_unstemmed New Assembly, Reannotation and Analysis of the Entamoeba histolytica Genome Reveal New Genomic Features and Protein Content Information
title_short New Assembly, Reannotation and Analysis of the Entamoeba histolytica Genome Reveal New Genomic Features and Protein Content Information
title_sort new assembly, reannotation and analysis of the entamoeba histolytica genome reveal new genomic features and protein content information
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2886108/
https://www.ncbi.nlm.nih.gov/pubmed/20559563
http://dx.doi.org/10.1371/journal.pntd.0000716