
Deep landscape update of dispersed and tandem repeats in the genome model of the red jungle fowl, Gallus gallus, using a series of de novo investigating tools

BACKGROUND: The program RepeatMasker and the database Repbase-ISB are part of the most widely used strategy for annotating repeats in animal genomes. They have been used to show that avian genomes have a lower repeat content (8–12 %) than the sequenced genomes of many vertebrate species (30–55 %). H...


Bibliographic Details
Main authors: Guizard, Sébastien, Piégu, Benoît, Arensburger, Peter, Guillou, Florian, Bigot, Yves
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4992247/
https://www.ncbi.nlm.nih.gov/pubmed/27542599
http://dx.doi.org/10.1186/s12864-016-3015-5
_version_ 1782448983015161856
author Guizard, Sébastien
Piégu, Benoît
Arensburger, Peter
Guillou, Florian
Bigot, Yves
author_facet Guizard, Sébastien
Piégu, Benoît
Arensburger, Peter
Guillou, Florian
Bigot, Yves
author_sort Guizard, Sébastien
collection PubMed
description BACKGROUND: The program RepeatMasker and the database Repbase-ISB are part of the most widely used strategy for annotating repeats in animal genomes. They have been used to show that avian genomes have a lower repeat content (8–12 %) than the sequenced genomes of many vertebrate species (30–55 %). However, the efficiency of such library-based strategies depends on the quality and completeness of the sequences in the database that is used. An alternative to these library-based methods is to identify repeats de novo. These alternative methods have existed for at least a decade and may be more powerful than the library-based methods. We have used an annotation strategy involving several complementary de novo tools to determine the repeat content of the model genome galGal4 (1.04 Gbp), including identifying simple sequence repeats (SSRs), tandem repeats and transposable elements (TEs). RESULTS: We annotated over one Gbp of the galGal4 genome and showed that it is composed of approximately 19 % SSR and TE repeats. Furthermore, we estimate that the actual genome of the red jungle fowl contains about 31–35 % repeats. We find that library-based methods tend to overestimate TE diversity. These results have a major impact on the current understanding of repeat distributions throughout chromosomes in the red jungle fowl. CONCLUSIONS: Our results are a proof of concept of the reliability of using de novo tools to annotate repeats in large animal genomes. They have also revealed issues that will need to be resolved in order to develop gold-standard methodologies for annotating repeats in eukaryote genomes. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-016-3015-5) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4992247
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4992247 2016-08-21 Deep landscape update of dispersed and tandem repeats in the genome model of the red jungle fowl, Gallus gallus, using a series of de novo investigating tools Guizard, Sébastien Piégu, Benoît Arensburger, Peter Guillou, Florian Bigot, Yves BMC Genomics Research Article BACKGROUND: The program RepeatMasker and the database Repbase-ISB are part of the most widely used strategy for annotating repeats in animal genomes. They have been used to show that avian genomes have a lower repeat content (8–12 %) than the sequenced genomes of many vertebrate species (30–55 %). However, the efficiency of such library-based strategies depends on the quality and completeness of the sequences in the database that is used. An alternative to these library-based methods is to identify repeats de novo. These alternative methods have existed for at least a decade and may be more powerful than the library-based methods. We have used an annotation strategy involving several complementary de novo tools to determine the repeat content of the model genome galGal4 (1.04 Gbp), including identifying simple sequence repeats (SSRs), tandem repeats and transposable elements (TEs). RESULTS: We annotated over one Gbp of the galGal4 genome and showed that it is composed of approximately 19 % SSR and TE repeats. Furthermore, we estimate that the actual genome of the red jungle fowl contains about 31–35 % repeats. We find that library-based methods tend to overestimate TE diversity. These results have a major impact on the current understanding of repeat distributions throughout chromosomes in the red jungle fowl. CONCLUSIONS: Our results are a proof of concept of the reliability of using de novo tools to annotate repeats in large animal genomes. They have also revealed issues that will need to be resolved in order to develop gold-standard methodologies for annotating repeats in eukaryote genomes. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-016-3015-5) contains supplementary material, which is available to authorized users. BioMed Central 2016-08-19 /pmc/articles/PMC4992247/ /pubmed/27542599 http://dx.doi.org/10.1186/s12864-016-3015-5 Text en © The Author(s). 2016 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Guizard, Sébastien
Piégu, Benoît
Arensburger, Peter
Guillou, Florian
Bigot, Yves
Deep landscape update of dispersed and tandem repeats in the genome model of the red jungle fowl, Gallus gallus, using a series of de novo investigating tools
title Deep landscape update of dispersed and tandem repeats in the genome model of the red jungle fowl, Gallus gallus, using a series of de novo investigating tools
title_full Deep landscape update of dispersed and tandem repeats in the genome model of the red jungle fowl, Gallus gallus, using a series of de novo investigating tools
title_fullStr Deep landscape update of dispersed and tandem repeats in the genome model of the red jungle fowl, Gallus gallus, using a series of de novo investigating tools
title_full_unstemmed Deep landscape update of dispersed and tandem repeats in the genome model of the red jungle fowl, Gallus gallus, using a series of de novo investigating tools
title_short Deep landscape update of dispersed and tandem repeats in the genome model of the red jungle fowl, Gallus gallus, using a series of de novo investigating tools
title_sort deep landscape update of dispersed and tandem repeats in the genome model of the red jungle fowl, gallus gallus, using a series of de novo investigating tools
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4992247/
https://www.ncbi.nlm.nih.gov/pubmed/27542599
http://dx.doi.org/10.1186/s12864-016-3015-5
work_keys_str_mv AT guizardsebastien deeplandscapeupdateofdispersedandtandemrepeatsinthegenomemodeloftheredjunglefowlgallusgallususingaseriesofdenovoinvestigatingtools
AT piegubenoit deeplandscapeupdateofdispersedandtandemrepeatsinthegenomemodeloftheredjunglefowlgallusgallususingaseriesofdenovoinvestigatingtools
AT arensburgerpeter deeplandscapeupdateofdispersedandtandemrepeatsinthegenomemodeloftheredjunglefowlgallusgallususingaseriesofdenovoinvestigatingtools
AT guillouflorian deeplandscapeupdateofdispersedandtandemrepeatsinthegenomemodeloftheredjunglefowlgallusgallususingaseriesofdenovoinvestigatingtools
AT bigotyves deeplandscapeupdateofdispersedandtandemrepeatsinthegenomemodeloftheredjunglefowlgallusgallususingaseriesofdenovoinvestigatingtools