
Low-pass shotgun sequencing of the barley genome facilitates rapid identification of genes, conserved non-coding sequences and novel repeats

BACKGROUND: Barley has one of the largest and most complex genomes of all economically important food crops. The rise of new short read sequencing technologies such as Illumina/Solexa permits such large genomes to be effectively sampled at relatively low cost. Based on the corresponding sequence rea...


Bibliographic Details
Main Authors: Wicker, Thomas, Narechania, Apurva, Sabot, Francois, Stein, Joshua, Vu, Giang TH, Graner, Andreas, Ware, Doreen, Stein, Nils
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2584661/
https://www.ncbi.nlm.nih.gov/pubmed/18976483
http://dx.doi.org/10.1186/1471-2164-9-518
author Wicker, Thomas
Narechania, Apurva
Sabot, Francois
Stein, Joshua
Vu, Giang TH
Graner, Andreas
Ware, Doreen
Stein, Nils
author_sort Wicker, Thomas
collection PubMed
description BACKGROUND: Barley has one of the largest and most complex genomes of all economically important food crops. The rise of new short-read sequencing technologies such as Illumina/Solexa permits such large genomes to be effectively sampled at relatively low cost. Based on the corresponding sequence reads, a Mathematically Defined Repeat (MDR) index can be generated to map repetitive regions in genomic sequences. RESULTS: We generated 574 Mbp of Illumina/Solexa sequences from barley total genomic DNA, representing about 10% of a genome equivalent. From these sequences we generated an MDR index, which was then used to identify and mark repetitive regions in the barley genome. Comparison of the MDR plots with expert repeat annotation, which draws on the information already available for known repetitive elements, revealed a significant correspondence between the two methods. However, MDR-based annotation also identified dozens of novel repeat sequences that were not recognised by hand annotation. The MDR data were also used to identify gene-containing regions by masking repetitive sequences in eight de novo sequenced bacterial artificial chromosome (BAC) clones. Gene sequences could indeed be identified for half of the candidate gene islands. MDR data were of only limited use when mapped onto genomic sequences from the closely related species Triticum monococcum, as only a fraction of the repetitive sequences was recognised. CONCLUSION: An MDR index for barley, obtained by whole-genome Illumina/Solexa sequencing, proved as efficient in repeat identification as manual expert annotation. By circumventing the labour-intensive step of producing a specific repeat library for expert annotation, an MDR index provides an elegant and efficient resource for identifying repetitive and low-copy (i.e. potentially gene-containing) regions in uncharacterised genomic sequences. The restriction that a particular MDR index cannot be used across species is outweighed by the low cost of Illumina/Solexa sequencing, which makes any chosen genome accessible to whole-genome sequence sampling.
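The idea of an MDR index described in the abstract can be illustrated with a minimal sketch. It assumes that the index is a simple count of fixed-length words (k-mers) in the sequence reads and that a region is treated as repetitive when its k-mer counts reach a cut-off. The function names, the word length k, the threshold and the toy sequences are all hypothetical choices for illustration and are not taken from the article; the authors' actual software and parameters may differ.

```python
from collections import Counter


def build_kmer_index(reads, k):
    """Count every k-mer seen in the reads (forward strand only, for brevity)."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if "N" not in kmer:  # skip words containing ambiguous bases
                counts[kmer] += 1
    return counts


def mdr_profile(sequence, index, k):
    """For each position, report how often the k-mer starting there occurs in the reads."""
    sequence = sequence.upper()
    return [index.get(sequence[i:i + k], 0) for i in range(len(sequence) - k + 1)]


def mask_repeats(sequence, index, k, threshold):
    """Replace with 'N' every base covered by a k-mer whose count reaches the threshold."""
    masked = list(sequence)
    for i, score in enumerate(mdr_profile(sequence, index, k)):
        if score >= threshold:
            for j in range(i, i + k):
                masked[j] = "N"
    return "".join(masked)


# Toy data (hypothetical): two reads sample the same repeat, one read is low-copy.
reads = ["ACGTACGTAC", "CGTACGTACG", "TTGACCA"]
index = build_kmer_index(reads, k=4)
target = "GGCATTACGTACGGATC"
print(mdr_profile(target, index, k=4))   # [0, 0, 0, 0, 0, 3, 3, 4, 4, 3, 0, 0, 0, 0]
print(mask_repeats(target, index, k=4, threshold=3))  # GGCATNNNNNNNNGATC
```

The unmasked stretches that remain after such a step correspond to the low-copy regions that the abstract describes as candidate gene islands. A practical version would also count reverse-complement k-mers and would scale the threshold to the sequencing depth.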
format Text
id pubmed-2584661
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2584661 2008-11-19 BMC Genomics Research Article BioMed Central 2008-10-31 /pmc/articles/PMC2584661/ /pubmed/18976483 http://dx.doi.org/10.1186/1471-2164-9-518 Text en Copyright © 2008 Wicker et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Low-pass shotgun sequencing of the barley genome facilitates rapid identification of genes, conserved non-coding sequences and novel repeats
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2584661/
https://www.ncbi.nlm.nih.gov/pubmed/18976483
http://dx.doi.org/10.1186/1471-2164-9-518