
Global repeat discovery and estimation of genomic copy number in a large, complex genome using a high-throughput 454 sequence survey

BACKGROUND: Extensive computational and database tools are available to mine genomic and genetic databases for model organisms, but little genomic data is available for many species of ecological or agricultural significance, especially those with large genomes. Genome surveys using conventional seq...


Bibliographic Details
Main authors:	Swaminathan, Kankshita; Varala, Kranthi; Hudson, Matthew E
Format:	Text
Language:	English
Published:	BioMed Central 2007
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1894642/ https://www.ncbi.nlm.nih.gov/pubmed/17524145 http://dx.doi.org/10.1186/1471-2164-8-132

_version_	1782133870980759552
author	Swaminathan, Kankshita Varala, Kranthi Hudson, Matthew E
collection	PubMed
description	BACKGROUND: Extensive computational and database tools are available to mine genomic and genetic databases for model organisms, but little genomic data is available for many species of ecological or agricultural significance, especially those with large genomes. Genome surveys using conventional sequencing techniques are powerful, particularly for detecting sequences present in many copies per genome. However these methods are time-consuming and have potential drawbacks. High throughput 454 sequencing provides an alternative method by which much information can be gained quickly and cheaply from high-coverage surveys of genomic DNA. RESULTS: We sequenced 78 million base-pairs of randomly sheared soybean DNA which passed our quality criteria. Computational analysis of the survey sequences provided global information on the abundant repetitive sequences in soybean. The sequence was used to determine the copy number across regions of large genomic clones or contigs and discover higher-order structures within satellite repeats. We have created an annotated, online database of sequences present in multiple copies in the soybean genome. The low bias of pyrosequencing against repeat sequences is demonstrated by the overall composition of the survey data, which matches well with past estimates of repetitive DNA content obtained by DNA re-association kinetics (Cot analysis). CONCLUSION: This approach provides a potential aid to conventional or shotgun genome assembly, by allowing rapid assessment of copy number in any clone or clone-end sequence. In addition, we show that partial sequencing can provide access to partial protein-coding sequences.
format	Text
id	pubmed-1894642
institution	National Center for Biotechnology Information
language	English
publishDate	2007
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1894642 2007-06-19 Global repeat discovery and estimation of genomic copy number in a large, complex genome using a high-throughput 454 sequence survey Swaminathan, Kankshita Varala, Kranthi Hudson, Matthew E BMC Genomics Research Article BioMed Central 2007-05-24 /pmc/articles/PMC1894642/ /pubmed/17524145 http://dx.doi.org/10.1186/1471-2164-8-132 Text en Copyright © 2007 Swaminathan et al; licensee BioMed Central Ltd. https://creativecommons.org/licenses/by/2.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Global repeat discovery and estimation of genomic copy number in a large, complex genome using a high-throughput 454 sequence survey
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1894642/ https://www.ncbi.nlm.nih.gov/pubmed/17524145 http://dx.doi.org/10.1186/1471-2164-8-132
