Use of Low-Coverage, Large-Insert, Short-Read Data for Rapid and Accurate Generation of Enhanced-Quality Draft Pseudomonas Genome Sequences

Next-generation genomic technology has both greatly accelerated the pace of genome research and increased our reliance on draft genome sequences. While groups such as the Genomics Standards Consortium have made strong efforts to promote genome standards, there is still a general lack of unif...

Bibliographic Details
Main Authors: O'Brien, Heath E., Gong, Yunchen, Fung, Pauline, Wang, Pauline W., Guttman, David S.
Format: Online Article Text
Language: English
Published: Public Library of Science 2011
Subjects: Research Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3206934/
https://www.ncbi.nlm.nih.gov/pubmed/22073286
http://dx.doi.org/10.1371/journal.pone.0027199
author O'Brien, Heath E.
Gong, Yunchen
Fung, Pauline
Wang, Pauline W.
Guttman, David S.
collection PubMed
description Next-generation genomic technology has both greatly accelerated the pace of genome research and increased our reliance on draft genome sequences. While groups such as the Genomics Standards Consortium have made strong efforts to promote genome standards, there is still a general lack of uniformity among published draft genomes, leading to challenges for downstream comparative analyses. This lack of uniformity is a particular problem when using standard draft genomes that frequently have large numbers of low-quality sequencing tracts. Here we present a proposal for an “enhanced-quality draft” genome that identifies at least 95% of the coding sequences, thereby effectively providing a full accounting of the genic component of the genome. Enhanced-quality draft genomes are easily attainable through a combination of small- and large-insert next-generation, paired-end sequencing. We illustrate the generation of an enhanced-quality draft genome by re-sequencing the plant pathogenic bacterium Pseudomonas syringae pv. phaseolicola 1448A (Pph 1448A), which has a published, closed genome sequence of 5.93 Mbp. We use a combination of Illumina paired-end and mate-pair sequencing, and surprisingly find that de novo assemblies with 100x paired-end coverage and mate-pair sequencing with as low as 2–5x coverage are substantially better than assemblies based on higher coverage. The rapid and low-cost generation of large numbers of enhanced-quality draft genome sequences will be of particular value for microbial diagnostics and biosecurity, which rely on precise discrimination of potentially dangerous clones from closely related benign strains.
format Online
Article
Text
id pubmed-3206934
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3206934 2011-11-09 PLoS One Research Article
Public Library of Science 2011-11-02 /pmc/articles/PMC3206934/ /pubmed/22073286 http://dx.doi.org/10.1371/journal.pone.0027199 Text en O'Brien et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Use of Low-Coverage, Large-Insert, Short-Read Data for Rapid and Accurate Generation of Enhanced-Quality Draft Pseudomonas Genome Sequences
topic Research Article