Use of Low-Coverage, Large-Insert, Short-Read Data for Rapid and Accurate Generation of Enhanced-Quality Draft Pseudomonas Genome Sequences
Next-generation genomic technology has both greatly accelerated the pace of genome research and increased our reliance on draft genome sequences. While groups such as the Genomics Standards Consortium have made strong efforts to promote genome standards, there is still a general lack of unif...
Main authors: | O'Brien, Heath E.; Gong, Yunchen; Fung, Pauline; Wang, Pauline W.; Guttman, David S. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2011 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3206934/ https://www.ncbi.nlm.nih.gov/pubmed/22073286 http://dx.doi.org/10.1371/journal.pone.0027199 |
_version_ | 1782215508014137344 |
---|---|
author | O'Brien, Heath E. Gong, Yunchen Fung, Pauline Wang, Pauline W. Guttman, David S. |
author_facet | O'Brien, Heath E. Gong, Yunchen Fung, Pauline Wang, Pauline W. Guttman, David S. |
author_sort | O'Brien, Heath E. |
collection | PubMed |
description | Next-generation genomic technology has both greatly accelerated the pace of genome research and increased our reliance on draft genome sequences. While groups such as the Genomics Standards Consortium have made strong efforts to promote genome standards, there is still a general lack of uniformity among published draft genomes, leading to challenges for downstream comparative analyses. This lack of uniformity is a particular problem when using standard draft genomes that frequently have large numbers of low-quality sequencing tracts. Here we present a proposal for an “enhanced-quality draft” genome that identifies at least 95% of the coding sequences, thereby effectively providing a full accounting of the genic component of the genome. Enhanced-quality draft genomes are easily attainable through a combination of small- and large-insert next-generation, paired-end sequencing. We illustrate the generation of an enhanced-quality draft genome by re-sequencing the plant pathogenic bacterium Pseudomonas syringae pv. phaseolicola 1448A (Pph 1448A), which has a published, closed genome sequence of 5.93 Mbp. We use a combination of Illumina paired-end and mate-pair sequencing, and surprisingly find that de novo assemblies with 100x paired-end coverage and mate-pair sequencing with as low as 2–5x coverage are substantially better than assemblies based on higher coverage. The rapid and low-cost generation of large numbers of enhanced-quality draft genome sequences will be of particular value for microbial diagnostics and biosecurity, which rely on precise discrimination of potentially dangerous clones from closely related benign strains. |
format | Online Article Text |
id | pubmed-3206934 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-32069342011-11-09 Use of Low-Coverage, Large-Insert, Short-Read Data for Rapid and Accurate Generation of Enhanced-Quality Draft Pseudomonas Genome Sequences O'Brien, Heath E. Gong, Yunchen Fung, Pauline Wang, Pauline W. Guttman, David S. PLoS One Research Article Next-generation genomic technology has both greatly accelerated the pace of genome research and increased our reliance on draft genome sequences. While groups such as the Genomics Standards Consortium have made strong efforts to promote genome standards, there is still a general lack of uniformity among published draft genomes, leading to challenges for downstream comparative analyses. This lack of uniformity is a particular problem when using standard draft genomes that frequently have large numbers of low-quality sequencing tracts. Here we present a proposal for an “enhanced-quality draft” genome that identifies at least 95% of the coding sequences, thereby effectively providing a full accounting of the genic component of the genome. Enhanced-quality draft genomes are easily attainable through a combination of small- and large-insert next-generation, paired-end sequencing. We illustrate the generation of an enhanced-quality draft genome by re-sequencing the plant pathogenic bacterium Pseudomonas syringae pv. phaseolicola 1448A (Pph 1448A), which has a published, closed genome sequence of 5.93 Mbp. We use a combination of Illumina paired-end and mate-pair sequencing, and surprisingly find that de novo assemblies with 100x paired-end coverage and mate-pair sequencing with as low as 2–5x coverage are substantially better than assemblies based on higher coverage. The rapid and low-cost generation of large numbers of enhanced-quality draft genome sequences will be of particular value for microbial diagnostics and biosecurity, which rely on precise discrimination of potentially dangerous clones from closely related benign strains. 
Public Library of Science 2011-11-02 /pmc/articles/PMC3206934/ /pubmed/22073286 http://dx.doi.org/10.1371/journal.pone.0027199 Text en O'Brien et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article O'Brien, Heath E. Gong, Yunchen Fung, Pauline Wang, Pauline W. Guttman, David S. Use of Low-Coverage, Large-Insert, Short-Read Data for Rapid and Accurate Generation of Enhanced-Quality Draft Pseudomonas Genome Sequences |
title | Use of Low-Coverage, Large-Insert, Short-Read Data for Rapid and Accurate Generation of Enhanced-Quality Draft Pseudomonas Genome Sequences |
title_full | Use of Low-Coverage, Large-Insert, Short-Read Data for Rapid and Accurate Generation of Enhanced-Quality Draft Pseudomonas Genome Sequences |
title_fullStr | Use of Low-Coverage, Large-Insert, Short-Read Data for Rapid and Accurate Generation of Enhanced-Quality Draft Pseudomonas Genome Sequences |
title_full_unstemmed | Use of Low-Coverage, Large-Insert, Short-Read Data for Rapid and Accurate Generation of Enhanced-Quality Draft Pseudomonas Genome Sequences |
title_short | Use of Low-Coverage, Large-Insert, Short-Read Data for Rapid and Accurate Generation of Enhanced-Quality Draft Pseudomonas Genome Sequences |
title_sort | use of low-coverage, large-insert, short-read data for rapid and accurate generation of enhanced-quality draft pseudomonas genome sequences |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3206934/ https://www.ncbi.nlm.nih.gov/pubmed/22073286 http://dx.doi.org/10.1371/journal.pone.0027199 |
work_keys_str_mv | AT obrienheathe useoflowcoveragelargeinsertshortreaddataforrapidandaccurategenerationofenhancedqualitydraftpseudomonasgenomesequences AT gongyunchen useoflowcoveragelargeinsertshortreaddataforrapidandaccurategenerationofenhancedqualitydraftpseudomonasgenomesequences AT fungpauline useoflowcoveragelargeinsertshortreaddataforrapidandaccurategenerationofenhancedqualitydraftpseudomonasgenomesequences AT wangpaulinew useoflowcoveragelargeinsertshortreaddataforrapidandaccurategenerationofenhancedqualitydraftpseudomonasgenomesequences AT guttmandavids useoflowcoveragelargeinsertshortreaddataforrapidandaccurategenerationofenhancedqualitydraftpseudomonasgenomesequences |