Plantagora: Modeling Whole Genome Sequencing and Assembly of Plant Genomes

BACKGROUND: Genomics studies are being revolutionized by next-generation sequencing technologies, which have made whole genome sequencing much more accessible to the average researcher. Whole genome sequencing with the new technologies is a developing art that, despite the large volumes of data...

Bibliographic Details
Main Authors: Barthelson, Roger, McFarlin, Adam J., Rounsley, Steven D., Young, Sarah
Format: Online Article Text
Language: English
Published: Public Library of Science 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3236183/
https://www.ncbi.nlm.nih.gov/pubmed/22174807
http://dx.doi.org/10.1371/journal.pone.0028436
_version_ 1782218698990288896
author Barthelson, Roger
McFarlin, Adam J.
Rounsley, Steven D.
Young, Sarah
collection PubMed
description BACKGROUND: Genomics studies are being revolutionized by next-generation sequencing technologies, which have made whole genome sequencing much more accessible to the average researcher. Whole genome sequencing with the new technologies is a developing art that, despite the large volumes of data that can be produced, may still fail to provide a clear and thorough map of a genome. The Plantagora project was conceived specifically to address the gap between having the technical tools for genome sequencing and knowing precisely the best way to use them. METHODOLOGY/PRINCIPAL FINDINGS: For Plantagora, a platform was created for generating simulated reads from several plant genomes of different sizes. The resulting read files mimicked either 454 or Illumina reads, with varying paired-end spacing. Thousands of datasets of reads were created, most derived from our primary model genome, rice chromosome one. All reads were assembled with different software assemblers, including Newbler, ABySS, and SOAPdenovo, and the resulting assemblies were evaluated by an extensive battery of metrics chosen for these studies. The metrics included both statistics of the assembly sequences and fidelity-related measures derived by alignment of the assemblies to the original genome source for the reads. The results were presented in a website, which includes a data graphing tool, all created to help the user rapidly compare the feasibility and effectiveness of different sequencing and assembly strategies prior to testing an approach in the lab. Some of our own conclusions regarding the different strategies were also recorded on the website. CONCLUSIONS/SIGNIFICANCE: Plantagora provides a substantial body of information for comparing different approaches to sequencing a plant genome, along with conclusions regarding some of the specific approaches. Plantagora also provides a platform of metrics and tools for studying the process of sequencing and assembly further.
format Online
Article
Text
id pubmed-3236183
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3236183 2011-12-15 Plantagora: Modeling Whole Genome Sequencing and Assembly of Plant Genomes Barthelson, Roger McFarlin, Adam J. Rounsley, Steven D. Young, Sarah PLoS One Research Article Public Library of Science 2011-12-12 /pmc/articles/PMC3236183/ /pubmed/22174807 http://dx.doi.org/10.1371/journal.pone.0028436 Text en Barthelson et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Plantagora: Modeling Whole Genome Sequencing and Assembly of Plant Genomes
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3236183/
https://www.ncbi.nlm.nih.gov/pubmed/22174807
http://dx.doi.org/10.1371/journal.pone.0028436