A Computer Simulator for Assessing Different Challenges and Strategies of de Novo Sequence Assembly
This study presents a new computer program for assessing the effects of different factors and sequencing strategies on de novo sequence assembly. The program uses reads from actual sequencing studies or from simulations with a reference genome that may also be real or simulated. The simulated reads...
Main Authors: | Knudsen, Bjarne; Forsberg, Roald; Miyamoto, Michael M. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2010 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3954094/ https://www.ncbi.nlm.nih.gov/pubmed/24710045 http://dx.doi.org/10.3390/genes1020263 |
_version_ | 1782307435209293824 |
---|---|
author | Knudsen, Bjarne Forsberg, Roald Miyamoto, Michael M. |
author_facet | Knudsen, Bjarne Forsberg, Roald Miyamoto, Michael M. |
author_sort | Knudsen, Bjarne |
collection | PubMed |
description | This study presents a new computer program for assessing the effects of different factors and sequencing strategies on de novo sequence assembly. The program uses reads from actual sequencing studies or from simulations with a reference genome that may also be real or simulated. The simulated reads can be created with our read simulator. They can be of differing length and coverage, consist of paired reads with varying distance, and include sequencing errors such as color space miscalls to imitate SOLiD data. The simulated or real reads are mapped to their reference genome and our assembly simulator is then used to obtain optimal assemblies that are limited only by the distribution of repeats. By way of this mapping, the assembly simulator determines which contigs are theoretically possible, or conversely (and perhaps more importantly), which are not. We illustrate the application and utility of our new simulation tools with several experiments that test the effects of genome complexity (repeats), read length and coverage, word size in De Bruijn graph assembly, and alternative sequencing strategies (e.g., BAC pooling) on sequence assemblies. These experiments highlight just some of the uses of our simulators in the experimental design of sequencing projects and in the further development of assembly algorithms. |
format | Online Article Text |
id | pubmed-3954094 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-39540942014-03-26 A Computer Simulator for Assessing Different Challenges and Strategies of de Novo Sequence Assembly Knudsen, Bjarne Forsberg, Roald Miyamoto, Michael M. Genes (Basel) Article This study presents a new computer program for assessing the effects of different factors and sequencing strategies on de novo sequence assembly. The program uses reads from actual sequencing studies or from simulations with a reference genome that may also be real or simulated. The simulated reads can be created with our read simulator. They can be of differing length and coverage, consist of paired reads with varying distance, and include sequencing errors such as color space miscalls to imitate SOLiD data. The simulated or real reads are mapped to their reference genome and our assembly simulator is then used to obtain optimal assemblies that are limited only by the distribution of repeats. By way of this mapping, the assembly simulator determines which contigs are theoretically possible, or conversely (and perhaps more importantly), which are not. We illustrate the application and utility of our new simulation tools with several experiments that test the effects of genome complexity (repeats), read length and coverage, word size in De Bruijn graph assembly, and alternative sequencing strategies (e.g., BAC pooling) on sequence assemblies. These experiments highlight just some of the uses of our simulators in the experimental design of sequencing projects and in the further development of assembly algorithms. MDPI 2010-09-13 /pmc/articles/PMC3954094/ /pubmed/24710045 http://dx.doi.org/10.3390/genes1020263 Text en © 2010 by the authors; licensee MDPI, Basel, Switzerland http://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Article Knudsen, Bjarne Forsberg, Roald Miyamoto, Michael M. A Computer Simulator for Assessing Different Challenges and Strategies of de Novo Sequence Assembly |
title | A Computer Simulator for Assessing Different Challenges and Strategies of de Novo Sequence Assembly |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3954094/ https://www.ncbi.nlm.nih.gov/pubmed/24710045 http://dx.doi.org/10.3390/genes1020263 |
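
The description above mentions simulated reads of differing length and coverage, sequencing errors, and assemblies limited by repeats. The following is a minimal illustrative sketch of those ideas, not the authors' program: the function names `simulate_reads` and `repeated_kmers`, the uniform sampling of start positions, and the simple substitution-error model are all assumptions made for this example (paired reads and SOLiD color-space errors are not modeled).

```python
import random


def simulate_reads(reference, read_length, coverage, error_rate=0.0, seed=0):
    """Sample reads uniformly from a linear reference (hypothetical helper).

    Returns a list of (start, read) tuples so each read can be traced back
    to its true position in the reference.
    """
    if read_length > len(reference):
        raise ValueError("read_length exceeds reference length")
    rng = random.Random(seed)
    # Choose the read count so that total read bases ~= coverage * genome length.
    n_reads = round(coverage * len(reference) / read_length)
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(reference) - read_length + 1)
        bases = list(reference[start:start + read_length])
        for i, base in enumerate(bases):
            # Assumed error model: independent substitutions at a fixed rate.
            if rng.random() < error_rate:
                bases[i] = rng.choice([b for b in "ACGT" if b != base])
        reads.append((start, "".join(bases)))
    return reads


def repeated_kmers(reference, k):
    """Return the set of k-mers occurring more than once in the reference."""
    counts = {}
    for i in range(len(reference) - k + 1):
        kmer = reference[i:i + k]
        counts[kmer] = counts.get(kmer, 0) + 1
    return {kmer for kmer, count in counts.items() if count > 1}


if __name__ == "__main__":
    rng = random.Random(1)
    genome = "".join(rng.choice("ACGT") for _ in range(1000))
    reads = simulate_reads(genome, read_length=50, coverage=10, error_rate=0.01)
    print(len(reads), "reads;", len(repeated_kmers(genome, 8)), "repeated 8-mers")
```

In this sketch, a k-mer that occurs more than once in the reference stands in for a repeat that a word-based (De Bruijn graph) assembler with word size `k` could not resolve without longer reads or pairing information.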