Simplifier: a web tool to eliminate redundant NGS contigs
Modern genomic sequencing technologies produce a large amount of data with reduced cost per base; however, this data consists of short reads. This reduction in the size of the reads, compared to those obtained with previous methodologies, presents new challenges, including a need for efficient algor...
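Main authors: | Ramos, Rommel Thiago Jucá; Carneiro, Adriana Ribeiro; Azevedo, Vasco; Schneider, Maria Paula; Barh, Debmalya; Silva, Artur |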
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Biomedical Informatics 2012 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3524941/ https://www.ncbi.nlm.nih.gov/pubmed/23275695 http://dx.doi.org/10.6026/97320630008996 |
_version_ | 1782253374869078016 |
---|---|
author | Ramos, Rommel Thiago Jucá; Carneiro, Adriana Ribeiro; Azevedo, Vasco; Schneider, Maria Paula; Barh, Debmalya; Silva, Artur |
author_sort | Ramos, Rommel Thiago Jucá |
collection | PubMed |
description | Modern genomic sequencing technologies produce a large amount of data at a reduced cost per base; however, these data consist of short reads. This reduction in read size, compared with reads obtained by previous methodologies, presents new challenges, including the need for efficient algorithms for assembling genomes from short reads and for resolving repetitions. Additionally, after ab initio assembly, curation of the hundreds or thousands of contigs generated by assemblers demands considerable time and computational resources. We developed Simplifier, stand-alone software that selectively eliminates redundant sequences from the collection of contigs generated by ab initio assembly of genomes. Application of Simplifier to data generated by assembly of the genome of Corynebacterium pseudotuberculosis strain 258 reduced the number of contigs generated by ab initio methods from 8,004 to 5,272, a reduction of 34.14%; in addition, N50 increased from 1 kb to 1.5 kb. Processing the contigs of Escherichia coli DH10B with Simplifier reduced the mate-paired library by 17.47% and the fragment library by 23.91%. In tests with data from prokaryotic organisms, Simplifier removed redundant sequences from datasets produced by assemblers, thereby reducing the effort required to finalize genome assembly. AVAILABILITY: Simplifier is available at http://www.genoma.ufpa.br/rramos/softwares/simplifier.xhtml. It requires Sun JDK 6 or higher. |
format | Online Article Text |
id | pubmed-3524941 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Biomedical Informatics |
record_format | MEDLINE/PubMed |
spelling | pubmed-3524941 2012-12-28 Bioinformation Software Biomedical Informatics 2012-10-13 /pmc/articles/PMC3524941/ /pubmed/23275695 http://dx.doi.org/10.6026/97320630008996 Text en © 2012 Biomedical Informatics This is an open-access article, which permits unrestricted use, distribution, and reproduction in any medium, for non-commercial purposes, provided the original author and source are credited. |
title | Simplifier: a web tool to eliminate redundant NGS contigs |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3524941/ https://www.ncbi.nlm.nih.gov/pubmed/23275695 http://dx.doi.org/10.6026/97320630008996 |
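The description in this record reports results as contig counts, percentage reductions, and N50. As a rough illustration of how such figures can be computed, the Python sketch below defines N50 and a naive redundancy filter. The filter (dropping any contig that occurs, on either strand, as an exact substring of a longer retained contig) is an assumption made for illustration only; the record does not describe Simplifier's actual algorithm, and the function names `n50`, `drop_contained`, and `percent_reduction` are hypothetical.

```python
# Illustrative sketch only: NOT the Simplifier implementation.
# Assumption: a contig is "redundant" if it (or its reverse complement)
# is an exact substring of a longer contig that has already been kept.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def n50(lengths):
    """Return N50: the length of the contig at which the running total,
    taken from longest to shortest, first reaches half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty input


def drop_contained(contigs):
    """Keep contigs that are not contained, on either strand,
    in a previously kept contig of equal or greater length."""
    kept = []
    for seq in sorted(contigs, key=len, reverse=True):
        rc = reverse_complement(seq)
        if not any(seq in k or rc in k for k in kept):
            kept.append(seq)
    return kept


def percent_reduction(before, after):
    """Percentage decrease from `before` items to `after` items."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    # Toy data, invented for this example.
    contigs = ["ACGTACGTAA", "CGTACG", "TTACGT", "GGGGCC"]
    kept = drop_contained(contigs)
    print(kept)
    print(n50([len(c) for c in contigs]), n50([len(c) for c in kept]))
    print(percent_reduction(len(contigs), len(kept)))
```

The quadratic substring scan is adequate for a toy example; for thousands of contigs an index-based approach would be needed, and real assemblies also contain near-identical (not only exact) redundancy, which this sketch ignores.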