An empirical study of choosing efficient discriminative seeds for oligonucleotide design

BACKGROUND: Oligonucleotide design is known to be a time-consuming task in bioinformatics. To accelerate the oligonucleotide design process and make it more efficient, one widely used approach is to prescreen unreliable regions using a hashing (or seeding) algorithm. Since the seeding algorithm i...

Bibliographic Details
Main authors: Chung, Won-Hyoung, Park, Seong-Bae
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2788383/
https://www.ncbi.nlm.nih.gov/pubmed/19958494
http://dx.doi.org/10.1186/1471-2164-10-S3-S3
collection PubMed
description BACKGROUND: Oligonucleotide design is known to be a time-consuming task in bioinformatics. To accelerate the oligonucleotide design process and make it more efficient, one widely used approach is to prescreen unreliable regions using a hashing (or seeding) algorithm. Because seeding algorithms were originally proposed to increase sensitivity in local alignment, specificity must be considered alongside sensitivity in the oligonucleotide design problem. However, no measure has yet been proposed for evaluating how adequate and efficient seeds are for oligo design. Here, we propose novel measures for evaluating seeding algorithms based on discriminability and efficiency. RESULTS: To evaluate the proposed measures, we examined five seeding algorithms in oligonucleotide design and carried out a series of experiments to compare them. The spaced seed was the most efficient discriminative seed for oligo design. The performance of the transition-constrained seed was slightly lower than that of the spaced seed. Because the BLAT seeding algorithm and the Vector seeding algorithm gave poor scores in specificity and efficiency, we conclude that these algorithms are not adequate for designing oligos. Consequently, we recommend spaced seeds or transition-constrained seeds with a weight of 15 to 18 for designing 50-mer oligos. Empirical experiments on real biological data show that the recommended seeds perform well. We also present a software package that enables users to obtain adequate seeds under their own experimental conditions. CONCLUSION: Our study is valuable in two respects. First, it can be applied to oligo design programs to improve their performance by suggesting experiment-specific seeds. Second, it is useful for improving the performance of mapping assembly in Next-Generation Sequencing. Our proposed measures were originally designed for oligo design, but we expect that our study will be helpful for other genomic tasks.
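As a rough illustration of the seeding idea mentioned in the abstract, the sketch below shows how a spaced seed and a contiguous seed of the same weight can disagree on a candidate oligo. It is a toy example, not the authors' software: the function names, the short patterns "1101" and "111", and the sequences are all hypothetical, and the 15 to 18 weight recommended in the abstract for 50-mer oligos is far larger than the weight used here.

```python
from collections import defaultdict


def seed_weight(pattern):
    """Weight of a seed: the number of positions that must match ('1')."""
    return pattern.count("1")


def seed_key(seq, start, pattern):
    """Characters of seq at the '1' positions of the pattern, from start."""
    return "".join(seq[start + i] for i, c in enumerate(pattern) if c == "1")


def build_index(background, pattern):
    """Hash every window of the background sequence by its seed key."""
    index = defaultdict(list)
    for start in range(len(background) - len(pattern) + 1):
        index[seed_key(background, start, pattern)].append(start)
    return index


def has_seed_hit(oligo, index, pattern):
    """True if any window of the oligo shares a seed key with the index."""
    return any(
        seed_key(oligo, start, pattern) in index
        for start in range(len(oligo) - len(pattern) + 1)
    )


# Hypothetical toy data: the oligo differs from the background's first
# four bases (ACGT) by a single substitution at the third position.
background = "ACGTTGCA"
oligo = "ACCT"

spaced = "1101"      # weight 3, one don't-care position
contiguous = "111"   # weight 3, no don't-care positions

spaced_hit = has_seed_hit(oligo, build_index(background, spaced), spaced)
contiguous_hit = has_seed_hit(
    oligo, build_index(background, contiguous), contiguous
)
print(spaced_hit, contiguous_hit)
```

In this toy case the mismatch falls on the don't-care position, so the spaced seed still flags the oligo as similar to the background while the contiguous seed of equal weight does not. How often each outcome is desirable is the trade-off between sensitivity and specificity that the abstract describes.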
format Text
id pubmed-2788383
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2788383 2009-12-04 An empirical study of choosing efficient discriminative seeds for oligonucleotide design Chung, Won-Hyoung Park, Seong-Bae BMC Genomics Proceedings BioMed Central 2009-12-03 /pmc/articles/PMC2788383/ /pubmed/19958494 http://dx.doi.org/10.1186/1471-2164-10-S3-S3 Text en Copyright ©2009 Chung and Park; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Proceedings