Fast comparison of DNA sequences by oligonucleotide profiling

BACKGROUND: The comparison of DNA sequences is a traditional problem in genomics and bioinformatics. Many new opportunities emerge due to the improvement of personal computers, allowing the implementation of novel strategies of analysis. FINDINGS: We describe a new program, called UVWORD, which dete...

Bibliographic Details
Main Authors:	Arnau, Vicente; Gallach, Miguel; Marín, Ignacio
Format:	Text
Language:	English
Published:	BioMed Central 2008
Subjects:	Technical Note
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2518268/ https://www.ncbi.nlm.nih.gov/pubmed/18710530 http://dx.doi.org/10.1186/1756-0500-1-5

author	Arnau, Vicente; Gallach, Miguel; Marín, Ignacio
collection	PubMed
description	BACKGROUND: The comparison of DNA sequences is a traditional problem in genomics and bioinformatics. Many new opportunities emerge due to the improvement of personal computers, allowing the implementation of novel strategies of analysis. FINDINGS: We describe a new program, called UVWORD, which determines the number of times that each DNA word present in one sequence (the target) is found in a second sequence (the source), a procedure that we have called oligonucleotide profiling. On a standard computer, the user may search for words ranging in size from k = 1 to k = 14 nucleotides. Average counts for groups of contiguous words may also be computed. The rate of analysis on standard computers ranges from 3.4 million words per second (k = 14) to 16 million words per second (1 ≤ k ≤ 8). This makes the fast screening of even the longest known DNA molecules feasible. DISCUSSION: We show that the combination of the ability to analyze relatively long words, which occur very rarely by chance, and the speed of the program makes it possible to perform novel types of screening, complementary to those provided by standard programs such as BLAST. This method can be used to determine oligonucleotide content, to characterize the distribution of repetitive sequences in chromosomes, to determine the evolutionary conservation of sequences in different species, and to establish regions of similar DNA among chromosomes or genomes, among other uses.
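The profiling procedure described in the abstract can be illustrated with a minimal Python sketch. This is not the UVWORD implementation, whose internals are not given in this record. The function names (`count_words`, `profile`, `window_averages`, `expected_random_count`) are hypothetical, and the sketch assumes a single strand, overlapping words, and plain uppercase string matching. The last function assumes a random source with four equally likely, independent bases, an idealization used only to show why long words rarely match by chance.

```python
from collections import Counter


def count_words(source: str, k: int) -> Counter:
    """Count every overlapping k-letter word in the source sequence."""
    source = source.upper()
    return Counter(source[i:i + k] for i in range(len(source) - k + 1))


def profile(target: str, source: str, k: int) -> list[int]:
    """For each k-word position in the target, return its count in the source."""
    counts = count_words(source, k)
    target = target.upper()
    # Counter returns 0 for words that never occur in the source.
    return [counts[target[i:i + k]] for i in range(len(target) - k + 1)]


def window_averages(values: list[int], window: int) -> list[float]:
    """Average the counts over consecutive, non-overlapping groups of words."""
    groups = (values[i:i + window] for i in range(0, len(values), window))
    return [sum(group) / len(group) for group in groups]


def expected_random_count(source_length: int, k: int) -> float:
    """Expected hits for one fixed k-word in a uniform random source (assumption)."""
    positions = max(source_length - k + 1, 0)
    return positions / 4 ** k


if __name__ == "__main__":
    counts = profile("ACGTAC", "ACGACGTT", k=3)
    print(counts)                      # counts for ACG, CGT, GTA, TAC
    print(window_averages(counts, 2))  # averages over groups of two words
    print(expected_random_count(1_000_000, 14))
```

Under the uniform-random assumption, a given 14-letter word is expected fewer than 0.004 times in a source of one million bases, so repeated hits at k = 14 are unlikely to arise by chance alone.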
format	Text
id	pubmed-2518268
institution	National Center for Biotechnology Information
language	English
publishDate	2008
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-2518268 2008-08-21 BMC Res Notes Technical Note BioMed Central 2008-02-28 Text en Copyright © 2008 Arnau et al; licensee BioMed Central Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Fast comparison of DNA sequences by oligonucleotide profiling
topic	Technical Note

Similar Items