
Individual sequences in large sets of gene sequences may be distinguished efficiently by combinations of shared sub-sequences


Bibliographic Details
Main Authors:	Gibbs, Mark J; Armstrong, John S; Gibbs, Adrian J
Format:	Text
Language:	English
Published:	BioMed Central 2005
Subjects:	Methodology Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1090557/ https://www.ncbi.nlm.nih.gov/pubmed/15817134 http://dx.doi.org/10.1186/1471-2105-6-90

collection	PubMed
description	BACKGROUND: Most current DNA diagnostic tests for identifying organisms use specific oligonucleotide probes that are complementary in sequence to, and hence only hybridise with, the DNA of one target species. By contrast, in traditional taxonomy, specimens are usually identified by 'dichotomous keys' that use combinations of characters shared by different members of the target set. Using one specific character for each target is the least efficient strategy for identification. Using combinations of shared bisectionally-distributed characters is much more efficient, and this strategy is most efficient when the characters separate the targets in a progressively binary way.

RESULTS: We have developed a practical method for finding minimal sets of sub-sequences that identify individual sequences and that could be targeted by combinations of probes, so that the efficient strategy of traditional taxonomic identification could be used in DNA diagnosis. The sizes of minimal sub-sequence sets depended mostly on sequence diversity, sub-sequence length, and the interactions between these parameters. We found that 201 distinct cytochrome oxidase subunit-1 (CO1) genes from moths (Lepidoptera) were distinguished using only 15 sub-sequences 20 nucleotides long, whereas only 8–10 sub-sequences 6–10 nucleotides long were required to distinguish the CO1 genes of 92 species from the 9 largest orders of insects.

CONCLUSION: The presence/absence of sub-sequences in a set of gene sequences can be used like the questions in a traditional dichotomous taxonomic key; hybridisation probes complementary to such sub-sequences should provide a very efficient means for identifying individual species, subtypes or genotypes. Sequence diversity and sub-sequence length are the major factors that determine the numbers of distinguishing sub-sequences in any set of sequences.
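The presence/absence idea in the abstract can be illustrated with a small sketch. This is not the authors' method, which the record does not describe in detail; it is a simple greedy heuristic, and the function names (`kmers`, `greedy_key`, `signature`) and the toy sequences are hypothetical. It also uses one general fact: k presence/absence characters give at most 2^k distinct patterns, so at least ceil(log2(n)) sub-sequences are needed to tell n sequences apart.

```python
from itertools import combinations
from math import ceil, log2


def kmers(seq, k):
    """Return the set of all length-k sub-sequences of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def greedy_key(seqs, k):
    """Greedily pick k-mers until every pair of sequences is separated.

    A k-mer separates a pair when it is present in exactly one of the two.
    Greedy selection is a heuristic: the result distinguishes all sequences
    but is not guaranteed to be the smallest possible set.
    """
    names = sorted(seqs)
    present = {n: kmers(seqs[n], k) for n in names}
    candidates = sorted(set().union(*present.values()))
    unresolved = set(combinations(names, 2))
    chosen = []
    while unresolved:
        def gain(kmer):
            # Number of still-unresolved pairs this k-mer would separate.
            return sum((kmer in present[a]) != (kmer in present[b])
                       for a, b in unresolved)

        best = max(candidates, key=gain, default=None)
        if best is None or gain(best) == 0:
            raise ValueError("some sequences share the same k-mer content")
        chosen.append(best)
        unresolved = {(a, b) for a, b in unresolved
                      if (best in present[a]) == (best in present[b])}
    return chosen


def signature(seq, key):
    """Presence/absence pattern of each chosen sub-sequence in seq."""
    return tuple(sub in seq for sub in key)


# Hypothetical toy sequences, not data from the article.
toy = {"A": "ACGTACGT", "B": "ACGTTTGA", "C": "GGCATACG", "D": "TTGACCGG"}
key = greedy_key(toy, 3)
for name in sorted(toy):
    print(name, signature(toy[name], key))

# Lower bounds for the set sizes quoted in the abstract.
print(ceil(log2(201)), ceil(log2(92)))
```

Under this bound, 201 sequences need at least 8 sub-sequences and 92 sequences need at least 7, which is consistent with the 15 and 8–10 reported in the abstract.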
format	Text
id	pubmed-1090557
institution	National Center for Biotechnology Information
language	English
publishDate	2005
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1090557 2005-05-07 BMC Bioinformatics Methodology Article BioMed Central 2005-04-08 /pmc/articles/PMC1090557/ /pubmed/15817134 http://dx.doi.org/10.1186/1471-2105-6-90 Text en Copyright © 2005 Gibbs et al; licensee BioMed Central Ltd.
title	Individual sequences in large sets of gene sequences may be distinguished efficiently by combinations of shared sub-sequences
topic	Methodology Article
