Efficient oligonucleotide probe selection for pan-genomic tiling arrays

Bibliographic Details
Main authors: Phillippy, Adam M; Deng, Xiangyu; Zhang, Wei; Salzberg, Steven L
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2753849/
https://www.ncbi.nlm.nih.gov/pubmed/19758451
http://dx.doi.org/10.1186/1471-2105-10-293
_version_ 1782172368074964992
author Phillippy, Adam M
Deng, Xiangyu
Zhang, Wei
Salzberg, Steven L
author_facet Phillippy, Adam M
Deng, Xiangyu
Zhang, Wei
Salzberg, Steven L
author_sort Phillippy, Adam M
collection PubMed
description BACKGROUND: Array comparative genomic hybridization is a fast and cost-effective method for detecting, genotyping, and comparing the genomic sequence of unknown bacterial isolates. This method, as with all microarray applications, requires adequate coverage of probes targeting the regions of interest. An unbiased tiling of probes across the entire length of the genome is the most flexible design approach. However, such a whole-genome tiling requires that the genome sequence is known in advance. For the accurate analysis of uncharacterized bacteria, an array must query a fully representative set of sequences from the species' pan-genome. Prior microarrays have included only a single strain per array or the conserved sequences of gene families. These arrays omit potentially important genes and sequence variants from the pan-genome.
RESULTS: This paper presents a new probe selection algorithm (PanArray) that can tile multiple whole genomes using a minimal number of probes. Unlike arrays built on clustered gene families, PanArray uses an unbiased, probe-centric approach that does not rely on annotations, gene clustering, or multi-alignments. Instead, probes are evenly tiled across all sequences of the pan-genome at a consistent level of coverage. To minimize the required number of probes, probes conserved across multiple strains in the pan-genome are selected first, and additional probes are used only where necessary to span polymorphic regions of the genome. The viability of the algorithm is demonstrated by array designs for seven different bacterial pan-genomes and, in particular, the design of a 385,000 probe array that fully tiles the genomes of 20 different Listeria monocytogenes strains with overlapping probes at greater than twofold coverage.
CONCLUSION: PanArray is an oligonucleotide probe selection algorithm for tiling multiple genome sequences using a minimal number of probes. It is capable of fully tiling all genomes of a species on a single microarray chip. These unique pan-genome tiling arrays provide maximum flexibility for the analysis of both known and uncharacterized strains.
format Text
id pubmed-2753849
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2753849 2009-09-30
Efficient oligonucleotide probe selection for pan-genomic tiling arrays
Phillippy, Adam M; Deng, Xiangyu; Zhang, Wei; Salzberg, Steven L
BMC Bioinformatics
Methodology Article
BioMed Central 2009-09-16
/pmc/articles/PMC2753849/ /pubmed/19758451 http://dx.doi.org/10.1186/1471-2105-10-293
Text en
Copyright ©2009 Phillippy et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methodology Article
Phillippy, Adam M
Deng, Xiangyu
Zhang, Wei
Salzberg, Steven L
Efficient oligonucleotide probe selection for pan-genomic tiling arrays
title Efficient oligonucleotide probe selection for pan-genomic tiling arrays
title_sort efficient oligonucleotide probe selection for pan-genomic tiling arrays
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2753849/
https://www.ncbi.nlm.nih.gov/pubmed/19758451
http://dx.doi.org/10.1186/1471-2105-10-293
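
Illustrative note on the abstract: the selection strategy summarized in the description field (choose probes conserved across strains first, then add probes only where a genome is not yet covered) can be sketched as a simple greedy procedure. The code below is a toy illustration under stated assumptions and is not the PanArray implementation: the function names select_probes and is_tiled, the slack parameter, and the example sequences are all hypothetical, probes are plain exact-match k-mers, and the sketch aims for onefold coverage only, whereas the abstract reports greater than twofold coverage.

```python
from collections import Counter


def select_probes(genomes, k, slack):
    """Greedy toy sketch: cover every base of every genome with k-mer probes,
    preferring probes that are already chosen or shared by many genomes.

    genomes -- list of sequence strings (hypothetical toy input)
    k       -- probe length
    slack   -- how far a probe may start before the first uncovered base
               (hypothetical tuning knob; must satisfy 0 <= slack < k)
    """
    if not 0 <= slack < k:
        raise ValueError("slack must satisfy 0 <= slack < k")

    # Conservation score: number of genomes that contain each k-mer.
    conservation = Counter()
    for g in genomes:
        conservation.update({g[i:i + k] for i in range(len(g) - k + 1)})

    selected = set()
    for g in genomes:
        last_start = len(g) - k
        if last_start < 0:
            continue  # genome is shorter than a single probe
        pos = 0  # first base of this genome that is not yet covered
        while pos < len(g):
            hi = min(pos, last_start)          # latest start that covers pos
            lo = min(max(0, pos - slack), hi)  # earliest start we allow
            # Rank candidates: already selected (costs nothing), then most
            # conserved, then rightmost (advances furthest along the genome).
            best = max(
                range(lo, hi + 1),
                key=lambda i: (g[i:i + k] in selected,
                               conservation[g[i:i + k]], i),
            )
            selected.add(g[best:best + k])
            pos = best + k
    return selected


def is_tiled(genome, probes, k):
    """Return True if every base of genome lies under some probe occurrence."""
    covered = [False] * len(genome)
    for i in range(len(genome) - k + 1):
        if genome[i:i + k] in probes:
            covered[i:i + k] = [True] * k
    return all(covered)


if __name__ == "__main__":
    # Two toy "strains" that share a prefix and differ at the 3' end.
    strains = ["ACGTTGCAAGCT", "ACGTTGCATTTT"]
    probes = select_probes(strains, k=4, slack=2)
    print(sorted(probes))
    print(all(is_tiled(s, probes, 4) for s in strains))
```

In this toy setting the shared prefix is tiled once and reused for both strains, so extra probes are spent only on the region where the strains differ. A real design would also have to handle reverse complements, probe melting temperature, and cross-hybridization, none of which is modeled here.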