Optimized design and assessment of whole genome tiling arrays

MOTIVATION: Recent advances in microarray technologies have made it feasible to interrogate whole genomes with tiling arrays and this technique is rapidly becoming one of the most important high-throughput functional genomics assays. For large mammalian genomes, analyzing oligonucleotide tiling arra...

Bibliographic Details
Main authors: Gräf, Stefan; Nielsen, Fiona G. G.; Kurtz, Stefan; Huynen, Martijn A.; Birney, Ewan; Stunnenberg, Henk; Flicek, Paul
Format: Online Article Text
Language: English
Published: 2007
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5892713/
https://www.ncbi.nlm.nih.gov/pubmed/17646297
http://dx.doi.org/10.1093/bioinformatics/btm200
_version_ 1783313201838948352
author Gräf, Stefan
Nielsen, Fiona G. G.
Kurtz, Stefan
Huynen, Martijn A.
Birney, Ewan
Stunnenberg, Henk
Flicek, Paul
collection PubMed
description MOTIVATION: Recent advances in microarray technologies have made it feasible to interrogate whole genomes with tiling arrays and this technique is rapidly becoming one of the most important high-throughput functional genomics assays. For large mammalian genomes, analyzing oligonucleotide tiling array data is complicated by the presence of non-unique sequences on the array, which increases the overall noise in the data and may lead to false positive results due to cross-hybridization. The ability to create custom microarrays using maskless array synthesis has led us to consider ways to optimize array design characteristics for improving data quality and analysis. We have identified a number of design parameters to be optimized including uniqueness of the probe sequences within the whole genome, melting temperature and self-hybridization potential. RESULTS: We introduce the uniqueness score, U, a novel quality measure for oligonucleotide probes and present a method to quickly compute it. We show that U is equivalent to the number of shortest unique substrings in the probe and describe an efficient greedy algorithm to design mammalian whole genome tiling arrays using probes that maximize U. Using the mouse genome, we demonstrate how several optimizations influence the tiling array design characteristics. With a sensible set of parameters, our designs cover 78% of the mouse genome including many regions previously considered ‘untilable’ due to the presence of repetitive sequence. Finally, we compare our whole genome tiling array designs with commercially available designs. AVAILABILITY: Source code is available under an open source license from http://www.ebi.ac.uk/~graef/arraydesign/
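The abstract states that U equals the number of shortest unique substrings in a probe, but it does not give the formal definition or the fast computation method. The following is a naive sketch under stated assumptions, not the authors' implementation: a substring is taken to be "unique" if it occurs exactly once in the genome, and "shortest" if no proper substring of it is also unique. The function names `count_occurrences` and `uniqueness_score` and the toy sequences are hypothetical, and the brute-force search is only practical for tiny inputs.

```python
def count_occurrences(text: str, pattern: str) -> int:
    """Count occurrences of pattern in text, allowing overlaps."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def uniqueness_score(probe: str, genome: str) -> int:
    """Count minimal unique substrings of probe (assumed reading of U).

    Assumption: "unique" means exactly one occurrence in genome.
    """
    score = 0
    n = len(probe)
    for i in range(n):
        # Extend from position i until the substring becomes unique.
        for j in range(i + 1, n + 1):
            if count_occurrences(genome, probe[i:j]) == 1:
                # probe[i:j-1] is not unique by construction, so the
                # substring is minimal only if probe[i+1:j] is not
                # unique either.
                shorter = probe[i + 1:j]
                if not shorter or count_occurrences(genome, shorter) != 1:
                    score += 1
                break
    return score


# Toy example (hypothetical sequences, not from the paper).
genome = "ACGACGTT"
print(uniqueness_score("CGTT", genome))  # 2: "GT" and "TT"
print(uniqueness_score("ACG", genome))   # 0: every substring is repeated
```

A probe drawn from a repeated region scores 0 here, while a probe spanning several distinct unique stretches scores higher, which matches the intuition of preferring probes that maximize U. A whole-genome design would need an index structure rather than repeated string scans; the source code linked in the AVAILABILITY statement is the reference for the actual method.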
format Online
Article
Text
id pubmed-5892713
institution National Center for Biotechnology Information
language English
publishDate 2007
record_format MEDLINE/PubMed
spelling pubmed-5892713 2018-04-10 Optimized design and assessment of whole genome tiling arrays Gräf, Stefan Nielsen, Fiona G. G. Kurtz, Stefan Huynen, Martijn A. Birney, Ewan Stunnenberg, Henk Flicek, Paul Bioinformatics Article 2007-07-01 /pmc/articles/PMC5892713/ /pubmed/17646297 http://dx.doi.org/10.1093/bioinformatics/btm200 Text en http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Optimized design and assessment of whole genome tiling arrays
topic Article