A fully scalable online pre-processing algorithm for short oligonucleotide microarray atlases

Rapid accumulation of large and standardized microarray data collections is opening up novel opportunities for holistic characterization of genome function. The limited scalability of current preprocessing techniques has, however, formed a bottleneck for full utilization of these data resources. Alt...

Bibliographic Details
Main authors: Lahti, Leo; Torrente, Aurora; Elo, Laura L.; Brazma, Alvis; Rung, Johan
Format: Online Article Text
Language: English
Published: Oxford University Press 2013
Subjects: Methods Online
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3664815/
https://www.ncbi.nlm.nih.gov/pubmed/23563154
http://dx.doi.org/10.1093/nar/gkt229
author Lahti, Leo
Torrente, Aurora
Elo, Laura L.
Brazma, Alvis
Rung, Johan
author_sort Lahti, Leo
collection PubMed
description Rapid accumulation of large and standardized microarray data collections is opening up novel opportunities for holistic characterization of genome function. The limited scalability of current preprocessing techniques has, however, formed a bottleneck for full utilization of these data resources. Although short oligonucleotide arrays constitute a major source of genome-wide profiling data, scalable probe-level techniques have been available only for few platforms based on pre-calculated probe effects from restricted reference training sets. To overcome these key limitations, we introduce a fully scalable online-learning algorithm for probe-level analysis and pre-processing of large microarray atlases involving tens of thousands of arrays. In contrast to the alternatives, our algorithm scales up linearly with respect to sample size and is applicable to all short oligonucleotide platforms. The model can use the most comprehensive data collections available to date to pinpoint individual probes affected by noise and biases, providing tools to guide array design and quality control. This is the only available algorithm that can learn probe-level parameters based on sequential hyperparameter updates at small consecutive batches of data, thus circumventing the extensive memory requirements of the standard approaches and opening up novel opportunities to take full advantage of contemporary microarray collections.
format Online
Article
Text
id pubmed-3664815
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3664815 2013-05-28 Nucleic Acids Res Methods Online Oxford University Press 2013-05 2013-04-05 /pmc/articles/PMC3664815/ /pubmed/23563154 http://dx.doi.org/10.1093/nar/gkt229 Text en © The Author(s) 2013. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A fully scalable online pre-processing algorithm for short oligonucleotide microarray atlases
topic Methods Online