A Bootstrap Based Measure Robust to the Choice of Normalization Methods for Detecting Rhythmic Features in High Dimensional Data

Motivation: Gene-expression data obtained from high-throughput technologies are subject to various sources of noise, so the raw data are pre-processed before being formally analyzed. Normalization of the data is a key pre-processing step, since it removes systematic variation across arrays. T...

Bibliographic Details
Main Authors: Larriba, Yolanda; Rueda, Cristina; Fernández, Miguel A.; Peddada, Shyamal D.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2018
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5801422/
https://www.ncbi.nlm.nih.gov/pubmed/29456555
http://dx.doi.org/10.3389/fgene.2018.00024
collection PubMed
description Motivation: Gene-expression data obtained from high-throughput technologies are subject to various sources of noise, so the raw data are pre-processed before being formally analyzed. Normalization is a key pre-processing step, since it removes systematic variation across arrays. Numerous normalization methods are available in the literature. In our experience, in the context of oscillatory systems such as the cell cycle or the circadian clock, the choice of normalization method may substantially affect whether a gene is determined to be rhythmic. Thus the rhythmicity of a gene can be purely an artifact of how the data were normalized. Since the determination of rhythmic genes is an important component of modern toxicological and pharmacological studies, it is important to identify truly rhythmic genes whose detection is robust to the choice of normalization method.
Results: In this paper we introduce a rhythmicity measure and a bootstrap methodology to detect rhythmic genes in an oscillatory system. Although the proposed methodology can be used for any high-throughput gene-expression data, we illustrate it using several publicly available circadian clock microarray gene-expression datasets. We demonstrate that the choice of normalization method has very little effect on the proposed methodology. Specifically, for any pair of normalization methods considered in this paper, the resulting values of the rhythmicity measure are highly correlated. This suggests that the proposed measure is robust to the choice of normalization method. Consequently, the rhythmicity of a gene is potentially not a mere artifact of the normalization method used. Lastly, as demonstrated in the paper, the proposed bootstrap methodology can also be used to simulate data for genes participating in an oscillatory system using a reference dataset.
Availability: User-friendly code implemented in R can be downloaded from http://www.eio.uva.es/~miguel/robustdetectionprocedure.html
id pubmed-5801422
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-5801422 2018-02-16 Front Genet Genetics Frontiers Media S.A. 2018-02-02 /pmc/articles/PMC5801422/ /pubmed/29456555 http://dx.doi.org/10.3389/fgene.2018.00024 Text en
Copyright © 2018 Larriba, Rueda, Fernández and Peddada. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title A Bootstrap Based Measure Robust to the Choice of Normalization Methods for Detecting Rhythmic Features in High Dimensional Data
topic Genetics