DBNorm: normalizing high-density oligonucleotide microarray data based on distributions

BACKGROUND: Data from patients with rare diseases is often produced using different platforms and probe sets because patients are widely distributed in space and time. Aggregating such data requires a method of normalization that makes patient records comparable. RESULTS: This paper proposed DBNorm,...

Bibliographic Details
Main authors:	Meng, Qinxue; Catchpoole, Daniel; Skillicorn, David; Kennedy, Paul J.
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2017
Subjects:	Software
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5706403/ https://www.ncbi.nlm.nih.gov/pubmed/29187149 http://dx.doi.org/10.1186/s12859-017-1912-5

author	Meng, Qinxue; Catchpoole, Daniel; Skillicorn, David; Kennedy, Paul J.
collection	PubMed
description	BACKGROUND: Data from patients with rare diseases is often produced using different platforms and probe sets because patients are widely distributed in space and time. Aggregating such data requires a normalization method that makes patient records comparable. RESULTS: This paper proposes DBNorm, an algorithm implemented as an R package that normalizes arbitrarily distributed data to a common, comparable form. Specifically, DBNorm merges data distributions by fitting a function to each of them and using the probability of each element under its fitted distribution to merge it into a global distribution. DBNorm provides polynomial, Fourier and Gaussian fitting functions and also allows users to define their own fitting functions if required. CONCLUSIONS: DBNorm is compared with z-score, average difference, quantile normalization and ComBat on a number of self-generated and publicly available microarray datasets, using statistics, visualization and, where class labels are known, classification. The experimental results show that DBNorm achieves better normalization results than these conventional methods. Finally, the approach has the potential to be applicable outside bioinformatics analysis. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1912-5) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-5706403
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-5706403 2017-12-06 DBNorm: normalizing high-density oligonucleotide microarray data based on distributions Meng, Qinxue; Catchpoole, Daniel; Skillicorn, David; Kennedy, Paul J. BMC Bioinformatics Software BioMed Central 2017-11-29 /pmc/articles/PMC5706403/ /pubmed/29187149 http://dx.doi.org/10.1186/s12859-017-1912-5 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	DBNorm: normalizing high-density oligonucleotide microarray data based on distributions
topic	Software

DBNorm: normalizing high-density oligonucleotide microarray data based on distributions
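
The abstract describes the core idea as fitting a function to each data distribution and using each element's probability under its fitted distribution to place it in a global distribution. The following is a minimal Python sketch of that idea only, not the DBNorm R package or its API. It assumes a Gaussian fit for every dataset and assumes the global distribution is a Gaussian fitted to the pooled values; the function names `fit_gaussian` and `normalize_to_global` and the sample values are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def fit_gaussian(values):
    """Fit a Gaussian to one dataset (assumed stand-in for a fitted function)."""
    return NormalDist(mean(values), stdev(values))


def normalize_to_global(datasets, eps=1e-12):
    """Map each dataset onto a shared distribution via cumulative probabilities.

    Assumption: the global distribution is a Gaussian fitted to all pooled values.
    """
    fitted = [fit_gaussian(d) for d in datasets]
    pooled = [x for d in datasets for x in d]
    target = fit_gaussian(pooled)

    normalized = []
    for data, dist in zip(datasets, fitted):
        out = []
        for x in data:
            # Probability of the element under its own fitted distribution,
            # clamped away from 0 and 1 so the inverse CDF stays defined.
            p = min(max(dist.cdf(x), eps), 1.0 - eps)
            # Value with the same cumulative probability in the global distribution.
            out.append(target.inv_cdf(p))
        normalized.append(out)
    return normalized


# Hypothetical measurements of similar samples on two platforms with different scales.
platform_a = [2.1, 2.4, 2.9, 3.3, 3.8]
platform_b = [210.0, 245.0, 288.0, 331.0, 379.0]
result = normalize_to_global([platform_a, platform_b])
```

After the mapping, both lists lie on the same scale and keep their original ordering, so records from the two platforms can be compared directly. With Gaussian fits on both sides the mapping reduces to a shift and rescale; a polynomial or Fourier fit, as mentioned in the abstract, would make it nonlinear.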
