Correlation and Variation-Based Method for Identifying Reference Genes from Large Datasets
BACKGROUND: Reference genes are assumed to be stably expressed under most circumstances. Previous studies have shown that common algorithms for identifying potential reference genes, such as NormFinder, geNorm, and BestKeeper, are not suitable for microarray-sized datasets. The aim of this s...
Main authors: | Chan, Oliver Yuan Wei; Keng, Bryan Ming Hsun; Ling, Maurice Han Tong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Electronic physician 2014 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4324282/ https://www.ncbi.nlm.nih.gov/pubmed/25763136 http://dx.doi.org/10.14661/2014.719-727 |
_version_ | 1782356668663726080 |
---|---|
author | Chan, Oliver Yuan Wei; Keng, Bryan Ming Hsun; Ling, Maurice Han Tong |
author_sort | Chan, Oliver Yuan Wei |
collection | PubMed |
description | BACKGROUND: Reference genes are assumed to be stably expressed under most circumstances. Previous studies have shown that common algorithms for identifying potential reference genes, such as NormFinder, geNorm, and BestKeeper, are not suitable for microarray-sized datasets. The aim of this study was to evaluate existing methods and develop new methods for identifying reference genes from microarray datasets. METHODS: We evaluated the correlation between the outputs of seven published methods for identifying reference genes, including NormFinder, geNorm, and BestKeeper, using subsets of published microarray data. Based on these results, seven novel combinations of published methods were evaluated. RESULTS: NormFinder’s and geNorm’s indices were highly correlated (R² = 0.987, P < 0.0001), which is consistent with the findings of previous studies. However, the correlations between NormFinder’s and BestKeeper’s indices (R² = 0.489, 0.01 < P < 0.05) and between NormFinder’s index and the coefficient of variation (CV) (R² = 0.483, 0.01 < P < 0.05) were lower. We developed two novel methods that correlated highly with NormFinder (R² = 0.796 for both methods, P < 0.0001). In addition, the computational time required by the two novel methods scaled linearly with the size of the dataset. CONCLUSION: Our findings suggest that both of our novel methods can be used as alternatives to NormFinder, geNorm, and BestKeeper for identifying reference genes from large datasets. These methods were implemented as a tool, OLIgonucleotide Variable Expression Ranker (OLIVER), which can be downloaded from http://sourceforge.net/projects/bactome/files/OLIVER/OLIVER_1.zip. |
format | Online Article Text |
id | pubmed-4324282 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Electronic physician |
record_format | MEDLINE/PubMed |
spelling | pubmed-4324282 2015-03-11 Correlation and Variation-Based Method for Identifying Reference Genes from Large Datasets Chan, Oliver Yuan Wei; Keng, Bryan Ming Hsun; Ling, Maurice Han Tong Electron Physician Articles BACKGROUND: Reference genes are assumed to be stably expressed under most circumstances. Previous studies have shown that common algorithms for identifying potential reference genes, such as NormFinder, geNorm, and BestKeeper, are not suitable for microarray-sized datasets. The aim of this study was to evaluate existing methods and develop new methods for identifying reference genes from microarray datasets. METHODS: We evaluated the correlation between the outputs of seven published methods for identifying reference genes, including NormFinder, geNorm, and BestKeeper, using subsets of published microarray data. Based on these results, seven novel combinations of published methods were evaluated. RESULTS: NormFinder’s and geNorm’s indices were highly correlated (R² = 0.987, P < 0.0001), which is consistent with the findings of previous studies. However, the correlations between NormFinder’s and BestKeeper’s indices (R² = 0.489, 0.01 < P < 0.05) and between NormFinder’s index and the coefficient of variation (CV) (R² = 0.483, 0.01 < P < 0.05) were lower. We developed two novel methods that correlated highly with NormFinder (R² = 0.796 for both methods, P < 0.0001). In addition, the computational time required by the two novel methods scaled linearly with the size of the dataset. CONCLUSION: Our findings suggest that both of our novel methods can be used as alternatives to NormFinder, geNorm, and BestKeeper for identifying reference genes from large datasets. These methods were implemented as a tool, OLIgonucleotide Variable Expression Ranker (OLIVER), which can be downloaded from http://sourceforge.net/projects/bactome/files/OLIVER/OLIVER_1.zip. Electronic physician 2014-02-01 /pmc/articles/PMC4324282/ /pubmed/25763136 http://dx.doi.org/10.14661/2014.719-727 Text en © 2014 The Authors. This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License (http://creativecommons.org/licenses/by-nc-nd/3.0/), which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made. |
title | Correlation and Variation-Based Method for Identifying Reference Genes from Large Datasets |
topic | Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4324282/ https://www.ncbi.nlm.nih.gov/pubmed/25763136 http://dx.doi.org/10.14661/2014.719-727 |
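The abstract in this record compares methods by correlating their outputs (reported as R²) and mentions the coefficient of variation (CV) as one stability measure. As a rough illustration only, the sketch below ranks genes by CV and computes the squared Pearson correlation between two lists of scores. It is not the OLIVER implementation and does not reproduce NormFinder, geNorm, or BestKeeper; the function names and the toy expression values are hypothetical and chosen for this example.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean of one gene's values."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean expression is zero; CV is undefined")
    return pstdev(values) / m


def rank_by_cv(expression):
    """Gene names sorted from lowest CV (most stable) to highest CV."""
    return sorted(expression, key=lambda g: coefficient_of_variation(expression[g]))


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical toy data: gene name -> expression values across four samples.
expression = {
    "geneA": [100.0, 102.0, 98.0, 100.0],
    "geneB": [50.0, 80.0, 20.0, 50.0],
    "geneC": [10.0, 11.0, 9.0, 10.0],
}

if __name__ == "__main__":
    print(rank_by_cv(expression))
    print(r_squared([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

Each gene's CV is computed in a single pass over its values, so the ranking step grows linearly with the number of expression values plus a sort over gene names; this says nothing about how the tool described in the record achieves its reported scaling.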