Principal component analysis based unsupervised feature extraction applied to budding yeast temporally periodic gene expression
BACKGROUND: The recently proposed principal component analysis (PCA) based unsupervised feature extraction (FE) has successfully been applied to various bioinformatics problems ranging from biomarker identification to the screening of disease causing genes using gene expression/epigenetic profiles....
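The idea summarized above, extracting periodic genes without assuming a period, can be illustrated with a minimal sketch on synthetic data. This is not the author's implementation: the toy data, the function names (`make_toy_data`, `pca_unsupervised_fe`, `benjamini_hochberg`), the choice of two principal components, and the gene-selection rule (a chi-squared outlier test on the gene scores with Benjamini-Hochberg adjustment) are all assumptions made for illustration and are not taken from this record.

```python
import numpy as np


def make_toy_data(seed=0, n_genes=1000, n_periodic=50, n_time=40, period=10.0):
    """Hypothetical toy data: the first n_periodic genes oscillate, the rest are noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_time)
    x = rng.normal(scale=0.5, size=(n_genes, n_time))
    phase = rng.uniform(0.0, 2.0 * np.pi, size=n_periodic)
    x[:n_periodic] += 2.0 * np.sin(2.0 * np.pi * t / period + phase[:, None])
    return x, t


def benjamini_hochberg(p):
    """Benjamini-Hochberg adjusted p-values for a 1-D array of raw p-values."""
    n = p.size
    order = np.argsort(p)
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, walking from the largest p-value downwards.
    adjusted = np.minimum.accumulate(ranked[::-1])[::-1]
    out = np.empty(n)
    out[order] = np.minimum(adjusted, 1.0)
    return out


def pca_unsupervised_fe(x, alpha=0.01):
    """Select genes (rows of x) that are outliers on the first two PCs.

    Assumed selection rule, not taken from the record: standardized gene
    scores on two PCs are treated as chi-squared with 2 degrees of freedom.
    """
    # Center each gene's profile across time points.
    centered = x - x.mean(axis=1, keepdims=True)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :2] * s[:2]  # gene scores on PC1 and PC2
    loadings = vt[:2]          # time-point loadings of PC1 and PC2
    scaled = (scores - scores.mean(axis=0)) / scores.std(axis=0)
    stat = (scaled ** 2).sum(axis=1)
    # Survival function of chi-squared with 2 d.o.f. is exp(-x / 2).
    p = np.exp(-stat / 2.0)
    selected = np.flatnonzero(benjamini_hochberg(p) < alpha)
    return selected, loadings


x, t = make_toy_data()
selected, loadings = pca_unsupervised_fe(x)
print(len(selected), "genes selected")
```

Note that `pca_unsupervised_fe` never receives the period: the oscillation shows up in the time-point loadings, and genes are picked only because their scores are outliers. A sinusoidal fit, by contrast, would need the period supplied or searched over.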
Main author: | Taguchi, Y-h |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2016 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4928327/ https://www.ncbi.nlm.nih.gov/pubmed/27366210 http://dx.doi.org/10.1186/s13040-016-0101-9 |
_version_ | 1782440418973057024 |
---|---|
author | Taguchi, Y-h |
collection | PubMed |
description | BACKGROUND: The recently proposed principal component analysis (PCA) based unsupervised feature extraction (FE) has successfully been applied to various bioinformatics problems ranging from biomarker identification to the screening of disease-causing genes using gene expression/epigenetic profiles. However, the conditions required for its successful use and the mechanisms by which it outperforms supervised methods are unknown, because PCA based unsupervised FE has only been applied to challenging (i.e., not well known) problems. RESULTS: In this study, PCA based unsupervised FE was applied to an extensively studied organism, i.e., budding yeast. When applied to two gene expression profiles expected to be temporally periodic, yeast metabolic cycle (YMC) and yeast cell division cycle (YCDC), PCA based unsupervised FE outperformed simple but powerful conventional methods based on sinusoidal fitting in several respects: (i) feasible biological term enrichment without assuming periodicity for YMC; (ii) identification of periodic profiles whose period was half as long as the cell division cycle for YMC; and (iii) the identification of no more than 37 genes associated with the enrichment of biological terms related to cell division cycle for the integrated analysis of seven YCDC profiles, for which sinusoidal fittings failed. The explanation for the differences between the methods and the necessary conditions were determined by comparing PCA based unsupervised FE with fittings to various periodic (artificial, thus pre-defined) profiles. Furthermore, four popular unsupervised clustering algorithms applied to YMC were not as successful as PCA based unsupervised FE. CONCLUSIONS: PCA based unsupervised FE is a useful and effective unsupervised method to investigate YMC and YCDC. This study identified why the unsupervised method without pre-judged criteria outperformed supervised methods requiring human-defined criteria. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13040-016-0101-9) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4928327 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4928327 2016-06-30 Taguchi, Y-h BioData Min Research BioMed Central 2016-06-29 /pmc/articles/PMC4928327/ /pubmed/27366210 http://dx.doi.org/10.1186/s13040-016-0101-9 Text en © The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Taguchi, Y-h Principal component analysis based unsupervised feature extraction applied to budding yeast temporally periodic gene expression |
title | Principal component analysis based unsupervised feature extraction applied to budding yeast temporally periodic gene expression |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4928327/ https://www.ncbi.nlm.nih.gov/pubmed/27366210 http://dx.doi.org/10.1186/s13040-016-0101-9 |