Principal component analysis based unsupervised feature extraction applied to budding yeast temporally periodic gene expression

BACKGROUND: The recently proposed principal component analysis (PCA) based unsupervised feature extraction (FE) has successfully been applied to various bioinformatics problems ranging from biomarker identification to the screening of disease causing genes using gene expression/epigenetic profiles....

Bibliographic Details
Main Author: Taguchi, Y-h
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4928327/
https://www.ncbi.nlm.nih.gov/pubmed/27366210
http://dx.doi.org/10.1186/s13040-016-0101-9
author Taguchi, Y-h
collection PubMed
description BACKGROUND: The recently proposed principal component analysis (PCA) based unsupervised feature extraction (FE) has successfully been applied to various bioinformatics problems, ranging from biomarker identification to the screening of disease-causing genes using gene expression/epigenetic profiles. However, the conditions required for its successful use and the mechanisms by which it outperforms supervised methods are unknown, because PCA based unsupervised FE has only been applied to challenging (i.e., not well known) problems. RESULTS: In this study, PCA based unsupervised FE was applied to an extensively studied organism, i.e., budding yeast. When applied to two gene expression profiles expected to be temporally periodic, yeast metabolic cycle (YMC) and yeast cell division cycle (YCDC), PCA based unsupervised FE outperformed simple but powerful conventional methods based on sinusoidal fitting in several respects: (i) feasible biological term enrichment without assuming periodicity for YMC; (ii) identification of periodic profiles whose period was half as long as the cell division cycle for YMC; and (iii) the identification of no more than 37 genes associated with the enrichment of biological terms related to the cell division cycle for the integrated analysis of seven YCDC profiles, for which sinusoidal fittings failed. The explanation for the differences between the methods and the conditions required were determined by comparing PCA based unsupervised FE with fittings to various periodic (artificial, thus pre-defined) profiles. Furthermore, four popular unsupervised clustering algorithms applied to YMC were not as successful as PCA based unsupervised FE. CONCLUSIONS: PCA based unsupervised FE is a useful and effective unsupervised method to investigate YMC and YCDC. This study identified why the unsupervised method without pre-judged criteria outperformed supervised methods requiring human-defined criteria. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13040-016-0101-9) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4928327
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4928327 2016-06-30 Principal component analysis based unsupervised feature extraction applied to budding yeast temporally periodic gene expression Taguchi, Y-h BioData Min Research BioMed Central 2016-06-29 /pmc/articles/PMC4928327/ /pubmed/27366210 http://dx.doi.org/10.1186/s13040-016-0101-9 Text en © The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Principal component analysis based unsupervised feature extraction applied to budding yeast temporally periodic gene expression
topic Research