Learning Biomarkers of Pluripotent Stem Cells in Mouse

Pluripotent stem cells are able to self-renew, and to differentiate into all adult cell types. Many studies report data describing these cells, and characterize them in molecular terms. Machine learning yields classifiers that can accurately identify pluripotent stem cells, but there is a lack of st...

Bibliographic Details
Main Authors: Scheubert, Lena; Schmidt, Rainer; Repsilber, Dirk; Luštrek, Mitja; Fuellen, Georg
Format: Online Article Text
Language: English
Published: Oxford University Press 2011
Subjects: Full Papers
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3158465/
https://www.ncbi.nlm.nih.gov/pubmed/21791477
http://dx.doi.org/10.1093/dnares/dsr016
author Scheubert, Lena
Schmidt, Rainer
Repsilber, Dirk
Luštrek, Mitja
Fuellen, Georg
collection PubMed
description Pluripotent stem cells are able to self-renew, and to differentiate into all adult cell types. Many studies report data describing these cells, and characterize them in molecular terms. Machine learning yields classifiers that can accurately identify pluripotent stem cells, but there is a lack of studies yielding minimal sets of best biomarkers (genes/features). We assembled gene expression data of pluripotent stem cells and non-pluripotent cells from the mouse. After normalization and filtering, we applied machine learning, classifying samples into pluripotent and non-pluripotent with high cross-validated accuracy. Furthermore, to identify minimal sets of best biomarkers, we used three methods: information gain, random forests and a wrapper of genetic algorithm and support vector machine (GA/SVM). We demonstrate that the GA/SVM biomarkers work best in combination with each other; pathway and enrichment analyses show that they cover the widest variety of processes implicated in pluripotency. The GA/SVM wrapper yields best biomarkers, no matter which classification method is used. The consensus best biomarker based on the three methods is Tet1, implicated in pluripotency just recently. The best biomarker based on the GA/SVM wrapper approach alone is Fam134b, possibly a missing link between pluripotency and some standard surface markers of unknown function processed by the Golgi apparatus.
format Online
Article
Text
id pubmed-3158465
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3158465 2011-08-19 DNA Res Full Papers Oxford University Press 2011-08 2011-07-26 /pmc/articles/PMC3158465/ /pubmed/21791477 http://dx.doi.org/10.1093/dnares/dsr016 Text en © The Author 2011. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
http://creativecommons.org/licenses/by-nc/2.5/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Learning Biomarkers of Pluripotent Stem Cells in Mouse
topic Full Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3158465/
https://www.ncbi.nlm.nih.gov/pubmed/21791477
http://dx.doi.org/10.1093/dnares/dsr016