A data analysis framework for biomedical big data: Application on mesoderm differentiation of human pluripotent stem cells

The development of high-throughput biomolecular technologies has resulted in generation of vast omics data at an unprecedented rate. This is transforming biomedical research into a big data discipline, where the main challenges relate to the analysis and interpretation of data into new biological kn...

Bibliographic Details
Main Authors: Ulfenborg, Benjamin; Karlsson, Alexander; Riveiro, Maria; Améen, Caroline; Åkesson, Karolina; Andersson, Christian X.; Sartipy, Peter; Synnergren, Jane
Format: Online Article Text
Language: English
Published: Public Library of Science 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5487013/
https://www.ncbi.nlm.nih.gov/pubmed/28654683
http://dx.doi.org/10.1371/journal.pone.0179613
_version_ 1783246372665819136
author Ulfenborg, Benjamin
Karlsson, Alexander
Riveiro, Maria
Améen, Caroline
Åkesson, Karolina
Andersson, Christian X.
Sartipy, Peter
Synnergren, Jane
author_facet Ulfenborg, Benjamin
Karlsson, Alexander
Riveiro, Maria
Améen, Caroline
Åkesson, Karolina
Andersson, Christian X.
Sartipy, Peter
Synnergren, Jane
author_sort Ulfenborg, Benjamin
collection PubMed
description The development of high-throughput biomolecular technologies has resulted in the generation of vast omics data at an unprecedented rate. This is transforming biomedical research into a big data discipline, where the main challenges relate to the analysis and interpretation of data into new biological knowledge. The aim of this study was to develop a framework for biomedical big data analytics, and apply it to the analysis of transcriptomics time series data from early differentiation of human pluripotent stem cells towards the mesoderm and cardiac lineages. To this end, transcriptome profiling by microarray was performed on differentiating human pluripotent stem cells sampled on eleven consecutive days. The gene expression data was analyzed using the five-stage analysis framework proposed in this study, including data preparation, exploratory data analysis, confirmatory analysis, biological knowledge discovery, and visualization of the results. Clustering analysis revealed several distinct expression profiles during differentiation. Genes with an early transient response were strongly related to embryonic and mesendoderm development, for example CER1 and NODAL. Pluripotency genes, such as NANOG and SOX2, exhibited substantial downregulation shortly after onset of differentiation. Rapid induction of genes related to metal ion response, cardiac tissue development, and muscle contraction was observed around days five and six. Several transcription factors were identified as potential regulators of these processes, e.g. POU1F1, TCF4 and TBP for muscle contraction genes. Pathway analysis revealed temporal activity of several signaling pathways, for example the inhibition of WNT signaling on day 2 and its reactivation on day 4. This study provides a comprehensive characterization of biological events and key regulators of the early differentiation of human pluripotent stem cells towards the mesoderm and cardiac lineages. The proposed analysis framework can be used to structure data analysis in future research, both in stem cell differentiation and, more generally, in biomedical big data analytics.
format Online
Article
Text
id pubmed-5487013
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-54870132017-07-11 A data analysis framework for biomedical big data: Application on mesoderm differentiation of human pluripotent stem cells Ulfenborg, Benjamin Karlsson, Alexander Riveiro, Maria Améen, Caroline Åkesson, Karolina Andersson, Christian X. Sartipy, Peter Synnergren, Jane PLoS One Research Article The development of high-throughput biomolecular technologies has resulted in generation of vast omics data at an unprecedented rate. This is transforming biomedical research into a big data discipline, where the main challenges relate to the analysis and interpretation of data into new biological knowledge. The aim of this study was to develop a framework for biomedical big data analytics, and apply it for analyzing transcriptomics time series data from early differentiation of human pluripotent stem cells towards the mesoderm and cardiac lineages. To this end, transcriptome profiling by microarray was performed on differentiating human pluripotent stem cells sampled at eleven consecutive days. The gene expression data was analyzed using the five-stage analysis framework proposed in this study, including data preparation, exploratory data analysis, confirmatory analysis, biological knowledge discovery, and visualization of the results. Clustering analysis revealed several distinct expression profiles during differentiation. Genes with an early transient response were strongly related to embryonic- and mesendoderm development, for example CER1 and NODAL. Pluripotency genes, such as NANOG and SOX2, exhibited substantial downregulation shortly after onset of differentiation. Rapid induction of genes related to metal ion response, cardiac tissue development, and muscle contraction were observed around day five and six. Several transcription factors were identified as potential regulators of these processes, e.g. POU1F1, TCF4 and TBP for muscle contraction genes. 
Pathway analysis revealed temporal activity of several signaling pathways, for example the inhibition of WNT signaling on day 2 and its reactivation on day 4. This study provides a comprehensive characterization of biological events and key regulators of the early differentiation of human pluripotent stem cells towards the mesoderm and cardiac lineages. The proposed analysis framework can be used to structure data analysis in future research, both in stem cell differentiation, and more generally, in biomedical big data analytics. Public Library of Science 2017-06-27 /pmc/articles/PMC5487013/ /pubmed/28654683 http://dx.doi.org/10.1371/journal.pone.0179613 Text en © 2017 Ulfenborg et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Ulfenborg, Benjamin
Karlsson, Alexander
Riveiro, Maria
Améen, Caroline
Åkesson, Karolina
Andersson, Christian X.
Sartipy, Peter
Synnergren, Jane
A data analysis framework for biomedical big data: Application on mesoderm differentiation of human pluripotent stem cells
title A data analysis framework for biomedical big data: Application on mesoderm differentiation of human pluripotent stem cells
title_full A data analysis framework for biomedical big data: Application on mesoderm differentiation of human pluripotent stem cells
title_fullStr A data analysis framework for biomedical big data: Application on mesoderm differentiation of human pluripotent stem cells
title_full_unstemmed A data analysis framework for biomedical big data: Application on mesoderm differentiation of human pluripotent stem cells
title_short A data analysis framework for biomedical big data: Application on mesoderm differentiation of human pluripotent stem cells
title_sort data analysis framework for biomedical big data: application on mesoderm differentiation of human pluripotent stem cells
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5487013/
https://www.ncbi.nlm.nih.gov/pubmed/28654683
http://dx.doi.org/10.1371/journal.pone.0179613