A machine learning approach for the identification of key markers involved in brain development from single-cell transcriptomic data
BACKGROUND: The ability to sequence the transcriptomes of single cells using single-cell RNA-seq technologies represents a paradigm shift: scientists can now investigate the complex biology of a heterogeneous population of cells one cell at a time....
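Main authors: | Hu, Yongli; Hase, Takeshi; Li, Hui Peng; Prabhakar, Shyam; Kitano, Hiroaki; Ng, See Kiong; Ghosh, Samik; Wee, Lawrence Jin Kiat |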
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2016 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5260093/ https://www.ncbi.nlm.nih.gov/pubmed/28155657 http://dx.doi.org/10.1186/s12864-016-3317-7 |
_version_ | 1782499341596884992 |
---|---|
author | Hu, Yongli Hase, Takeshi Li, Hui Peng Prabhakar, Shyam Kitano, Hiroaki Ng, See Kiong Ghosh, Samik Wee, Lawrence Jin Kiat |
author_facet | Hu, Yongli Hase, Takeshi Li, Hui Peng Prabhakar, Shyam Kitano, Hiroaki Ng, See Kiong Ghosh, Samik Wee, Lawrence Jin Kiat |
author_sort | Hu, Yongli |
collection | PubMed |
description | BACKGROUND: The ability to sequence the transcriptomes of single cells using single-cell RNA-seq technologies represents a paradigm shift: scientists can now investigate the complex biology of a heterogeneous population of cells one cell at a time. However, to date there has been no suitable computational methodology for analysing this deluge of intricate data, in particular techniques that aid the identification of the transcriptomic differences between cellular subtypes. In this paper, we describe a novel methodology for the analysis of single-cell RNA-seq data, obtained from neocortical cells and neural progenitor cells, using machine learning algorithms (Support Vector Machine (SVM) and Random Forest (RF)). RESULTS: Thirty-eight key transcripts were identified, using the SVM-based recursive feature elimination (SVM-RFE) method of feature selection, that best differentiate developing neocortical cells from neural progenitor cells in the SVM and RF classifiers built. These genes also possessed higher discriminative power (enhanced prediction accuracy) than those selected by commonly used statistical techniques or geneset-based approaches. Downstream network reconstruction analysis was carried out to uncover hidden general regulatory networks, in which novel interactions could be further validated by wet-lab experimentation and could be useful candidate targets for the treatment of neuronal developmental diseases. CONCLUSION: The novel approach reported here is able to identify transcripts, with reported neuronal involvement, that optimally differentiate neocortical cells from neural progenitor cells. It is believed to be extensible and applicable to other single-cell RNA-seq expression profiles, such as those from studies of cancer progression and treatment within a highly heterogeneous tumour. |
format | Online Article Text |
id | pubmed-5260093 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5260093 2017-01-26 A machine learning approach for the identification of key markers involved in brain development from single-cell transcriptomic data Hu, Yongli Hase, Takeshi Li, Hui Peng Prabhakar, Shyam Kitano, Hiroaki Ng, See Kiong Ghosh, Samik Wee, Lawrence Jin Kiat BMC Genomics Research BACKGROUND: The ability to sequence the transcriptomes of single cells using single-cell RNA-seq technologies represents a paradigm shift: scientists can now investigate the complex biology of a heterogeneous population of cells one cell at a time. However, to date there has been no suitable computational methodology for analysing this deluge of intricate data, in particular techniques that aid the identification of the transcriptomic differences between cellular subtypes. In this paper, we describe a novel methodology for the analysis of single-cell RNA-seq data, obtained from neocortical cells and neural progenitor cells, using machine learning algorithms (Support Vector Machine (SVM) and Random Forest (RF)). RESULTS: Thirty-eight key transcripts were identified, using the SVM-based recursive feature elimination (SVM-RFE) method of feature selection, that best differentiate developing neocortical cells from neural progenitor cells in the SVM and RF classifiers built. These genes also possessed higher discriminative power (enhanced prediction accuracy) than those selected by commonly used statistical techniques or geneset-based approaches. Downstream network reconstruction analysis was carried out to uncover hidden general regulatory networks, in which novel interactions could be further validated by wet-lab experimentation and could be useful candidate targets for the treatment of neuronal developmental diseases. CONCLUSION: The novel approach reported here is able to identify transcripts, with reported neuronal involvement, that optimally differentiate neocortical cells from neural progenitor cells. It is believed to be extensible and applicable to other single-cell RNA-seq expression profiles, such as those from studies of cancer progression and treatment within a highly heterogeneous tumour. BioMed Central 2016-12-22 /pmc/articles/PMC5260093/ /pubmed/28155657 http://dx.doi.org/10.1186/s12864-016-3317-7 Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Hu, Yongli Hase, Takeshi Li, Hui Peng Prabhakar, Shyam Kitano, Hiroaki Ng, See Kiong Ghosh, Samik Wee, Lawrence Jin Kiat A machine learning approach for the identification of key markers involved in brain development from single-cell transcriptomic data |
title | A machine learning approach for the identification of key markers involved in brain development from single-cell transcriptomic data |
title_full | A machine learning approach for the identification of key markers involved in brain development from single-cell transcriptomic data |
title_fullStr | A machine learning approach for the identification of key markers involved in brain development from single-cell transcriptomic data |
title_full_unstemmed | A machine learning approach for the identification of key markers involved in brain development from single-cell transcriptomic data |
title_short | A machine learning approach for the identification of key markers involved in brain development from single-cell transcriptomic data |
title_sort | machine learning approach for the identification of key markers involved in brain development from single-cell transcriptomic data |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5260093/ https://www.ncbi.nlm.nih.gov/pubmed/28155657 http://dx.doi.org/10.1186/s12864-016-3317-7 |
work_keys_str_mv | AT huyongli amachinelearningapproachfortheidentificationofkeymarkersinvolvedinbraindevelopmentfromsinglecelltranscriptomicdata AT hasetakeshi amachinelearningapproachfortheidentificationofkeymarkersinvolvedinbraindevelopmentfromsinglecelltranscriptomicdata AT lihuipeng amachinelearningapproachfortheidentificationofkeymarkersinvolvedinbraindevelopmentfromsinglecelltranscriptomicdata AT prabhakarshyam amachinelearningapproachfortheidentificationofkeymarkersinvolvedinbraindevelopmentfromsinglecelltranscriptomicdata AT kitanohiroaki amachinelearningapproachfortheidentificationofkeymarkersinvolvedinbraindevelopmentfromsinglecelltranscriptomicdata AT ngseekiong amachinelearningapproachfortheidentificationofkeymarkersinvolvedinbraindevelopmentfromsinglecelltranscriptomicdata AT ghoshsamik amachinelearningapproachfortheidentificationofkeymarkersinvolvedinbraindevelopmentfromsinglecelltranscriptomicdata AT weelawrencejinkiat amachinelearningapproachfortheidentificationofkeymarkersinvolvedinbraindevelopmentfromsinglecelltranscriptomicdata AT huyongli machinelearningapproachfortheidentificationofkeymarkersinvolvedinbraindevelopmentfromsinglecelltranscriptomicdata AT hasetakeshi machinelearningapproachfortheidentificationofkeymarkersinvolvedinbraindevelopmentfromsinglecelltranscriptomicdata AT lihuipeng machinelearningapproachfortheidentificationofkeymarkersinvolvedinbraindevelopmentfromsinglecelltranscriptomicdata AT prabhakarshyam machinelearningapproachfortheidentificationofkeymarkersinvolvedinbraindevelopmentfromsinglecelltranscriptomicdata AT kitanohiroaki machinelearningapproachfortheidentificationofkeymarkersinvolvedinbraindevelopmentfromsinglecelltranscriptomicdata AT ngseekiong machinelearningapproachfortheidentificationofkeymarkersinvolvedinbraindevelopmentfromsinglecelltranscriptomicdata AT ghoshsamik machinelearningapproachfortheidentificationofkeymarkersinvolvedinbraindevelopmentfromsinglecelltranscriptomicdata AT weelawrencejinkiat machinelearningapproachfortheidentificationofkeymarkersinvolvedinbraindevelopmentfromsinglecelltranscriptomicdata |