A component overlapping attribute clustering (COAC) algorithm for single-cell RNA sequencing data analysis and potential pathobiological implications

Recent advances in next-generation sequencing and computational technologies have enabled routine analysis of large-scale single-cell ribonucleic acid sequencing (scRNA-seq) data. However, scRNA-seq technologies have suffered from several technical challenges, including low mean expression levels in...

Bibliographic Details
Main authors: Peng, He; Zeng, Xiangxiang; Zhou, Yadi; Zhang, Defu; Nussinov, Ruth; Cheng, Feixiong
Format: Online Article Text
Language: English
Published: Public Library of Science 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6396937/
https://www.ncbi.nlm.nih.gov/pubmed/30779739
http://dx.doi.org/10.1371/journal.pcbi.1006772
_version_ 1783399346096570368
author Peng, He
Zeng, Xiangxiang
Zhou, Yadi
Zhang, Defu
Nussinov, Ruth
Cheng, Feixiong
collection PubMed
description Recent advances in next-generation sequencing and computational technologies have enabled routine analysis of large-scale single-cell ribonucleic acid sequencing (scRNA-seq) data. However, scRNA-seq technologies have suffered from several technical challenges, including low mean expression levels in most genes and higher frequencies of missing data than bulk population sequencing technologies. Identifying functional gene sets and their regulatory networks that link specific cell types to human diseases and therapeutics from scRNA-seq profiles are daunting tasks. In this study, we developed a Component Overlapping Attribute Clustering (COAC) algorithm to perform the localized (cell subpopulation) gene co-expression network analysis from large-scale scRNA-seq profiles. Gene subnetworks that represent specific gene co-expression patterns are inferred from the components of a decomposed matrix of scRNA-seq profiles. We showed that single-cell gene subnetworks identified by COAC from multiple time points within cell phases can be used for cell type identification with high accuracy (83%). In addition, COAC-inferred subnetworks from melanoma patients’ scRNA-seq profiles are highly correlated with survival rate from The Cancer Genome Atlas (TCGA). Moreover, the localized gene subnetworks identified by COAC from individual patients’ scRNA-seq data can be used as pharmacogenomics biomarkers to predict drug responses (The area under the receiver operating characteristic curves ranges from 0.728 to 0.783) in cancer cell lines from the Genomics of Drug Sensitivity in Cancer (GDSC) database. In summary, COAC offers a powerful tool to identify potential network-based diagnostic and pharmacogenomics biomarkers from large-scale scRNA-seq profiles. COAC is freely available at https://github.com/ChengF-Lab/COAC.
format Online
Article
Text
id pubmed-6396937
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6396937 2019-03-09 PLoS Comput Biol Research Article Public Library of Science 2019-02-19 /pmc/articles/PMC6396937/ /pubmed/30779739 http://dx.doi.org/10.1371/journal.pcbi.1006772 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 (https://creativecommons.org/publicdomain/zero/1.0/) public domain dedication.
title A component overlapping attribute clustering (COAC) algorithm for single-cell RNA sequencing data analysis and potential pathobiological implications
topic Research Article