A Topological Data Analysis Approach on Predicting Phenotypes from Gene Expression Data
The goal of this study was to investigate if gene expression measured from RNA sequencing contains enough signal to separate healthy and afflicted individuals in the context of phenotype prediction. We observed that standard machine learning methods alone performed somewhat poorly on the disease phe...
Main authors: | Mandal, Sayan; Guzmán-Sáenz, Aldo; Haiminen, Niina; Basu, Saugata; Parida, Laxmi |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7197058/ http://dx.doi.org/10.1007/978-3-030-42266-0_14 |
_version_ | 1783528808686551040 |
---|---|
author | Mandal, Sayan; Guzmán-Sáenz, Aldo; Haiminen, Niina; Basu, Saugata; Parida, Laxmi |
author_sort | Mandal, Sayan |
collection | PubMed |
description | The goal of this study was to investigate if gene expression measured from RNA sequencing contains enough signal to separate healthy and afflicted individuals in the context of phenotype prediction. We observed that standard machine learning methods alone performed somewhat poorly on the disease phenotype prediction task; therefore we devised an approach augmenting machine learning with topological data analysis. We describe a framework for predicting phenotype values by utilizing gene expression data transformed into sample-specific topological signatures by employing feature subsampling and persistent homology. The topological data analysis approach developed in this work yielded improved results on Parkinson’s disease phenotype prediction when measured against standard machine learning methods. This study confirms that gene expression can be a useful indicator of the presence or absence of a condition, and the subtle signal contained in this high dimensional data reveals itself when considering the intricate topological connections between expressed genes. |
format | Online Article Text |
id | pubmed-7197058 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
record_format | MEDLINE/PubMed |
spelling | pubmed-7197058; 2020-05-04; A Topological Data Analysis Approach on Predicting Phenotypes from Gene Expression Data; Mandal, Sayan; Guzmán-Sáenz, Aldo; Haiminen, Niina; Basu, Saugata; Parida, Laxmi; Algorithms for Computational Biology; Article; 2020-02-01; /pmc/articles/PMC7197058/; http://dx.doi.org/10.1007/978-3-030-42266-0_14; Text; en; © Springer Nature Switzerland AG 2020. This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
title | A Topological Data Analysis Approach on Predicting Phenotypes from Gene Expression Data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7197058/ http://dx.doi.org/10.1007/978-3-030-42266-0_14 |
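
The description in this record mentions turning gene expression data into sample-specific topological signatures through feature subsampling and persistent homology. The record gives no implementation details, so the following is only a minimal sketch of that general idea under stated assumptions, not the authors' method. It assumes each subsampled gene becomes a one-dimensional point located at its expression value, computes 0-dimensional persistence (the scales at which connected components merge in a Vietoris-Rips filtration), and summarizes each subsample by its total persistence. The function names `h0_death_times` and `sample_signature`, and the parameters `n_genes`, `n_rounds`, and `seed`, are hypothetical.

```python
import itertools
import random


def h0_death_times(points):
    """Return the finite death times of 0-dimensional persistent homology.

    For a Vietoris-Rips filtration, a connected component dies at the length
    of the edge that merges it into another component, so the death times
    are the edge weights of a minimum spanning tree (Kruskal's algorithm).
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by Euclidean length.
    edges = sorted(
        (sum((a - b) ** 2 for a, b in zip(points[i], points[j])) ** 0.5, i, j)
        for i, j in itertools.combinations(range(n), 2)
    )

    deaths = []
    for dist, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # this edge merges two components: one of them dies
            parent[ri] = rj
            deaths.append(dist)
    return deaths


def sample_signature(expression, n_genes, n_rounds, seed=0):
    """Hypothetical per-sample signature built from repeated gene subsampling.

    `expression` holds one sample's expression values, one per gene.
    Assumption: each chosen gene is a 1-D point at its expression value,
    and each round is summarized by its total 0-dimensional persistence.
    """
    rng = random.Random(seed)
    signature = []
    for _ in range(n_rounds):
        chosen = rng.sample(range(len(expression)), n_genes)
        points = [(expression[g],) for g in chosen]
        signature.append(sum(h0_death_times(points)))
    return signature
```

A signature vector built this way could be passed to any standard classifier as the feature representation of a sample. A fuller treatment would typically use higher-dimensional homology and a dedicated persistent homology library, which this sketch leaves out.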