A network approach for low dimensional signatures from high throughput data
One of the main objectives of high-throughput genomics studies is to obtain a low-dimensional set of observables—a signature—for sample classification purposes (diagnosis, prognosis, stratification). Biological data, such as gene or protein expression, are commonly characterized by an up/down regula...
Main Authors: | Curti, Nico; Levi, Giuseppe; Giampieri, Enrico; Castellani, Gastone; Remondini, Daniel |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9789141/ https://www.ncbi.nlm.nih.gov/pubmed/36564421 http://dx.doi.org/10.1038/s41598-022-25549-9 |
_version_ | 1784858911347572736 |
---|---|
author | Curti, Nico Levi, Giuseppe Giampieri, Enrico Castellani, Gastone Remondini, Daniel |
author_facet | Curti, Nico Levi, Giuseppe Giampieri, Enrico Castellani, Gastone Remondini, Daniel |
author_sort | Curti, Nico |
collection | PubMed |
description | One of the main objectives of high-throughput genomics studies is to obtain a low-dimensional set of observables, known as a signature, for sample classification purposes (diagnosis, prognosis, stratification). Biological data, such as gene or protein expression, are commonly characterized by an up/down regulation behavior, for which discriminant-based methods can perform with high accuracy and easy interpretability. To get the most out of these methods, feature selection is even more critical, but it is known to be an NP-hard problem, so most feature selection approaches focus on one feature at a time (k-best, Sequential Feature Selection, recursive feature elimination). We propose DNetPRO, Discriminant Analysis with Network PROcessing, a supervised network-based signature identification method. This method implements a network-based heuristic to generate one or more signatures out of the best-performing feature pairs. The algorithm is easily scalable, allowing efficient computation for a high number of observables ([Formula: see text]–[Formula: see text]). We show applications on real high-throughput genomic datasets in which our method outperforms existing results, or is compatible with them while using fewer selected features. Moreover, the geometrical simplicity of the resulting class-separation surfaces allows a clearer interpretation of the obtained signatures in comparison to nonlinear classification models. |
format | Online Article Text |
id | pubmed-9789141 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-9789141 2022-12-25 A network approach for low dimensional signatures from high throughput data Curti, Nico Levi, Giuseppe Giampieri, Enrico Castellani, Gastone Remondini, Daniel Sci Rep Article One of the main objectives of high-throughput genomics studies is to obtain a low-dimensional set of observables, known as a signature, for sample classification purposes (diagnosis, prognosis, stratification). Biological data, such as gene or protein expression, are commonly characterized by an up/down regulation behavior, for which discriminant-based methods can perform with high accuracy and easy interpretability. To get the most out of these methods, feature selection is even more critical, but it is known to be an NP-hard problem, so most feature selection approaches focus on one feature at a time (k-best, Sequential Feature Selection, recursive feature elimination). We propose DNetPRO, Discriminant Analysis with Network PROcessing, a supervised network-based signature identification method. This method implements a network-based heuristic to generate one or more signatures out of the best-performing feature pairs. The algorithm is easily scalable, allowing efficient computation for a high number of observables ([Formula: see text]–[Formula: see text]). We show applications on real high-throughput genomic datasets in which our method outperforms existing results, or is compatible with them while using fewer selected features. Moreover, the geometrical simplicity of the resulting class-separation surfaces allows a clearer interpretation of the obtained signatures in comparison to nonlinear classification models. 
Nature Publishing Group UK 2022-12-23 /pmc/articles/PMC9789141/ /pubmed/36564421 http://dx.doi.org/10.1038/s41598-022-25549-9 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Curti, Nico Levi, Giuseppe Giampieri, Enrico Castellani, Gastone Remondini, Daniel A network approach for low dimensional signatures from high throughput data |
title | A network approach for low dimensional signatures from high throughput data |
title_full | A network approach for low dimensional signatures from high throughput data |
title_fullStr | A network approach for low dimensional signatures from high throughput data |
title_full_unstemmed | A network approach for low dimensional signatures from high throughput data |
title_short | A network approach for low dimensional signatures from high throughput data |
title_sort | network approach for low dimensional signatures from high throughput data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9789141/ https://www.ncbi.nlm.nih.gov/pubmed/36564421 http://dx.doi.org/10.1038/s41598-022-25549-9 |
work_keys_str_mv | AT curtinico anetworkapproachforlowdimensionalsignaturesfromhighthroughputdata AT levigiuseppe anetworkapproachforlowdimensionalsignaturesfromhighthroughputdata AT giampierienrico anetworkapproachforlowdimensionalsignaturesfromhighthroughputdata AT castellanigastone anetworkapproachforlowdimensionalsignaturesfromhighthroughputdata AT remondinidaniel anetworkapproachforlowdimensionalsignaturesfromhighthroughputdata AT curtinico networkapproachforlowdimensionalsignaturesfromhighthroughputdata AT levigiuseppe networkapproachforlowdimensionalsignaturesfromhighthroughputdata AT giampierienrico networkapproachforlowdimensionalsignaturesfromhighthroughputdata AT castellanigastone networkapproachforlowdimensionalsignaturesfromhighthroughputdata AT remondinidaniel networkapproachforlowdimensionalsignaturesfromhighthroughputdata |
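
The description field above outlines a pairwise, network-based selection idea: score feature pairs with a discriminant rule, keep the best-performing pairs, and read signatures off the resulting network. The following is only a minimal sketch of that general idea, not the authors' DNetPRO implementation. Everything in it is an assumption made for illustration: the function names (`pair_accuracy`, `signatures`), the `top_k` cutoff, the use of a nearest-centroid rule as a stand-in for the discriminant analysis, scoring on training accuracy rather than cross-validation, and the choice to treat connected components as signatures.

```python
from itertools import combinations


def centroid(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def pair_accuracy(X, y, i, j):
    """Training accuracy of a nearest-centroid rule on the feature pair (i, j).

    Assumption: this simple rule stands in for the discriminant analysis
    mentioned in the abstract; it is not the published scoring procedure.
    """
    classes = sorted(set(y))
    cents = {
        c: centroid([(row[i], row[j]) for row, lab in zip(X, y) if lab == c])
        for c in classes
    }
    hits = 0
    for row, lab in zip(X, y):
        p = (row[i], row[j])
        # Ties between classes resolve to the first class in sorted order.
        pred = min(
            classes,
            key=lambda c: (p[0] - cents[c][0]) ** 2 + (p[1] - cents[c][1]) ** 2,
        )
        hits += pred == lab
    return hits / len(y)


def signatures(X, y, top_k):
    """Connected components of the graph whose edges are the top_k best pairs.

    Assumption: top_k is a hypothetical cutoff, and taking connected
    components as signatures is an illustrative reading of the abstract.
    """
    n_features = len(X[0])
    scored = sorted(
        (
            (pair_accuracy(X, y, i, j), i, j)
            for i, j in combinations(range(n_features), 2)
        ),
        key=lambda t: (-t[0], t[1], t[2]),  # best score first, then by index
    )
    edges = [(i, j) for _, i, j in scored[:top_k]]

    parent = {}

    def find(a):
        # Union-find lookup with path halving.
        parent.setdefault(a, a)
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for i, j in edges:
        parent[find(i)] = find(j)

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: (-len(g), g))


# Toy data (invented for illustration): features 0 and 1 separate the two
# classes, while features 2 and 3 carry no class information.
X_toy = [
    [0, 0, 1, 5], [1, 0, 5, 1], [0, 1, 3, 3],
    [5, 5, 1, 5], [6, 5, 5, 1], [5, 6, 3, 3],
]
y_toy = [0, 0, 0, 1, 1, 1]
print(signatures(X_toy, y_toy, top_k=1))
```

Scoring every pair costs a number of evaluations quadratic in the feature count, so a real run over thousands of observables would need the pair loop parallelised or vectorised; the sketch leaves that out to keep the control flow readable.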