A network approach for low dimensional signatures from high throughput data

One of the main objectives of high-throughput genomics studies is to obtain a low-dimensional set of observables—a signature—for sample classification purposes (diagnosis, prognosis, stratification). Biological data, such as gene or protein expression, are commonly characterized by an up/down regula...

Bibliographic Details
Main Authors: Curti, Nico; Levi, Giuseppe; Giampieri, Enrico; Castellani, Gastone; Remondini, Daniel
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2022
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9789141/
https://www.ncbi.nlm.nih.gov/pubmed/36564421
http://dx.doi.org/10.1038/s41598-022-25549-9
author Curti, Nico
Levi, Giuseppe
Giampieri, Enrico
Castellani, Gastone
Remondini, Daniel
collection PubMed
description One of the main objectives of high-throughput genomics studies is to obtain a low-dimensional set of observables, called a signature, for sample classification purposes (diagnosis, prognosis, stratification). Biological data, such as gene or protein expression, are commonly characterized by an up/down regulation behavior, for which discriminant-based methods can perform with high accuracy and easy interpretability. To get the most out of these methods, feature selection is even more critical, but it is known to be an NP-hard problem, and thus most feature selection approaches focus on one feature at a time (k-best, sequential feature selection, recursive feature elimination). We propose DNetPRO, Discriminant Analysis with Network PROcessing, a supervised network-based signature identification method. This method implements a network-based heuristic to generate one or more signatures out of the best-performing feature pairs. The algorithm is easily scalable, allowing efficient computation for a large number of observables ([Formula: see text]–[Formula: see text]). We show applications on real high-throughput genomic datasets in which our method outperforms existing results, or is compatible with them but with a smaller number of selected features. Moreover, the geometrical simplicity of the resulting class-separation surfaces allows a clearer interpretation of the obtained signatures in comparison to nonlinear classification models.
format Online
Article
Text
id pubmed-9789141
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-9789141 2022-12-25 A network approach for low dimensional signatures from high throughput data Curti, Nico; Levi, Giuseppe; Giampieri, Enrico; Castellani, Gastone; Remondini, Daniel Sci Rep Article
Nature Publishing Group UK 2022-12-23 /pmc/articles/PMC9789141/ /pubmed/36564421 http://dx.doi.org/10.1038/s41598-022-25549-9 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title A network approach for low dimensional signatures from high throughput data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9789141/
https://www.ncbi.nlm.nih.gov/pubmed/36564421
http://dx.doi.org/10.1038/s41598-022-25549-9