SuMO-Fil: Supervised multi-omic filtering prior to performing network analysis

Multi-omic analyses that integrate many high-dimensional datasets often suffer from low statistical power and require time-consuming computations. To remedy these issues, we present SuMO-Fil, a pre-processing method for Supervised...

Bibliographic Details
Main authors: Towle-Miller, Lorin M., Miecznikowski, Jeffrey C., Zhang, Fan, Tritchler, David L.
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8330944/
https://www.ncbi.nlm.nih.gov/pubmed/34343218
http://dx.doi.org/10.1371/journal.pone.0255579
author Towle-Miller, Lorin M.
Miecznikowski, Jeffrey C.
Zhang, Fan
Tritchler, David L.
collection PubMed
description Multi-omic analyses that integrate many high-dimensional datasets often suffer from low statistical power and require time-consuming computations. To remedy these issues, we present SuMO-Fil, a pre-processing method for Supervised Multi-Omic Filtering that removes variables or features considered to be irrelevant noise. SuMO-Fil is intended to be performed prior to downstream analyses that detect supervised gene networks in sparse settings. We accomplish this by implementing variable filters based on low similarity across the datasets in conjunction with low similarity with the outcome. This approach can improve accuracy and reduce run times for a variety of computationally expensive downstream analyses. The method applies in settings where the downstream analysis may include sparse canonical correlation analysis. Filtering methods specifically for cluster and network analysis are introduced and compared by simulating modular networks with known statistical properties. Compared with popular filtering techniques based on low means or low variances, SuMO-Fil performs favorably: it eliminates non-network features while retaining important biological signal across a variety of signal settings. We show that the speed and accuracy of methods such as supervised sparse canonical correlation increase after using SuMO-Fil, greatly improving the scalability of these approaches.
format Online
Article
Text
id pubmed-8330944
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8330944 2021-08-04 SuMO-Fil: Supervised multi-omic filtering prior to performing network analysis Towle-Miller, Lorin M. Miecznikowski, Jeffrey C. Zhang, Fan Tritchler, David L. PLoS One Research Article
Public Library of Science 2021-08-03 /pmc/articles/PMC8330944/ /pubmed/34343218 http://dx.doi.org/10.1371/journal.pone.0255579 Text en © 2021 Towle-Miller et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title SuMO-Fil: Supervised multi-omic filtering prior to performing network analysis
topic Research Article