
Relationship between gene regulation network structure and prediction accuracy in high dimensional regression

The least absolute shrinkage and selection operator (lasso) and principal component regression (PCR) are popular methods of estimating traits from high-dimensional omics data, such as transcriptomes. The prediction accuracy of these estimation methods is highly dependent on the covariance structure,...


Bibliographic Details
Main authors: Okinaga, Yuichi; Kyogoku, Daisuke; Kondo, Satoshi; Nagano, Atsushi J.; Hirose, Kei
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8169869/
https://www.ncbi.nlm.nih.gov/pubmed/34075095
http://dx.doi.org/10.1038/s41598-021-90791-6
_version_ 1783702114547007488
author Okinaga, Yuichi
Kyogoku, Daisuke
Kondo, Satoshi
Nagano, Atsushi J.
Hirose, Kei
author_facet Okinaga, Yuichi
Kyogoku, Daisuke
Kondo, Satoshi
Nagano, Atsushi J.
Hirose, Kei
author_sort Okinaga, Yuichi
collection PubMed
description The least absolute shrinkage and selection operator (lasso) and principal component regression (PCR) are popular methods of estimating traits from high-dimensional omics data, such as transcriptomes. The prediction accuracy of these estimation methods is highly dependent on the covariance structure, which is characterized by gene regulation networks. However, the manner in which the structure of a gene regulation network together with the sample size affects prediction accuracy has not yet been sufficiently investigated. In this study, Monte Carlo simulations are conducted to investigate the prediction accuracy for several network structures under various sample sizes. When the gene regulation network is a random graph, a sufficiently large number of observations are required to ensure good prediction accuracy with the lasso. The PCR provided poor prediction accuracy regardless of the sample size. However, a real gene regulation network is likely to exhibit a scale-free structure. In such cases, the simulation indicates that a relatively small number of observations, such as [Formula: see text] , is sufficient to allow the accurate prediction of traits from a transcriptome with the lasso.
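The kind of Monte Carlo comparison the description outlines can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the graph generators, the way a covariance matrix is derived from a network (a diagonally dominant precision matrix, rescaled to unit variances), the sparse coefficient vector, and the values of `p`, `n`, `lam`, and `k` are all assumptions chosen for the example, and every function name is hypothetical.

```python
import numpy as np

def random_graph(p, n_edges, rng):
    """Adjacency matrix of a random graph with a fixed number of edges (assumed setup)."""
    adj = np.zeros((p, p))
    rows, cols = np.triu_indices(p, k=1)
    pick = rng.choice(len(rows), size=n_edges, replace=False)
    adj[rows[pick], cols[pick]] = 1.0
    return adj + adj.T

def scale_free_graph(p, rng):
    """Preferential attachment: each new node links to one existing node,
    chosen with probability proportional to its current degree."""
    adj = np.zeros((p, p))
    adj[0, 1] = adj[1, 0] = 1.0
    for new in range(2, p):
        deg = adj[:new, :new].sum(axis=1)
        target = rng.choice(new, p=deg / deg.sum())
        adj[new, target] = adj[target, new] = 1.0
    return adj

def covariance_from_graph(adj, weight=0.3):
    """Assumed construction: diagonally dominant precision matrix on the graph,
    inverted and rescaled so that every variable has unit variance."""
    p = adj.shape[0]
    omega = weight * adj + (weight * adj.sum(axis=1).max() + 0.5) * np.eye(p)
    sigma = np.linalg.inv(omega)
    sigma = (sigma + sigma.T) / 2.0
    d = np.sqrt(np.diag(sigma))
    return sigma / np.outer(d, d)

def lasso_cd(X, y, lam, n_iter=200):
    """Lasso by cyclic coordinate descent on (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.copy()
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * beta[j]          # remove feature j from the fit
            rho = X[:, j] @ resid / n
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid -= X[:, j] * beta[j]          # add the updated feature back
    return beta

def pcr(X, y, k):
    """Principal component regression: least squares on the first k components.
    No centering is done because the simulated data have zero mean."""
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    V = vt[:k].T
    gamma, *_ = np.linalg.lstsq(X @ V, y, rcond=None)
    return V @ gamma

def one_run(adj, n, rng, n_test=500, lam=0.1, k=5):
    """One Monte Carlo replicate: returns test MSE for (lasso, PCR)."""
    p = adj.shape[0]
    sigma = covariance_from_graph(adj)
    beta = np.zeros(p)
    beta[:5] = 1.0                              # assumed sparse true coefficients
    X = rng.multivariate_normal(np.zeros(p), sigma, size=n + n_test)
    y = X @ beta + rng.normal(size=n + n_test)
    Xtr, ytr, Xte, yte = X[:n], y[:n], X[n:], y[n:]
    mse = lambda b: float(np.mean((yte - Xte @ b) ** 2))
    return mse(lasso_cd(Xtr, ytr, lam)), mse(pcr(Xtr, ytr, k))

rng = np.random.default_rng(0)
p, n, reps = 30, 60, 3
results = {}
for name, make in [("random", lambda: random_graph(p, p - 1, rng)),
                   ("scale-free", lambda: scale_free_graph(p, rng))]:
    runs = np.array([one_run(make(), n, rng) for _ in range(reps)])
    results[name] = runs.mean(axis=0)           # mean test MSE: [lasso, PCR]
    print(name, "lasso MSE:", results[name][0], "PCR MSE:", results[name][1])
```

The sizes here are far smaller than a transcriptome-scale study and only three replicates are averaged, so the printed numbers illustrate the workflow rather than reproduce the article's findings. Varying `n` and the graph generator is the natural way to explore the sample-size question the description raises.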
format Online
Article
Text
id pubmed-8169869
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8169869 2021-06-03 Relationship between gene regulation network structure and prediction accuracy in high dimensional regression Okinaga, Yuichi Kyogoku, Daisuke Kondo, Satoshi Nagano, Atsushi J. Hirose, Kei Sci Rep Article The least absolute shrinkage and selection operator (lasso) and principal component regression (PCR) are popular methods of estimating traits from high-dimensional omics data, such as transcriptomes. The prediction accuracy of these estimation methods is highly dependent on the covariance structure, which is characterized by gene regulation networks. However, the manner in which the structure of a gene regulation network together with the sample size affects prediction accuracy has not yet been sufficiently investigated. In this study, Monte Carlo simulations are conducted to investigate the prediction accuracy for several network structures under various sample sizes. When the gene regulation network is a random graph, a sufficiently large number of observations are required to ensure good prediction accuracy with the lasso. The PCR provided poor prediction accuracy regardless of the sample size. However, a real gene regulation network is likely to exhibit a scale-free structure. In such cases, the simulation indicates that a relatively small number of observations, such as [Formula: see text] , is sufficient to allow the accurate prediction of traits from a transcriptome with the lasso.
Nature Publishing Group UK 2021-06-01 /pmc/articles/PMC8169869/ /pubmed/34075095 http://dx.doi.org/10.1038/s41598-021-90791-6 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Okinaga, Yuichi
Kyogoku, Daisuke
Kondo, Satoshi
Nagano, Atsushi J.
Hirose, Kei
Relationship between gene regulation network structure and prediction accuracy in high dimensional regression
title Relationship between gene regulation network structure and prediction accuracy in high dimensional regression
title_full Relationship between gene regulation network structure and prediction accuracy in high dimensional regression
title_fullStr Relationship between gene regulation network structure and prediction accuracy in high dimensional regression
title_full_unstemmed Relationship between gene regulation network structure and prediction accuracy in high dimensional regression
title_short Relationship between gene regulation network structure and prediction accuracy in high dimensional regression
title_sort relationship between gene regulation network structure and prediction accuracy in high dimensional regression
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8169869/
https://www.ncbi.nlm.nih.gov/pubmed/34075095
http://dx.doi.org/10.1038/s41598-021-90791-6
work_keys_str_mv AT okinagayuichi relationshipbetweengeneregulationnetworkstructureandpredictionaccuracyinhighdimensionalregression
AT kyogokudaisuke relationshipbetweengeneregulationnetworkstructureandpredictionaccuracyinhighdimensionalregression
AT kondosatoshi relationshipbetweengeneregulationnetworkstructureandpredictionaccuracyinhighdimensionalregression
AT naganoatsushij relationshipbetweengeneregulationnetworkstructureandpredictionaccuracyinhighdimensionalregression
AT hirosekei relationshipbetweengeneregulationnetworkstructureandpredictionaccuracyinhighdimensionalregression