
An unsupervised deep learning framework for predicting human essential genes from population and functional genomic data

BACKGROUND: The ability to accurately predict essential genes intolerant to loss-of-function (LOF) mutations can dramatically improve the identification of disease-associated genes. Recently, there have been numerous computational methods developed to predict human essential genes from population ge...


Bibliographic Details
Main authors: LaPolice, Troy M., Huang, Yi-Fei
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10506225/
https://www.ncbi.nlm.nih.gov/pubmed/37723435
http://dx.doi.org/10.1186/s12859-023-05481-z
author LaPolice, Troy M.
Huang, Yi-Fei
collection PubMed
description BACKGROUND: The ability to accurately predict essential genes intolerant to loss-of-function (LOF) mutations can dramatically improve the identification of disease-associated genes. Recently, there have been numerous computational methods developed to predict human essential genes from population genomic data. While the existing methods are highly predictive of essential genes of long length, they have limited power in pinpointing short essential genes due to the sparsity of polymorphisms in the human genome. RESULTS: Motivated by the premise that population and functional genomic data may provide complementary evidence for gene essentiality, here we present an evolution-based deep learning model, DeepLOF, to predict essential genes in an unsupervised manner. Unlike previous population genetic methods, DeepLOF utilizes a novel deep learning framework to integrate both population and functional genomic data, allowing us to pinpoint short essential genes that can hardly be predicted from population genomic data alone. Compared with previous methods, DeepLOF shows unmatched performance in predicting ClinGen haploinsufficient genes, mouse essential genes, and essential genes in human cell lines. Notably, at a false positive rate of 5%, DeepLOF detects 50% more ClinGen haploinsufficient genes than previous methods. Furthermore, DeepLOF discovers 109 novel essential genes that are too short to be identified by previous methods. CONCLUSION: The predictive power of DeepLOF shows that it is a compelling computational method to aid in the discovery of essential genes. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-023-05481-z.
format Online
Article
Text
id pubmed-10506225
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-10506225 2023-09-19 BMC Bioinformatics Research
BioMed Central 2023-09-18 /pmc/articles/PMC10506225/ /pubmed/37723435 http://dx.doi.org/10.1186/s12859-023-05481-z Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title An unsupervised deep learning framework for predicting human essential genes from population and functional genomic data
topic Research