Haploinsufficiency predictions without study bias

Any given human individual carries multiple genetic variants that disrupt protein-coding genes, through structural variation, as well as nucleotide variants and indels. Predicting the phenotypic consequences of a gene disruption remains a significant challenge. Current approaches employ information...

Bibliographic Details
Main authors: Steinberg, Julia; Honti, Frantisek; Meader, Stephen; Webber, Caleb
Format: Online Article Text
Language: English
Published: Oxford University Press 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4551909/
https://www.ncbi.nlm.nih.gov/pubmed/26001969
http://dx.doi.org/10.1093/nar/gkv474
author Steinberg, Julia
Honti, Frantisek
Meader, Stephen
Webber, Caleb
author_sort Steinberg, Julia
collection PubMed
description Any given human individual carries multiple genetic variants that disrupt protein-coding genes, through structural variation, as well as nucleotide variants and indels. Predicting the phenotypic consequences of a gene disruption remains a significant challenge. Current approaches employ information from a range of biological networks to predict which human genes are haploinsufficient (meaning two copies are required for normal function) or essential (meaning at least one copy is required for viability). Using recently available study gene sets, we show that these approaches are strongly biased towards providing accurate predictions for well-studied genes. By contrast, we derive a haploinsufficiency score from a combination of unbiased large-scale high-throughput datasets, including gene co-expression and genetic variation in over 6000 human exomes. Our approach provides a haploinsufficiency prediction for over twice as many genes currently unassociated with papers listed in PubMed as three commonly-used approaches, and outperforms these approaches for predicting haploinsufficiency for less-studied genes. We also show that fine-tuning the predictor on a set of well-studied ‘gold standard’ haploinsufficient genes does not improve the prediction for less-studied genes. This new score can readily be used to prioritize gene disruptions resulting from any genetic variant, including copy number variants, indels and single-nucleotide variants.
format Online
Article
Text
id pubmed-4551909
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4551909 2015-08-28 Haploinsufficiency predictions without study bias Steinberg, Julia Honti, Frantisek Meader, Stephen Webber, Caleb Nucleic Acids Res Methods Online Oxford University Press 2015-09-03 2015-05-22 /pmc/articles/PMC4551909/ /pubmed/26001969 http://dx.doi.org/10.1093/nar/gkv474 Text en © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Haploinsufficiency predictions without study bias
topic Methods Online