Haploinsufficiency predictions without study bias
Any given human individual carries multiple genetic variants that disrupt protein-coding genes, through structural variation, as well as nucleotide variants and indels. Predicting the phenotypic consequences of a gene disruption remains a significant challenge. Current approaches employ information...
Main Authors: | Steinberg, Julia; Honti, Frantisek; Meader, Stephen; Webber, Caleb |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2015 |
Subjects: | Methods Online |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4551909/ https://www.ncbi.nlm.nih.gov/pubmed/26001969 http://dx.doi.org/10.1093/nar/gkv474 |
_version_ | 1782387644139831296 |
---|---|
author | Steinberg, Julia; Honti, Frantisek; Meader, Stephen; Webber, Caleb |
author_sort | Steinberg, Julia |
collection | PubMed |
description | Any given human individual carries multiple genetic variants that disrupt protein-coding genes, through structural variation, as well as nucleotide variants and indels. Predicting the phenotypic consequences of a gene disruption remains a significant challenge. Current approaches employ information from a range of biological networks to predict which human genes are haploinsufficient (meaning two copies are required for normal function) or essential (meaning at least one copy is required for viability). Using recently available study gene sets, we show that these approaches are strongly biased towards providing accurate predictions for well-studied genes. By contrast, we derive a haploinsufficiency score from a combination of unbiased large-scale high-throughput datasets, including gene co-expression and genetic variation in over 6000 human exomes. Our approach provides a haploinsufficiency prediction for over twice as many genes currently unassociated with papers listed in PubMed as three commonly used approaches, and outperforms these approaches for predicting haploinsufficiency for less-studied genes. We also show that fine-tuning the predictor on a set of well-studied ‘gold standard’ haploinsufficient genes does not improve the prediction for less-studied genes. This new score can readily be used to prioritize gene disruptions resulting from any genetic variant, including copy number variants, indels and single-nucleotide variants. |
format | Online Article Text |
id | pubmed-4551909 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-45519092015-08-28 Haploinsufficiency predictions without study bias Steinberg, Julia Honti, Frantisek Meader, Stephen Webber, Caleb Nucleic Acids Res Methods Online Any given human individual carries multiple genetic variants that disrupt protein-coding genes, through structural variation, as well as nucleotide variants and indels. Predicting the phenotypic consequences of a gene disruption remains a significant challenge. Current approaches employ information from a range of biological networks to predict which human genes are haploinsufficient (meaning two copies are required for normal function) or essential (meaning at least one copy is required for viability). Using recently available study gene sets, we show that these approaches are strongly biased towards providing accurate predictions for well-studied genes. By contrast, we derive a haploinsufficiency score from a combination of unbiased large-scale high-throughput datasets, including gene co-expression and genetic variation in over 6000 human exomes. Our approach provides a haploinsufficiency prediction for over twice as many genes currently unassociated with papers listed in Pubmed as three commonly-used approaches, and outperforms these approaches for predicting haploinsufficiency for less-studied genes. We also show that fine-tuning the predictor on a set of well-studied ‘gold standard’ haploinsufficient genes does not improve the prediction for less-studied genes. This new score can readily be used to prioritize gene disruptions resulting from any genetic variant, including copy number variants, indels and single-nucleotide variants. Oxford University Press 2015-09-03 2015-05-22 /pmc/articles/PMC4551909/ /pubmed/26001969 http://dx.doi.org/10.1093/nar/gkv474 Text en © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research. 
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methods Online Steinberg, Julia Honti, Frantisek Meader, Stephen Webber, Caleb Haploinsufficiency predictions without study bias |
title | Haploinsufficiency predictions without study bias |
title_sort | haploinsufficiency predictions without study bias |
topic | Methods Online |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4551909/ https://www.ncbi.nlm.nih.gov/pubmed/26001969 http://dx.doi.org/10.1093/nar/gkv474 |