
Decoding the effects of synonymous variants

Synonymous single nucleotide variants (sSNVs) are common in the human genome but are often overlooked. However, sSNVs can have significant biological impact and may lead to disease. Existing computational methods for evaluating the effect of sSNVs suffer from the lack of gold-standard training/evalu...


Bibliographic Details
Main authors:	Zeng, Zishuo, Aptekmann, Ariel A, Bromberg, Yana
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2021
Subjects:	Computational Biology
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8682775/ https://www.ncbi.nlm.nih.gov/pubmed/34850938 http://dx.doi.org/10.1093/nar/gkab1159

author	Zeng, Zishuo Aptekmann, Ariel A Bromberg, Yana
collection	PubMed
description	Synonymous single nucleotide variants (sSNVs) are common in the human genome but are often overlooked. However, sSNVs can have significant biological impact and may lead to disease. Existing computational methods for evaluating the effect of sSNVs suffer from the lack of gold-standard training/evaluation data and exhibit over-reliance on sequence conservation signals. We developed synVep (synonymous Variant effect predictor), a machine learning-based method that overcomes both of these limitations. Our training data was a combination of variants reported by gnomAD (observed) and those unreported, but possible in the human genome (generated). We used positive-unlabeled learning to purify the generated variant set of any likely unobservable variants. We then trained two sequential extreme gradient boosting models to identify subsets of the remaining variants putatively enriched and depleted in effect. Our method attained 90% precision/recall on a previously unseen set of variants. Furthermore, although synVep does not explicitly use conservation, its scores correlated with evolutionary distances between orthologs in cross-species variation analysis. synVep was also able to differentiate pathogenic vs. benign variants, as well as splice-site disrupting variants (SDV) vs. non-SDVs. Thus, synVep provides an important improvement in annotation of sSNVs, allowing users to focus on variants that most likely harbor effects.
format	Online Article Text
id	pubmed-8682775
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-8682775 2021-12-20 Decoding the effects of synonymous variants Zeng, Zishuo Aptekmann, Ariel A Bromberg, Yana Nucleic Acids Res Computational Biology Oxford University Press 2021-11-30 /pmc/articles/PMC8682775/ /pubmed/34850938 http://dx.doi.org/10.1093/nar/gkab1159 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of Nucleic Acids Research. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Decoding the effects of synonymous variants
topic	Computational Biology
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8682775/ https://www.ncbi.nlm.nih.gov/pubmed/34850938 http://dx.doi.org/10.1093/nar/gkab1159
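
The abstract in this record mentions using positive-unlabeled learning to purify the generated variant set of likely unobservable variants. The following is only a minimal sketch of that general idea on toy data, not the authors' implementation: it swaps in a one-feature logistic regression for whatever model synVep uses, and the function names (`fit_logistic`, `purify_generated`), the single numeric feature, the 5% quantile cutoff, and the synthetic data are all assumptions made for illustration. The two sequential gradient boosting models described in the abstract are not sketched here.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=200):
    """Fit a one-feature logistic regression by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def purify_generated(observed, generated, quantile=0.05):
    """Split the unlabeled (generated) set into kept and removed items.

    Hypothetical procedure: observed items are labeled positive, generated
    items are treated as unlabeled (label 0), and generated items scoring
    below a low quantile of the observed scores are dropped.
    """
    xs = observed + generated
    ys = [1] * len(observed) + [0] * len(generated)
    w, b = fit_logistic(xs, ys)
    # Threshold taken from the known positives (assumed cutoff: 5% quantile).
    obs_scores = sorted(sigmoid(w * x + b) for x in observed)
    threshold = obs_scores[int(quantile * len(obs_scores))]
    kept = [x for x in generated if sigmoid(w * x + b) >= threshold]
    removed = [x for x in generated if sigmoid(w * x + b) < threshold]
    return kept, removed


# Toy data (entirely synthetic): observed items cluster around +2; the
# generated set mixes observed-like items with an unobservable-like
# cluster around -2.
rng = random.Random(0)
observed = [rng.gauss(2.0, 1.0) for _ in range(200)]
generated = ([rng.gauss(2.0, 1.0) for _ in range(100)]
             + [rng.gauss(-2.0, 1.0) for _ in range(100)])

kept, removed = purify_generated(observed, generated)
print(len(kept), len(removed))
```

In this toy setup the removed items are those that look least like anything observed; a real pipeline would use many features per variant and a stronger classifier.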
