The origin, evolution, and functional impact of short insertion–deletion variants identified in 179 human genomes
Short insertions and deletions (indels) are the second most abundant form of human genetic variation, but our understanding of their origins and functional effects lags behind that of other types of variants. Using population-scale sequencing, we have identified a high-quality set of 1.6 million ind...
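Main authors: | Montgomery, Stephen B.; Goode, David L.; Kvikstad, Erika; Albers, Cornelis A.; Zhang, Zhengdong D.; Mu, Xinmeng Jasmine; Ananda, Guruprasad; Howie, Bryan; Karczewski, Konrad J.; Smith, Kevin S.; Anaya, Vanessa; Richardson, Rhea; Davis, Joe; MacArthur, Daniel G.; Sidow, Arend; Duret, Laurent; Gerstein, Mark; Makova, Kateryna D.; Marchini, Jonathan; McVean, Gil; Lunter, Gerton |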
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory Press 2013 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3638132/ https://www.ncbi.nlm.nih.gov/pubmed/23478400 http://dx.doi.org/10.1101/gr.148718.112 |
_version_ | 1782475799365943296 |
---|---|
author | Montgomery, Stephen B. Goode, David L. Kvikstad, Erika Albers, Cornelis A. Zhang, Zhengdong D. Mu, Xinmeng Jasmine Ananda, Guruprasad Howie, Bryan Karczewski, Konrad J. Smith, Kevin S. Anaya, Vanessa Richardson, Rhea Davis, Joe MacArthur, Daniel G. Sidow, Arend Duret, Laurent Gerstein, Mark Makova, Kateryna D. Marchini, Jonathan McVean, Gil Lunter, Gerton |
author_sort | Montgomery, Stephen B. |
collection | PubMed |
description | Short insertions and deletions (indels) are the second most abundant form of human genetic variation, but our understanding of their origins and functional effects lags behind that of other types of variants. Using population-scale sequencing, we have identified a high-quality set of 1.6 million indels from 179 individuals representing three diverse human populations. We show that rates of indel mutagenesis are highly heterogeneous, with 43%–48% of indels occurring in 4.03% of the genome, whereas in the remaining 96% their prevalence is 16 times lower than SNPs. Polymerase slippage can explain upwards of three-fourths of all indels, with the remainder being mostly simple deletions in complex sequence. However, insertions do occur and are significantly associated with pseudo-palindromic sequence features compatible with the fork stalling and template switching (FoSTeS) mechanism more commonly associated with large structural variations. We introduce a quantitative model of polymerase slippage, which enables us to identify indel-hypermutagenic protein-coding genes, some of which are associated with recurrent mutations leading to disease. Accounting for mutational rate heterogeneity due to sequence context, we find that indels across functional sequence are generally subject to stronger purifying selection than SNPs. We find that indel length modulates selection strength, and that indels affecting multiple functionally constrained nucleotides undergo stronger purifying selection. We further find that indels are enriched in associations with gene expression and find evidence for a contribution of nonsense-mediated decay. Finally, we show that indels can be integrated in existing genome-wide association studies (GWAS); although we do not find direct evidence that potentially causal protein-coding indels are enriched with associations to known disease-associated SNPs, our findings suggest that the causal variant underlying some of these associations may be indels. |
format | Online Article Text |
id | pubmed-3638132 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Cold Spring Harbor Laboratory Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-36381322013-05-04 The origin, evolution, and functional impact of short insertion–deletion variants identified in 179 human genomes Montgomery, Stephen B. Goode, David L. Kvikstad, Erika Albers, Cornelis A. Zhang, Zhengdong D. Mu, Xinmeng Jasmine Ananda, Guruprasad Howie, Bryan Karczewski, Konrad J. Smith, Kevin S. Anaya, Vanessa Richardson, Rhea Davis, Joe MacArthur, Daniel G. Sidow, Arend Duret, Laurent Gerstein, Mark Makova, Kateryna D. Marchini, Jonathan McVean, Gil Lunter, Gerton Genome Res Research Short insertions and deletions (indels) are the second most abundant form of human genetic variation, but our understanding of their origins and functional effects lags behind that of other types of variants. Using population-scale sequencing, we have identified a high-quality set of 1.6 million indels from 179 individuals representing three diverse human populations. We show that rates of indel mutagenesis are highly heterogeneous, with 43%–48% of indels occurring in 4.03% of the genome, whereas in the remaining 96% their prevalence is 16 times lower than SNPs. Polymerase slippage can explain upwards of three-fourths of all indels, with the remainder being mostly simple deletions in complex sequence. However, insertions do occur and are significantly associated with pseudo-palindromic sequence features compatible with the fork stalling and template switching (FoSTeS) mechanism more commonly associated with large structural variations. We introduce a quantitative model of polymerase slippage, which enables us to identify indel-hypermutagenic protein-coding genes, some of which are associated with recurrent mutations leading to disease. Accounting for mutational rate heterogeneity due to sequence context, we find that indels across functional sequence are generally subject to stronger purifying selection than SNPs. 
We find that indel length modulates selection strength, and that indels affecting multiple functionally constrained nucleotides undergo stronger purifying selection. We further find that indels are enriched in associations with gene expression and find evidence for a contribution of nonsense-mediated decay. Finally, we show that indels can be integrated in existing genome-wide association studies (GWAS); although we do not find direct evidence that potentially causal protein-coding indels are enriched with associations to known disease-associated SNPs, our findings suggest that the causal variant underlying some of these associations may be indels. Cold Spring Harbor Laboratory Press 2013-05 /pmc/articles/PMC3638132/ /pubmed/23478400 http://dx.doi.org/10.1101/gr.148718.112 Text en © 2013, Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by-nc/3.0/ This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported License), as described at http://creativecommons.org/licenses/by-nc/3.0/. |
title | The origin, evolution, and functional impact of short insertion–deletion variants identified in 179 human genomes |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3638132/ https://www.ncbi.nlm.nih.gov/pubmed/23478400 http://dx.doi.org/10.1101/gr.148718.112 |