The origin, evolution, and functional impact of short insertion–deletion variants identified in 179 human genomes

Short insertions and deletions (indels) are the second most abundant form of human genetic variation, but our understanding of their origins and functional effects lags behind that of other types of variants. Using population-scale sequencing, we have identified a high-quality set of 1.6 million ind...

Bibliographic Details
Main Authors: Montgomery, Stephen B., Goode, David L., Kvikstad, Erika, Albers, Cornelis A., Zhang, Zhengdong D., Mu, Xinmeng Jasmine, Ananda, Guruprasad, Howie, Bryan, Karczewski, Konrad J., Smith, Kevin S., Anaya, Vanessa, Richardson, Rhea, Davis, Joe, MacArthur, Daniel G., Sidow, Arend, Duret, Laurent, Gerstein, Mark, Makova, Kateryna D., Marchini, Jonathan, McVean, Gil, Lunter, Gerton
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3638132/
https://www.ncbi.nlm.nih.gov/pubmed/23478400
http://dx.doi.org/10.1101/gr.148718.112
author Montgomery, Stephen B.
Goode, David L.
Kvikstad, Erika
Albers, Cornelis A.
Zhang, Zhengdong D.
Mu, Xinmeng Jasmine
Ananda, Guruprasad
Howie, Bryan
Karczewski, Konrad J.
Smith, Kevin S.
Anaya, Vanessa
Richardson, Rhea
Davis, Joe
MacArthur, Daniel G.
Sidow, Arend
Duret, Laurent
Gerstein, Mark
Makova, Kateryna D.
Marchini, Jonathan
McVean, Gil
Lunter, Gerton
author_sort Montgomery, Stephen B.
collection PubMed
description Short insertions and deletions (indels) are the second most abundant form of human genetic variation, but our understanding of their origins and functional effects lags behind that of other types of variants. Using population-scale sequencing, we have identified a high-quality set of 1.6 million indels from 179 individuals representing three diverse human populations. We show that rates of indel mutagenesis are highly heterogeneous, with 43%–48% of indels occurring in 4.03% of the genome, whereas in the remaining 96% their prevalence is 16 times lower than SNPs. Polymerase slippage can explain upwards of three-fourths of all indels, with the remainder being mostly simple deletions in complex sequence. However, insertions do occur and are significantly associated with pseudo-palindromic sequence features compatible with the fork stalling and template switching (FoSTeS) mechanism more commonly associated with large structural variations. We introduce a quantitative model of polymerase slippage, which enables us to identify indel-hypermutagenic protein-coding genes, some of which are associated with recurrent mutations leading to disease. Accounting for mutational rate heterogeneity due to sequence context, we find that indels across functional sequence are generally subject to stronger purifying selection than SNPs. We find that indel length modulates selection strength, and that indels affecting multiple functionally constrained nucleotides undergo stronger purifying selection. We further find that indels are enriched in associations with gene expression and find evidence for a contribution of nonsense-mediated decay. 
Finally, we show that indels can be integrated in existing genome-wide association studies (GWAS); although we do not find direct evidence that potentially causal protein-coding indels are enriched with associations to known disease-associated SNPs, our findings suggest that the causal variant underlying some of these associations may be indels.
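The heterogeneity figures quoted in the description can be turned into a rough density comparison. The sketch below is an illustration only, not the authors' method: it assumes that "43%–48% of indels occurring in 4.03% of the genome" can be read as simple fractions of indel counts and of genome length, and the function name `density_fold` is hypothetical.

```python
def density_fold(indel_fraction: float, genome_fraction: float) -> float:
    """Ratio of indel density inside a hotspot fraction of the genome
    to indel density in the rest of the genome.

    Assumption (not from the source): both arguments are plain fractions,
    i.e. share of all indels and share of total genome length.
    """
    inside = indel_fraction / genome_fraction
    outside = (1.0 - indel_fraction) / (1.0 - genome_fraction)
    return inside / outside


# Figures taken from the abstract: 43%-48% of indels in 4.03% of the genome.
GENOME_FRACTION = 0.0403

for indel_fraction in (0.43, 0.48):
    fold = density_fold(indel_fraction, GENOME_FRACTION)
    print(f"{indel_fraction:.0%} of indels in {GENOME_FRACTION:.2%} "
          f"of the genome: about {fold:.1f}x the density elsewhere")
```

Under these assumptions, the quoted range corresponds to an indel density in the hotspot regions roughly 18 to 22 times that of the remaining genome.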
format Online
Article
Text
id pubmed-3638132
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-3638132 2013-05-04 Genome Res Research Cold Spring Harbor Laboratory Press 2013-05 /pmc/articles/PMC3638132/ /pubmed/23478400 http://dx.doi.org/10.1101/gr.148718.112 Text en © 2013, Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by-nc/3.0/ This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported License), as described at http://creativecommons.org/licenses/by-nc/3.0/.
title The origin, evolution, and functional impact of short insertion–deletion variants identified in 179 human genomes
topic Research