
Large scale variation in the rate of germ-line de novo mutation, base composition, divergence and diversity in humans

It has long been suspected that the rate of mutation varies across the human genome at a large scale based on the divergence between humans and other species. However, it is now possible to directly investigate this question using the large number of de novo mutations (DNMs) that have been discovere...


Bibliographic Details
Main Authors: Smith, Thomas C. A., Arndt, Peter F., Eyre-Walker, Adam
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5891062/
https://www.ncbi.nlm.nih.gov/pubmed/29590096
http://dx.doi.org/10.1371/journal.pgen.1007254
_version_ 1783312959342116864
author Smith, Thomas C. A.
Arndt, Peter F.
Eyre-Walker, Adam
author_facet Smith, Thomas C. A.
Arndt, Peter F.
Eyre-Walker, Adam
author_sort Smith, Thomas C. A.
collection PubMed
description It has long been suspected that the rate of mutation varies across the human genome at a large scale based on the divergence between humans and other species. However, it is now possible to directly investigate this question using the large number of de novo mutations (DNMs) that have been discovered in humans through the sequencing of trios. We investigate a number of questions pertaining to the distribution of mutations using more than 130,000 DNMs from three large datasets. We demonstrate that the amount and pattern of variation differ between datasets at the 1MB and 100KB scales, probably as a consequence of differences in sequencing technology and processing. In particular, datasets show different patterns of correlation to genomic variables such as replication time. Nevertheless, there are many commonalities between datasets, which likely represent true patterns. We show that there is variation in the mutation rate at the 100KB, 1MB and 10MB scale that cannot be explained by variation at smaller scales; however, the level of this variation is modest at large scales: at the 1MB scale we infer that ~90% of regions have a mutation rate within 50% of the mean. Different types of mutation show similar levels of variation and appear to vary in concert, which suggests the pattern of mutation is relatively constant across the genome. We demonstrate that variation in the mutation rate does not generate large-scale variation in GC-content, and hence that mutation bias does not maintain the isochore structure of the human genome. We find that genomic features explain less than 40% of the explainable variance in the rate of DNM. As expected, the rate of divergence between species is correlated to the rate of DNM. However, the correlations are weaker than expected if all the variation in divergence was due to variation in the mutation rate. We provide evidence that this is due to the effect of biased gene conversion on the probability that a mutation will become fixed. In contrast to divergence, we find that most of the variation in diversity can be explained by variation in the mutation rate. Finally, we show that the correlation between divergence and DNM density declines as increasingly divergent species are considered.
format Online
Article
Text
id pubmed-5891062
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5891062 2018-04-20 Large scale variation in the rate of germ-line de novo mutation, base composition, divergence and diversity in humans Smith, Thomas C. A. Arndt, Peter F. Eyre-Walker, Adam PLoS Genet Research Article It has long been suspected that the rate of mutation varies across the human genome at a large scale based on the divergence between humans and other species. However, it is now possible to directly investigate this question using the large number of de novo mutations (DNMs) that have been discovered in humans through the sequencing of trios. We investigate a number of questions pertaining to the distribution of mutations using more than 130,000 DNMs from three large datasets. We demonstrate that the amount and pattern of variation differ between datasets at the 1MB and 100KB scales, probably as a consequence of differences in sequencing technology and processing. In particular, datasets show different patterns of correlation to genomic variables such as replication time. Nevertheless, there are many commonalities between datasets, which likely represent true patterns. We show that there is variation in the mutation rate at the 100KB, 1MB and 10MB scale that cannot be explained by variation at smaller scales; however, the level of this variation is modest at large scales: at the 1MB scale we infer that ~90% of regions have a mutation rate within 50% of the mean. Different types of mutation show similar levels of variation and appear to vary in concert, which suggests the pattern of mutation is relatively constant across the genome. We demonstrate that variation in the mutation rate does not generate large-scale variation in GC-content, and hence that mutation bias does not maintain the isochore structure of the human genome. We find that genomic features explain less than 40% of the explainable variance in the rate of DNM. As expected, the rate of divergence between species is correlated to the rate of DNM. However, the correlations are weaker than expected if all the variation in divergence was due to variation in the mutation rate. We provide evidence that this is due to the effect of biased gene conversion on the probability that a mutation will become fixed. In contrast to divergence, we find that most of the variation in diversity can be explained by variation in the mutation rate. Finally, we show that the correlation between divergence and DNM density declines as increasingly divergent species are considered. Public Library of Science 2018-03-28 /pmc/articles/PMC5891062/ /pubmed/29590096 http://dx.doi.org/10.1371/journal.pgen.1007254 Text en © 2018 Smith et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Smith, Thomas C. A.
Arndt, Peter F.
Eyre-Walker, Adam
Large scale variation in the rate of germ-line de novo mutation, base composition, divergence and diversity in humans
title Large scale variation in the rate of germ-line de novo mutation, base composition, divergence and diversity in humans
title_full Large scale variation in the rate of germ-line de novo mutation, base composition, divergence and diversity in humans
title_fullStr Large scale variation in the rate of germ-line de novo mutation, base composition, divergence and diversity in humans
title_full_unstemmed Large scale variation in the rate of germ-line de novo mutation, base composition, divergence and diversity in humans
title_short Large scale variation in the rate of germ-line de novo mutation, base composition, divergence and diversity in humans
title_sort large scale variation in the rate of germ-line de novo mutation, base composition, divergence and diversity in humans
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5891062/
https://www.ncbi.nlm.nih.gov/pubmed/29590096
http://dx.doi.org/10.1371/journal.pgen.1007254
work_keys_str_mv AT smiththomasca largescalevariationintherateofgermlinedenovomutationbasecompositiondivergenceanddiversityinhumans
AT arndtpeterf largescalevariationintherateofgermlinedenovomutationbasecompositiondivergenceanddiversityinhumans
AT eyrewalkeradam largescalevariationintherateofgermlinedenovomutationbasecompositiondivergenceanddiversityinhumans