
Amino Acid Changes in Disease-Associated Variants Differ Radically from Variants Observed in the 1000 Genomes Project Dataset

The 1000 Genomes Project data provides a natural background dataset for amino acid germline mutations in humans. Since the direction of mutation is known, the amino acid exchange matrix generated from the observed nucleotide variants is asymmetric and the mutabilities of the different amino acids ar...


Bibliographic Details
Main authors: de Beer, Tjaart A. P., Laskowski, Roman A., Parks, Sarah L., Sipos, Botond, Goldman, Nick, Thornton, Janet M.
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3861039/
https://www.ncbi.nlm.nih.gov/pubmed/24348229
http://dx.doi.org/10.1371/journal.pcbi.1003382
author de Beer, Tjaart A. P.
Laskowski, Roman A.
Parks, Sarah L.
Sipos, Botond
Goldman, Nick
Thornton, Janet M.
collection PubMed
description The 1000 Genomes Project data provides a natural background dataset for amino acid germline mutations in humans. Since the direction of mutation is known, the amino acid exchange matrix generated from the observed nucleotide variants is asymmetric and the mutabilities of the different amino acids are very different. These differences predominantly reflect preferences for nucleotide mutations in the DNA (especially the high mutation rate of the CpG dinucleotide, which makes arginine mutability very much higher than other amino acids) rather than selection imposed by protein structure constraints, although there is evidence for the latter as well. The variants occur predominantly on the surface of proteins (82%), with a slight preference for sites which are more exposed and less well conserved than random. Mutations to functional residues occur about half as often as expected by chance. The disease-associated amino acid variant distributions in OMIM are radically different from those expected on the basis of the 1000 Genomes dataset. The disease-associated variants preferentially occur in more conserved sites, compared to 1000 Genomes mutations. Many of the amino acid exchange profiles appear to exhibit an anti-correlation, with common exchanges in one dataset being rare in the other. Disease-associated variants exhibit more extreme differences in amino acid size and hydrophobicity. More modelling of the mutational processes at the nucleotide level is needed, but these observations should contribute to an improved prediction of the effects of specific variants in humans.
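The asymmetry described in the abstract can be illustrated with a minimal sketch. It assumes each variant is available as an (ancestral, derived) amino acid pair and defines mutability as the number of changes away from a residue divided by that residue's background count. The function names, the toy variants, and the background counts are all hypothetical and are not taken from the paper, whose own definitions may differ.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def exchange_matrix(variants):
    """Count directed exchanges: matrix[a][b] = times ancestral a became derived b."""
    matrix = {a: {b: 0 for b in AMINO_ACIDS} for a in AMINO_ACIDS}
    for ancestral, derived in variants:
        if ancestral != derived:  # synonymous pairs carry no exchange
            matrix[ancestral][derived] += 1
    return matrix


def mutability(matrix, background_counts):
    """Changes away from each residue per background occurrence (assumed definition)."""
    return {
        a: sum(matrix[a].values()) / background_counts[a]
        for a in AMINO_ACIDS
        if background_counts.get(a, 0) > 0
    }


# Hypothetical toy data, for illustration only.
toy_variants = [("R", "H"), ("R", "C"), ("R", "Q"), ("H", "R"), ("A", "T"), ("R", "H")]
toy_background = Counter({"R": 50, "H": 25, "A": 80, "T": 60})

m = exchange_matrix(toy_variants)
mu = mutability(m, toy_background)

# Because direction is tracked, R->H and H->R are counted separately.
print(m["R"]["H"], m["H"]["R"])
print(sorted(mu.items(), key=lambda kv: kv[1], reverse=True))
```

Since the direction of each change is kept, the matrix need not equal its transpose. In this toy input arginine has the highest mutability, which is chosen only to echo the CpG effect the abstract reports.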
format Online
Article
Text
id pubmed-3861039
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3861039 2013-12-17 Amino Acid Changes in Disease-Associated Variants Differ Radically from Variants Observed in the 1000 Genomes Project Dataset de Beer, Tjaart A. P. Laskowski, Roman A. Parks, Sarah L. Sipos, Botond Goldman, Nick Thornton, Janet M. PLoS Comput Biol Research Article
Public Library of Science 2013-12-12 /pmc/articles/PMC3861039/ /pubmed/24348229 http://dx.doi.org/10.1371/journal.pcbi.1003382 Text en © 2013 de Beer et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Amino Acid Changes in Disease-Associated Variants Differ Radically from Variants Observed in the 1000 Genomes Project Dataset
topic Research Article