Multiple haplotype-resolved genomes reveal population patterns of gene and protein diplotypes

To fully understand human biology and link genotype to phenotype, the phase of DNA variants must be known. Here we present a comprehensive analysis of haplotype-resolved genomes to assess the nature and variation of haplotypes and their pairs, diplotypes, in European population samples. We use a set...

Bibliographic Details
Main authors: Hoehe, Margret R., Church, George M., Lehrach, Hans, Kroslak, Thomas, Palczewski, Stefanie, Nowick, Katja, Schulz, Sabrina, Suk, Eun-Kyung, Huebsch, Thomas
Format: Online Article Text
Language: English
Published: Nature Pub. Group 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4263165/
https://www.ncbi.nlm.nih.gov/pubmed/25424553
http://dx.doi.org/10.1038/ncomms6569
author Hoehe, Margret R.
Church, George M.
Lehrach, Hans
Kroslak, Thomas
Palczewski, Stefanie
Nowick, Katja
Schulz, Sabrina
Suk, Eun-Kyung
Huebsch, Thomas
collection PubMed
description To fully understand human biology and link genotype to phenotype, the phase of DNA variants must be known. Here we present a comprehensive analysis of haplotype-resolved genomes to assess the nature and variation of haplotypes and their pairs, diplotypes, in European population samples. We use a set of 14 haplotype-resolved genomes generated by fosmid clone-based sequencing, complemented and expanded by up to 372 statistically resolved genomes from the 1000 Genomes Project. We find immense diversity of both haploid and diploid gene forms, up to 4.1 and 3.9 million corresponding to 249 and 235 per gene on average. Less than 15% of autosomal genes have a predominant form. We describe a ‘common diplotypic proteome’, a set of 4,269 genes encoding two different proteins in over 30% of genomes. We show moreover an abundance of cis configurations of mutations in the 386 genomes with an average cis/trans ratio of 60:40, and distinguishable classes of cis- versus trans-abundant genes. This work identifies key features characterizing the diplotypic nature of human genomes and provides a conceptual and analytical framework, rich resources and novel hypotheses on the functional importance of diploidy.
format Online
Article
Text
id pubmed-4263165
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Nature Pub. Group
record_format MEDLINE/PubMed
spelling pubmed-4263165 2014-12-16 Multiple haplotype-resolved genomes reveal population patterns of gene and protein diplotypes Hoehe, Margret R. Church, George M. Lehrach, Hans Kroslak, Thomas Palczewski, Stefanie Nowick, Katja Schulz, Sabrina Suk, Eun-Kyung Huebsch, Thomas Nat Commun Article To fully understand human biology and link genotype to phenotype, the phase of DNA variants must be known. Here we present a comprehensive analysis of haplotype-resolved genomes to assess the nature and variation of haplotypes and their pairs, diplotypes, in European population samples. We use a set of 14 haplotype-resolved genomes generated by fosmid clone-based sequencing, complemented and expanded by up to 372 statistically resolved genomes from the 1000 Genomes Project. We find immense diversity of both haploid and diploid gene forms, up to 4.1 and 3.9 million corresponding to 249 and 235 per gene on average. Less than 15% of autosomal genes have a predominant form. We describe a ‘common diplotypic proteome’, a set of 4,269 genes encoding two different proteins in over 30% of genomes. We show moreover an abundance of cis configurations of mutations in the 386 genomes with an average cis/trans ratio of 60:40, and distinguishable classes of cis- versus trans-abundant genes. This work identifies key features characterizing the diplotypic nature of human genomes and provides a conceptual and analytical framework, rich resources and novel hypotheses on the functional importance of diploidy. Nature Pub. Group 2014-11-26 /pmc/articles/PMC4263165/ /pubmed/25424553 http://dx.doi.org/10.1038/ncomms6569 Text en Copyright © 2014, Nature Publishing Group, a division of Macmillan Publishers Limited. All Rights Reserved. http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
title Multiple haplotype-resolved genomes reveal population patterns of gene and protein diplotypes
topic Article