Sequences of 95 human MHC haplotypes reveal extreme coding variation in genes other than highly polymorphic HLA class I and II

The most polymorphic part of the human genome, the MHC, encodes over 160 proteins of diverse function. Half of them, including the HLA class I and II genes, are directly involved in immune responses. Consequently, the MHC region strongly associates with numerous diseases and clinical therapies. Noto...

Bibliographic Details
Main Authors: Norman, Paul J., Norberg, Steven J., Guethlein, Lisbeth A., Nemat-Gorgani, Neda, Royce, Thomas, Wroblewski, Emily E., Dunn, Tamsen, Mann, Tobias, Alicata, Claudia, Hollenbach, Jill A., Chang, Weihua, Shults Won, Melissa, Gunderson, Kevin L., Abi-Rached, Laurent, Ronaghi, Mostafa, Parham, Peter
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5411776/
https://www.ncbi.nlm.nih.gov/pubmed/28360230
http://dx.doi.org/10.1101/gr.213538.116
_version_ 1783232864054149120
author Norman, Paul J.
Norberg, Steven J.
Guethlein, Lisbeth A.
Nemat-Gorgani, Neda
Royce, Thomas
Wroblewski, Emily E.
Dunn, Tamsen
Mann, Tobias
Alicata, Claudia
Hollenbach, Jill A.
Chang, Weihua
Shults Won, Melissa
Gunderson, Kevin L.
Abi-Rached, Laurent
Ronaghi, Mostafa
Parham, Peter
author_facet Norman, Paul J.
Norberg, Steven J.
Guethlein, Lisbeth A.
Nemat-Gorgani, Neda
Royce, Thomas
Wroblewski, Emily E.
Dunn, Tamsen
Mann, Tobias
Alicata, Claudia
Hollenbach, Jill A.
Chang, Weihua
Shults Won, Melissa
Gunderson, Kevin L.
Abi-Rached, Laurent
Ronaghi, Mostafa
Parham, Peter
author_sort Norman, Paul J.
collection PubMed
description The most polymorphic part of the human genome, the MHC, encodes over 160 proteins of diverse function. Half of them, including the HLA class I and II genes, are directly involved in immune responses. Consequently, the MHC region strongly associates with numerous diseases and clinical therapies. Notoriously, the MHC region has been intractable to high-throughput analysis at complete sequence resolution, and current reference haplotypes are inadequate for large-scale studies. To address these challenges, we developed a method that specifically captures and sequences the 4.8-Mbp MHC region from genomic DNA. For 95 MHC homozygous cell lines we assembled, de novo, a set of high-fidelity contigs and a sequence scaffold, representing a mean 98% of the target region. Included are six alternative MHC reference sequences of the human genome that we completed and refined. Characterization of the sequence and structural diversity of the MHC region shows the approach accurately determines the sequences of the highly polymorphic HLA class I and HLA class II genes and the complex structural diversity of complement factor C4A/C4B. It has also uncovered extensive and unexpected diversity in other MHC genes; an example is MUC22, which encodes a lung mucin and exhibits more coding sequence alleles than any HLA class I or II gene studied here. More than 60% of the coding sequence alleles analyzed were previously uncharacterized. We have created a substantial database of robust reference MHC haplotype sequences that will enable future population scale studies of this complicated and clinically important region of the human genome.
format Online
Article
Text
id pubmed-5411776
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-54117762017-11-01 Sequences of 95 human MHC haplotypes reveal extreme coding variation in genes other than highly polymorphic HLA class I and II Norman, Paul J. Norberg, Steven J. Guethlein, Lisbeth A. Nemat-Gorgani, Neda Royce, Thomas Wroblewski, Emily E. Dunn, Tamsen Mann, Tobias Alicata, Claudia Hollenbach, Jill A. Chang, Weihua Shults Won, Melissa Gunderson, Kevin L. Abi-Rached, Laurent Ronaghi, Mostafa Parham, Peter Genome Res Method The most polymorphic part of the human genome, the MHC, encodes over 160 proteins of diverse function. Half of them, including the HLA class I and II genes, are directly involved in immune responses. Consequently, the MHC region strongly associates with numerous diseases and clinical therapies. Notoriously, the MHC region has been intractable to high-throughput analysis at complete sequence resolution, and current reference haplotypes are inadequate for large-scale studies. To address these challenges, we developed a method that specifically captures and sequences the 4.8-Mbp MHC region from genomic DNA. For 95 MHC homozygous cell lines we assembled, de novo, a set of high-fidelity contigs and a sequence scaffold, representing a mean 98% of the target region. Included are six alternative MHC reference sequences of the human genome that we completed and refined. Characterization of the sequence and structural diversity of the MHC region shows the approach accurately determines the sequences of the highly polymorphic HLA class I and HLA class II genes and the complex structural diversity of complement factor C4A/C4B. It has also uncovered extensive and unexpected diversity in other MHC genes; an example is MUC22, which encodes a lung mucin and exhibits more coding sequence alleles than any HLA class I or II gene studied here. More than 60% of the coding sequence alleles analyzed were previously uncharacterized. 
We have created a substantial database of robust reference MHC haplotype sequences that will enable future population scale studies of this complicated and clinically important region of the human genome. Cold Spring Harbor Laboratory Press 2017-05 /pmc/articles/PMC5411776/ /pubmed/28360230 http://dx.doi.org/10.1101/gr.213538.116 Text en © 2017 Norman et al.; Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by-nc/4.0/ This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
spellingShingle Method
Norman, Paul J.
Norberg, Steven J.
Guethlein, Lisbeth A.
Nemat-Gorgani, Neda
Royce, Thomas
Wroblewski, Emily E.
Dunn, Tamsen
Mann, Tobias
Alicata, Claudia
Hollenbach, Jill A.
Chang, Weihua
Shults Won, Melissa
Gunderson, Kevin L.
Abi-Rached, Laurent
Ronaghi, Mostafa
Parham, Peter
Sequences of 95 human MHC haplotypes reveal extreme coding variation in genes other than highly polymorphic HLA class I and II
title Sequences of 95 human MHC haplotypes reveal extreme coding variation in genes other than highly polymorphic HLA class I and II
title_full Sequences of 95 human MHC haplotypes reveal extreme coding variation in genes other than highly polymorphic HLA class I and II
title_fullStr Sequences of 95 human MHC haplotypes reveal extreme coding variation in genes other than highly polymorphic HLA class I and II
title_full_unstemmed Sequences of 95 human MHC haplotypes reveal extreme coding variation in genes other than highly polymorphic HLA class I and II
title_short Sequences of 95 human MHC haplotypes reveal extreme coding variation in genes other than highly polymorphic HLA class I and II
title_sort sequences of 95 human mhc haplotypes reveal extreme coding variation in genes other than highly polymorphic hla class i and ii
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5411776/
https://www.ncbi.nlm.nih.gov/pubmed/28360230
http://dx.doi.org/10.1101/gr.213538.116