A diploid assembly-based benchmark for variants in the major histocompatibility complex

Most human genomes are characterized by aligning individual reads to the reference genome, but accurate long reads and linked reads now enable us to construct accurate, phased de novo assemblies. We focus on a medically important, highly variable, 5 million base-pair (bp) region where diploid assemb...

Bibliographic Details
Main authors: Chin, Chen-Shan, Wagner, Justin, Zeng, Qiandong, Garrison, Erik, Garg, Shilpa, Fungtammasan, Arkarachai, Rautiainen, Mikko, Aganezov, Sergey, Kirsche, Melanie, Zarate, Samantha, Schatz, Michael C., Xiao, Chunlin, Rowell, William J., Markello, Charles, Farek, Jesse, Sedlazeck, Fritz J., Bansal, Vikas, Yoo, Byunggil, Miller, Neil, Zhou, Xin, Carroll, Andrew, Barrio, Alvaro Martinez, Salit, Marc, Marschall, Tobias, Dilthey, Alexander T., Zook, Justin M.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7508831/
https://www.ncbi.nlm.nih.gov/pubmed/32963235
http://dx.doi.org/10.1038/s41467-020-18564-9
author Chin, Chen-Shan
Wagner, Justin
Zeng, Qiandong
Garrison, Erik
Garg, Shilpa
Fungtammasan, Arkarachai
Rautiainen, Mikko
Aganezov, Sergey
Kirsche, Melanie
Zarate, Samantha
Schatz, Michael C.
Xiao, Chunlin
Rowell, William J.
Markello, Charles
Farek, Jesse
Sedlazeck, Fritz J.
Bansal, Vikas
Yoo, Byunggil
Miller, Neil
Zhou, Xin
Carroll, Andrew
Barrio, Alvaro Martinez
Salit, Marc
Marschall, Tobias
Dilthey, Alexander T.
Zook, Justin M.
author_sort Chin, Chen-Shan
collection PubMed
description Most human genomes are characterized by aligning individual reads to the reference genome, but accurate long reads and linked reads now enable us to construct accurate, phased de novo assemblies. We focus on a medically important, highly variable, 5 million base-pair (bp) region where diploid assembly is particularly useful - the Major Histocompatibility Complex (MHC). Here, we develop a human genome benchmark derived from a diploid assembly for the openly-consented Genome in a Bottle sample HG002. We assemble a single contig for each haplotype, align them to the reference, call phased small and structural variants, and define a small variant benchmark for the MHC, covering 94% of the MHC and 22368 variants smaller than 50 bp, 49% more variants than a mapping-based benchmark. This benchmark reliably identifies errors in mapping-based callsets, and enables performance assessment in regions with much denser, complex variation than regions covered by previous benchmarks.
format Online
Article
Text
id pubmed-7508831
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-75088312020-10-08 A diploid assembly-based benchmark for variants in the major histocompatibility complex Chin, Chen-Shan Wagner, Justin Zeng, Qiandong Garrison, Erik Garg, Shilpa Fungtammasan, Arkarachai Rautiainen, Mikko Aganezov, Sergey Kirsche, Melanie Zarate, Samantha Schatz, Michael C. Xiao, Chunlin Rowell, William J. Markello, Charles Farek, Jesse Sedlazeck, Fritz J. Bansal, Vikas Yoo, Byunggil Miller, Neil Zhou, Xin Carroll, Andrew Barrio, Alvaro Martinez Salit, Marc Marschall, Tobias Dilthey, Alexander T. Zook, Justin M. Nat Commun Article Most human genomes are characterized by aligning individual reads to the reference genome, but accurate long reads and linked reads now enable us to construct accurate, phased de novo assemblies. We focus on a medically important, highly variable, 5 million base-pair (bp) region where diploid assembly is particularly useful - the Major Histocompatibility Complex (MHC). Here, we develop a human genome benchmark derived from a diploid assembly for the openly-consented Genome in a Bottle sample HG002. We assemble a single contig for each haplotype, align them to the reference, call phased small and structural variants, and define a small variant benchmark for the MHC, covering 94% of the MHC and 22368 variants smaller than 50 bp, 49% more variants than a mapping-based benchmark. This benchmark reliably identifies errors in mapping-based callsets, and enables performance assessment in regions with much denser, complex variation than regions covered by previous benchmarks. Nature Publishing Group UK 2020-09-22 /pmc/articles/PMC7508831/ /pubmed/32963235 http://dx.doi.org/10.1038/s41467-020-18564-9 Text en © This is a U.S. government work and not under copyright protection in the U.S.; foreign copyright protection may apply 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title A diploid assembly-based benchmark for variants in the major histocompatibility complex
title_sort diploid assembly-based benchmark for variants in the major histocompatibility complex
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7508831/
https://www.ncbi.nlm.nih.gov/pubmed/32963235
http://dx.doi.org/10.1038/s41467-020-18564-9