A diploid assembly-based benchmark for variants in the major histocompatibility complex
Most human genomes are characterized by aligning individual reads to the reference genome, but accurate long reads and linked reads now enable us to construct accurate, phased de novo assemblies. We focus on a medically important, highly variable, 5 million base-pair (bp) region where diploid assemb...
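Main Authors: | Chin, Chen-Shan, Wagner, Justin, Zeng, Qiandong, Garrison, Erik, Garg, Shilpa, Fungtammasan, Arkarachai, Rautiainen, Mikko, Aganezov, Sergey, Kirsche, Melanie, Zarate, Samantha, Schatz, Michael C., Xiao, Chunlin, Rowell, William J., Markello, Charles, Farek, Jesse, Sedlazeck, Fritz J., Bansal, Vikas, Yoo, Byunggil, Miller, Neil, Zhou, Xin, Carroll, Andrew, Barrio, Alvaro Martinez, Salit, Marc, Marschall, Tobias, Dilthey, Alexander T., Zook, Justin M. |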
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7508831/ https://www.ncbi.nlm.nih.gov/pubmed/32963235 http://dx.doi.org/10.1038/s41467-020-18564-9 |
_version_ | 1783585482761830400 |
---|---|
author | Chin, Chen-Shan Wagner, Justin Zeng, Qiandong Garrison, Erik Garg, Shilpa Fungtammasan, Arkarachai Rautiainen, Mikko Aganezov, Sergey Kirsche, Melanie Zarate, Samantha Schatz, Michael C. Xiao, Chunlin Rowell, William J. Markello, Charles Farek, Jesse Sedlazeck, Fritz J. Bansal, Vikas Yoo, Byunggil Miller, Neil Zhou, Xin Carroll, Andrew Barrio, Alvaro Martinez Salit, Marc Marschall, Tobias Dilthey, Alexander T. Zook, Justin M. |
author_facet | Chin, Chen-Shan Wagner, Justin Zeng, Qiandong Garrison, Erik Garg, Shilpa Fungtammasan, Arkarachai Rautiainen, Mikko Aganezov, Sergey Kirsche, Melanie Zarate, Samantha Schatz, Michael C. Xiao, Chunlin Rowell, William J. Markello, Charles Farek, Jesse Sedlazeck, Fritz J. Bansal, Vikas Yoo, Byunggil Miller, Neil Zhou, Xin Carroll, Andrew Barrio, Alvaro Martinez Salit, Marc Marschall, Tobias Dilthey, Alexander T. Zook, Justin M. |
author_sort | Chin, Chen-Shan |
collection | PubMed |
description | Most human genomes are characterized by aligning individual reads to the reference genome, but accurate long reads and linked reads now enable us to construct accurate, phased de novo assemblies. We focus on a medically important, highly variable, 5 million base-pair (bp) region where diploid assembly is particularly useful - the Major Histocompatibility Complex (MHC). Here, we develop a human genome benchmark derived from a diploid assembly for the openly-consented Genome in a Bottle sample HG002. We assemble a single contig for each haplotype, align them to the reference, call phased small and structural variants, and define a small variant benchmark for the MHC, covering 94% of the MHC and 22368 variants smaller than 50 bp, 49% more variants than a mapping-based benchmark. This benchmark reliably identifies errors in mapping-based callsets, and enables performance assessment in regions with much denser, complex variation than regions covered by previous benchmarks. |
format | Online Article Text |
id | pubmed-7508831 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-7508831 2020-10-08 A diploid assembly-based benchmark for variants in the major histocompatibility complex Chin, Chen-Shan Wagner, Justin Zeng, Qiandong Garrison, Erik Garg, Shilpa Fungtammasan, Arkarachai Rautiainen, Mikko Aganezov, Sergey Kirsche, Melanie Zarate, Samantha Schatz, Michael C. Xiao, Chunlin Rowell, William J. Markello, Charles Farek, Jesse Sedlazeck, Fritz J. Bansal, Vikas Yoo, Byunggil Miller, Neil Zhou, Xin Carroll, Andrew Barrio, Alvaro Martinez Salit, Marc Marschall, Tobias Dilthey, Alexander T. Zook, Justin M. Nat Commun Article Most human genomes are characterized by aligning individual reads to the reference genome, but accurate long reads and linked reads now enable us to construct accurate, phased de novo assemblies. We focus on a medically important, highly variable, 5 million base-pair (bp) region where diploid assembly is particularly useful - the Major Histocompatibility Complex (MHC). Here, we develop a human genome benchmark derived from a diploid assembly for the openly-consented Genome in a Bottle sample HG002. We assemble a single contig for each haplotype, align them to the reference, call phased small and structural variants, and define a small variant benchmark for the MHC, covering 94% of the MHC and 22368 variants smaller than 50 bp, 49% more variants than a mapping-based benchmark. This benchmark reliably identifies errors in mapping-based callsets, and enables performance assessment in regions with much denser, complex variation than regions covered by previous benchmarks. Nature Publishing Group UK 2020-09-22 /pmc/articles/PMC7508831/ /pubmed/32963235 http://dx.doi.org/10.1038/s41467-020-18564-9 Text en © This is a U.S. government work and not under copyright protection in the U.S.; foreign copyright protection may apply 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Chin, Chen-Shan Wagner, Justin Zeng, Qiandong Garrison, Erik Garg, Shilpa Fungtammasan, Arkarachai Rautiainen, Mikko Aganezov, Sergey Kirsche, Melanie Zarate, Samantha Schatz, Michael C. Xiao, Chunlin Rowell, William J. Markello, Charles Farek, Jesse Sedlazeck, Fritz J. Bansal, Vikas Yoo, Byunggil Miller, Neil Zhou, Xin Carroll, Andrew Barrio, Alvaro Martinez Salit, Marc Marschall, Tobias Dilthey, Alexander T. Zook, Justin M. A diploid assembly-based benchmark for variants in the major histocompatibility complex |
title | A diploid assembly-based benchmark for variants in the major histocompatibility complex |
title_full | A diploid assembly-based benchmark for variants in the major histocompatibility complex |
title_fullStr | A diploid assembly-based benchmark for variants in the major histocompatibility complex |
title_full_unstemmed | A diploid assembly-based benchmark for variants in the major histocompatibility complex |
title_short | A diploid assembly-based benchmark for variants in the major histocompatibility complex |
title_sort | diploid assembly-based benchmark for variants in the major histocompatibility complex |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7508831/ https://www.ncbi.nlm.nih.gov/pubmed/32963235 http://dx.doi.org/10.1038/s41467-020-18564-9 |