Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly

The human reference genome assembly plays a central role in nearly all aspects of today's basic and clinical research. GRCh38 is the first coordinate-changing assembly update since 2009; it reflects the resolution of roughly 1000 issues and encompasses modifications ranging from thousands of si...

Bibliographic Details
Main authors: Schneider, Valerie A., Graves-Lindsay, Tina, Howe, Kerstin, Bouk, Nathan, Chen, Hsiu-Chuan, Kitts, Paul A., Murphy, Terence D., Pruitt, Kim D., Thibaud-Nissen, Françoise, Albracht, Derek, Fulton, Robert S., Kremitzki, Milinn, Magrini, Vincent, Markovic, Chris, McGrath, Sean, Steinberg, Karyn Meltz, Auger, Kate, Chow, William, Collins, Joanna, Harden, Glenn, Hubbard, Timothy, Pelan, Sarah, Simpson, Jared T., Threadgold, Glen, Torrance, James, Wood, Jonathan M., Clarke, Laura, Koren, Sergey, Boitano, Matthew, Peluso, Paul, Li, Heng, Chin, Chen-Shan, Phillippy, Adam M., Durbin, Richard, Wilson, Richard K., Flicek, Paul, Eichler, Evan E., Church, Deanna M.
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5411779/
https://www.ncbi.nlm.nih.gov/pubmed/28396521
http://dx.doi.org/10.1101/gr.213611.116
collection PubMed
description The human reference genome assembly plays a central role in nearly all aspects of today's basic and clinical research. GRCh38 is the first coordinate-changing assembly update since 2009; it reflects the resolution of roughly 1000 issues and encompasses modifications ranging from thousands of single base changes to megabase-scale path reorganizations, gap closures, and localization of previously orphaned sequences. We developed a new approach to sequence generation for targeted base updates and used data from new genome mapping technologies and single haplotype resources to identify and resolve larger assembly issues. For the first time, the reference assembly contains sequence-based representations for the centromeres. We also expanded the number of alternate loci to create a reference that provides a more robust representation of human population variation. We demonstrate that the updates render the reference an improved annotation substrate, alter read alignments in unchanged regions, and impact variant interpretation at clinically relevant loci. We additionally evaluated a collection of new de novo long-read haploid assemblies and conclude that although the new assemblies compare favorably to the reference with respect to continuity, error rate, and gene completeness, the reference still provides the best representation for complex genomic regions and coding sequences. We assert that the collected updates in GRCh38 make the newer assembly a more robust substrate for comprehensive analyses that will promote our understanding of human biology and advance our efforts to improve health.
id pubmed-5411779
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-5411779 2017-05-16 Genome Res Resource Cold Spring Harbor Laboratory Press 2017-05 /pmc/articles/PMC5411779/ /pubmed/28396521 http://dx.doi.org/10.1101/gr.213611.116 Text en © 2017 Schneider et al.; Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by/4.0/ This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.
title Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly
topic Resource