Assessing structural variation in a personal genome—towards a human reference diploid genome

BACKGROUND: Characterizing large genomic variants is essential to expanding the research and clinical applications of genome sequencing. While multiple data types and methods are available to detect these structural variants (SVs), they remain less characterized than smaller variants because of SV d...

Bibliographic Details
Main Authors: English, Adam C, Salerno, William J, Hampton, Oliver A, Gonzaga-Jauregui, Claudia, Ambreth, Shruthi, Ritter, Deborah I, Beck, Christine R, Davis, Caleb F, Dahdouli, Mahmoud, Ma, Singer, Carroll, Andrew, Veeraraghavan, Narayanan, Bruestle, Jeremy, Drees, Becky, Hastie, Alex, Lam, Ernest T, White, Simon, Mishra, Pamela, Wang, Min, Han, Yi, Zhang, Feng, Stankiewicz, Pawel, Wheeler, David A, Reid, Jeffrey G, Muzny, Donna M, Rogers, Jeffrey, Sabo, Aniko, Worley, Kim C, Lupski, James R, Boerwinkle, Eric, Gibbs, Richard A
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4490614/
https://www.ncbi.nlm.nih.gov/pubmed/25886820
http://dx.doi.org/10.1186/s12864-015-1479-3
_version_ 1782379539393937408
author English, Adam C
Salerno, William J
Hampton, Oliver A
Gonzaga-Jauregui, Claudia
Ambreth, Shruthi
Ritter, Deborah I
Beck, Christine R
Davis, Caleb F
Dahdouli, Mahmoud
Ma, Singer
Carroll, Andrew
Veeraraghavan, Narayanan
Bruestle, Jeremy
Drees, Becky
Hastie, Alex
Lam, Ernest T
White, Simon
Mishra, Pamela
Wang, Min
Han, Yi
Zhang, Feng
Stankiewicz, Pawel
Wheeler, David A
Reid, Jeffrey G
Muzny, Donna M
Rogers, Jeffrey
Sabo, Aniko
Worley, Kim C
Lupski, James R
Boerwinkle, Eric
Gibbs, Richard A
author_sort English, Adam C
collection PubMed
description BACKGROUND: Characterizing large genomic variants is essential to expanding the research and clinical applications of genome sequencing. While multiple data types and methods are available to detect these structural variants (SVs), they remain less characterized than smaller variants because of SV diversity, complexity, and size. These challenges are exacerbated by the experimental and computational demands of SV analysis. Here, we characterize the SV content of a personal genome with Parliament, a publicly available consensus SV-calling infrastructure that merges multiple data types and SV detection methods. RESULTS: We demonstrate Parliament’s efficacy via integrated analyses of data from whole-genome array comparative genomic hybridization, short-read next-generation sequencing, long-read (Pacific BioSciences RSII), long-insert (Illumina Nextera), and whole-genome architecture (BioNano Irys) data from the personal genome of a single subject (HS1011). From this genome, Parliament identified 31,007 genomic loci between 100 bp and 1 Mbp that are inconsistent with the hg19 reference assembly. Of these loci, 9,777 are supported as putative SVs by hybrid local assembly, long-read PacBio data, or multi-source heuristics. These SVs span 59 Mbp of the reference genome (1.8%) and include 3,801 events identified only with long-read data. The HS1011 data and complete Parliament infrastructure, including a BAM-to-SV workflow, are available on the cloud-based service DNAnexus. CONCLUSIONS: HS1011 SV analysis reveals the limits and advantages of multiple sequencing technologies, specifically the impact of long-read SV discovery. With the full Parliament infrastructure, the HS1011 data constitute a public resource for novel SV discovery, software calibration, and personal genome structural variation analysis. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-015-1479-3) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4490614
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-44906142015-07-04 Assessing structural variation in a personal genome—towards a human reference diploid genome English, Adam C Salerno, William J Hampton, Oliver A Gonzaga-Jauregui, Claudia Ambreth, Shruthi Ritter, Deborah I Beck, Christine R Davis, Caleb F Dahdouli, Mahmoud Ma, Singer Carroll, Andrew Veeraraghavan, Narayanan Bruestle, Jeremy Drees, Becky Hastie, Alex Lam, Ernest T White, Simon Mishra, Pamela Wang, Min Han, Yi Zhang, Feng Stankiewicz, Pawel Wheeler, David A Reid, Jeffrey G Muzny, Donna M Rogers, Jeffrey Sabo, Aniko Worley, Kim C Lupski, James R Boerwinkle, Eric Gibbs, Richard A BMC Genomics Research Article BACKGROUND: Characterizing large genomic variants is essential to expanding the research and clinical applications of genome sequencing. While multiple data types and methods are available to detect these structural variants (SVs), they remain less characterized than smaller variants because of SV diversity, complexity, and size. These challenges are exacerbated by the experimental and computational demands of SV analysis. Here, we characterize the SV content of a personal genome with Parliament, a publicly available consensus SV-calling infrastructure that merges multiple data types and SV detection methods. RESULTS: We demonstrate Parliament’s efficacy via integrated analyses of data from whole-genome array comparative genomic hybridization, short-read next-generation sequencing, long-read (Pacific BioSciences RSII), long-insert (Illumina Nextera), and whole-genome architecture (BioNano Irys) data from the personal genome of a single subject (HS1011). From this genome, Parliament identified 31,007 genomic loci between 100 bp and 1 Mbp that are inconsistent with the hg19 reference assembly. Of these loci, 9,777 are supported as putative SVs by hybrid local assembly, long-read PacBio data, or multi-source heuristics. These SVs span 59 Mbp of the reference genome (1.8%) and include 3,801 events identified only with long-read data. 
The HS1011 data and complete Parliament infrastructure, including a BAM-to-SV workflow, are available on the cloud-based service DNAnexus. CONCLUSIONS: HS1011 SV analysis reveals the limits and advantages of multiple sequencing technologies, specifically the impact of long-read SV discovery. With the full Parliament infrastructure, the HS1011 data constitute a public resource for novel SV discovery, software calibration, and personal genome structural variation analysis. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-015-1479-3) contains supplementary material, which is available to authorized users. BioMed Central 2015-04-11 /pmc/articles/PMC4490614/ /pubmed/25886820 http://dx.doi.org/10.1186/s12864-015-1479-3 Text en © English et al. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Assessing structural variation in a personal genome—towards a human reference diploid genome
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4490614/
https://www.ncbi.nlm.nih.gov/pubmed/25886820
http://dx.doi.org/10.1186/s12864-015-1479-3