
Fosmid-based whole genome haplotyping of a HapMap trio child: evaluation of Single Individual Haplotyping techniques

Bibliographic Details
Main Authors: Duitama, Jorge; McEwen, Gayle K.; Huebsch, Thomas; Palczewski, Stefanie; Schulz, Sabrina; Verstrepen, Kevin; Suk, Eun-Kyung; Hoehe, Margret R.
Format: Online Article Text
Language: English
Published: Oxford University Press 2012
Subjects: Genomics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3299995/
https://www.ncbi.nlm.nih.gov/pubmed/22102577
http://dx.doi.org/10.1093/nar/gkr1042
author Duitama, Jorge
McEwen, Gayle K.
Huebsch, Thomas
Palczewski, Stefanie
Schulz, Sabrina
Verstrepen, Kevin
Suk, Eun-Kyung
Hoehe, Margret R.
collection PubMed
description Determining the underlying haplotypes of individual human genomes is an essential, but currently difficult, step toward a complete understanding of genome function. Fosmid pool-based next-generation sequencing allows genome-wide generation of 40-kb haploid DNA segments, which can be phased into contiguous molecular haplotypes computationally by Single Individual Haplotyping (SIH). Many SIH algorithms have been proposed, but the accuracy of such methods has been difficult to assess due to the lack of real benchmark data. To address this problem, we generated whole genome fosmid sequence data from a HapMap trio child, NA12878, for which reliable haplotypes have already been produced. We assembled haplotypes using eight algorithms for SIH and carried out direct comparisons of their accuracy, completeness and efficiency. Our comparisons indicate that fosmid-based haplotyping can deliver highly accurate results even at low coverage and that our SIH algorithm, ReFHap, is able to efficiently produce high-quality haplotypes. We expanded the haplotypes for NA12878 by combining the current haplotypes with our fosmid-based haplotypes, producing near-to-complete new gold-standard haplotypes containing almost 98% of heterozygous SNPs. This improvement includes notable fractions of disease-related and GWA SNPs. Integrated with other molecular biological data sets, this phase information will advance the emerging field of diploid genomics.
format Online
Article
Text
id pubmed-3299995
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3299995 2012-03-13 Nucleic Acids Res Genomics Oxford University Press 2012-03 2011-11-17 /pmc/articles/PMC3299995/ /pubmed/22102577 http://dx.doi.org/10.1093/nar/gkr1042 Text en
© The Author(s) 2011. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Fosmid-based whole genome haplotyping of a HapMap trio child: evaluation of Single Individual Haplotyping techniques
topic Genomics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3299995/
https://www.ncbi.nlm.nih.gov/pubmed/22102577
http://dx.doi.org/10.1093/nar/gkr1042