HySA: a Hybrid Structural variant Assembly approach using next-generation and single-molecule sequencing technologies

Achieving complete, accurate, and cost-effective assembly of human genomes is of great importance for realizing the promise of precision medicine. The abundance of repeats and genetic variations in human genomes and the limitations of existing sequencing technologies call for the development of nove...

Bibliographic Details
Main Authors: Fan, Xian; Chaisson, Mark; Nakhleh, Luay; Chen, Ken
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5411774/
https://www.ncbi.nlm.nih.gov/pubmed/28104618
http://dx.doi.org/10.1101/gr.214767.116
author Fan, Xian
Chaisson, Mark
Nakhleh, Luay
Chen, Ken
collection PubMed
description Achieving complete, accurate, and cost-effective assembly of human genomes is of great importance for realizing the promise of precision medicine. The abundance of repeats and genetic variations in human genomes and the limitations of existing sequencing technologies call for the development of novel assembly methods that can leverage the complementary strengths of multiple technologies. We propose a Hybrid Structural variant Assembly (HySA) approach that integrates sequencing reads from next-generation sequencing and single-molecule sequencing technologies to accurately assemble and detect structural variants (SVs) in human genomes. By identifying homologous SV-containing reads from different technologies through a bipartite-graph-based clustering algorithm, our approach turns a whole genome assembly problem into a set of independent SV assembly problems, each of which can be effectively solved to enhance the assembly of structurally altered regions in human genomes. We used data generated from a haploid hydatidiform mole genome (CHM1) and a diploid human genome (NA12878) to test our approach. The result showed that, compared with existing methods, our approach had a low false discovery rate and substantially improved the detection of many types of SVs, particularly novel large insertions, small indels (10–50 bp), and short tandem repeat expansions and contractions. Our work highlights the strengths and limitations of current approaches and provides an effective solution for extending the power of existing sequencing technologies for SV discovery.
format Online
Article
Text
id pubmed-5411774
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-5411774 2017-11-01 Genome Res Method
Cold Spring Harbor Laboratory Press 2017-05 /pmc/articles/PMC5411774/ /pubmed/28104618 http://dx.doi.org/10.1101/gr.214767.116 Text en © 2017 Fan et al.; Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by-nc/4.0/ This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
title HySA: a Hybrid Structural variant Assembly approach using next-generation and single-molecule sequencing technologies
topic Method