HySA: a Hybrid Structural variant Assembly approach using next-generation and single-molecule sequencing technologies
Achieving complete, accurate, and cost-effective assembly of human genomes is of great importance for realizing the promise of precision medicine. The abundance of repeats and genetic variations in human genomes and the limitations of existing sequencing technologies call for the development of nove...
Main authors: | Fan, Xian; Chaisson, Mark; Nakhleh, Luay; Chen, Ken |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory Press, 2017 |
Subjects: | Method |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5411774/ https://www.ncbi.nlm.nih.gov/pubmed/28104618 http://dx.doi.org/10.1101/gr.214767.116 |
_version_ | 1783232863529861120 |
---|---|
author | Fan, Xian Chaisson, Mark Nakhleh, Luay Chen, Ken |
author_sort | Fan, Xian |
collection | PubMed |
description | Achieving complete, accurate, and cost-effective assembly of human genomes is of great importance for realizing the promise of precision medicine. The abundance of repeats and genetic variations in human genomes and the limitations of existing sequencing technologies call for the development of novel assembly methods that can leverage the complementary strengths of multiple technologies. We propose a Hybrid Structural variant Assembly (HySA) approach that integrates sequencing reads from next-generation sequencing and single-molecule sequencing technologies to accurately assemble and detect structural variants (SVs) in human genomes. By identifying homologous SV-containing reads from different technologies through a bipartite-graph-based clustering algorithm, our approach turns a whole genome assembly problem into a set of independent SV assembly problems, each of which can be effectively solved to enhance the assembly of structurally altered regions in human genomes. We used data generated from a haploid hydatidiform mole genome (CHM1) and a diploid human genome (NA12878) to test our approach. The result showed that, compared with existing methods, our approach had a low false discovery rate and substantially improved the detection of many types of SVs, particularly novel large insertions, small indels (10–50 bp), and short tandem repeat expansions and contractions. Our work highlights the strengths and limitations of current approaches and provides an effective solution for extending the power of existing sequencing technologies for SV discovery. |
format | Online Article Text |
id | pubmed-5411774 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Cold Spring Harbor Laboratory Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-5411774 2017-11-01 Genome Res Method Cold Spring Harbor Laboratory Press 2017-05 /pmc/articles/PMC5411774/ /pubmed/28104618 http://dx.doi.org/10.1101/gr.214767.116 Text en © 2017 Fan et al.; Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by-nc/4.0/ This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/. |
title | HySA: a Hybrid Structural variant Assembly approach using next-generation and single-molecule sequencing technologies |
topic | Method |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5411774/ https://www.ncbi.nlm.nih.gov/pubmed/28104618 http://dx.doi.org/10.1101/gr.214767.116 |