Hap10: reconstructing accurate and long polyploid haplotypes using linked reads

BACKGROUND: Haplotype information is essential for many genetic and genomic analyses, including genotype-phenotype associations in humans, animals and plants. Haplotype assembly is a method for reconstructing haplotypes from DNA sequencing reads. With the advent of new sequencing technologies, new algo...

Bibliographic Details
Main Authors: Majidian, Sina, Kahaei, Mohammad Hossein, de Ridder, Dick
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7302376/
https://www.ncbi.nlm.nih.gov/pubmed/32552661
http://dx.doi.org/10.1186/s12859-020-03584-5
_version_ 1783547833193857024
author Majidian, Sina
Kahaei, Mohammad Hossein
de Ridder, Dick
author_facet Majidian, Sina
Kahaei, Mohammad Hossein
de Ridder, Dick
author_sort Majidian, Sina
collection PubMed
description BACKGROUND: Haplotype information is essential for many genetic and genomic analyses, including genotype-phenotype associations in humans, animals and plants. Haplotype assembly is a method for reconstructing haplotypes from DNA sequencing reads. With the advent of new sequencing technologies, new algorithms are needed to ensure long and accurate haplotypes. While a few linked-read haplotype assembly algorithms are available for diploid genomes, to the best of our knowledge, no algorithms have yet been proposed for polyploids that specifically exploit linked reads. RESULTS: The first haplotyping algorithm designed for linked reads generated from a polyploid genome is presented, built on a typical short-read haplotyping method, SDhaP. Using the input aligned reads and called variants, the haplotype-relevant information is extracted. Next, reads with the same barcodes are combined to produce molecule-specific fragments. Then, these fragments are clustered into strongly connected components, which are then used as input to a haplotype assembly core in order to estimate accurate and long haplotypes. CONCLUSIONS: Hap10 is a novel algorithm for haplotype assembly of polyploid genomes using linked reads. The performance of the algorithm is evaluated in a number of simulation scenarios and its applicability is demonstrated on a real dataset of sweet potato.
format Online
Article
Text
id pubmed-7302376
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7302376 2020-06-19 Hap10: reconstructing accurate and long polyploid haplotypes using linked reads Majidian, Sina Kahaei, Mohammad Hossein de Ridder, Dick BMC Bioinformatics Methodology Article BACKGROUND: Haplotype information is essential for many genetic and genomic analyses, including genotype-phenotype associations in humans, animals and plants. Haplotype assembly is a method for reconstructing haplotypes from DNA sequencing reads. With the advent of new sequencing technologies, new algorithms are needed to ensure long and accurate haplotypes. While a few linked-read haplotype assembly algorithms are available for diploid genomes, to the best of our knowledge, no algorithms have yet been proposed for polyploids that specifically exploit linked reads. RESULTS: The first haplotyping algorithm designed for linked reads generated from a polyploid genome is presented, built on a typical short-read haplotyping method, SDhaP. Using the input aligned reads and called variants, the haplotype-relevant information is extracted. Next, reads with the same barcodes are combined to produce molecule-specific fragments. Then, these fragments are clustered into strongly connected components, which are then used as input to a haplotype assembly core in order to estimate accurate and long haplotypes. CONCLUSIONS: Hap10 is a novel algorithm for haplotype assembly of polyploid genomes using linked reads. The performance of the algorithm is evaluated in a number of simulation scenarios and its applicability is demonstrated on a real dataset of sweet potato.
BioMed Central 2020-06-18 /pmc/articles/PMC7302376/ /pubmed/32552661 http://dx.doi.org/10.1186/s12859-020-03584-5 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Methodology Article
Majidian, Sina
Kahaei, Mohammad Hossein
de Ridder, Dick
Hap10: reconstructing accurate and long polyploid haplotypes using linked reads
title Hap10: reconstructing accurate and long polyploid haplotypes using linked reads
title_full Hap10: reconstructing accurate and long polyploid haplotypes using linked reads
title_fullStr Hap10: reconstructing accurate and long polyploid haplotypes using linked reads
title_full_unstemmed Hap10: reconstructing accurate and long polyploid haplotypes using linked reads
title_short Hap10: reconstructing accurate and long polyploid haplotypes using linked reads
title_sort hap10: reconstructing accurate and long polyploid haplotypes using linked reads
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7302376/
https://www.ncbi.nlm.nih.gov/pubmed/32552661
http://dx.doi.org/10.1186/s12859-020-03584-5
work_keys_str_mv AT majidiansina hap10reconstructingaccurateandlongpolyploidhaplotypesusinglinkedreads
AT kahaeimohammadhossein hap10reconstructingaccurateandlongpolyploidhaplotypesusinglinkedreads
AT deridderdick hap10reconstructingaccurateandlongpolyploidhaplotypesusinglinkedreads
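
Illustrative sketch: the RESULTS part of the abstract describes combining reads that share a barcode into fragments and then grouping those fragments before haplotype assembly. The Python below is only a minimal illustration of that idea, not the Hap10 implementation. The function names, the read representation (a barcode plus a mapping from variant index to allele), and the rule of dropping conflicting alleles are all assumptions made for this example. In particular, the grouping step uses plain connected components over shared variants as a simplified stand-in for the "strongly connected components" clustering named in the abstract, whose exact criterion is not given in this record.

```python
from collections import defaultdict


def merge_by_barcode(reads):
    """Combine reads that share a barcode into one fragment.

    Each read is (barcode, {variant_index: allele}). If two reads with the
    same barcode disagree at a variant, that variant is dropped from the
    fragment (an assumption of this sketch, not a rule from the paper).
    """
    merged = defaultdict(dict)
    conflicts = defaultdict(set)
    for barcode, alleles in reads:
        fragment = merged[barcode]
        for variant, allele in alleles.items():
            if variant in fragment and fragment[variant] != allele:
                conflicts[barcode].add(variant)
            else:
                fragment[variant] = allele
    for barcode, variants in conflicts.items():
        for variant in variants:
            merged[barcode].pop(variant, None)
    return dict(merged)


def connected_components(fragments):
    """Group fragment names that are linked through shared variants.

    Simplified stand-in: two fragments land in the same group if a chain
    of shared variant positions connects them (union-find).
    """
    parent = {name: name for name in fragments}

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # variant index -> first fragment covering it
    for name, alleles in fragments.items():
        for variant in alleles:
            if variant in first_seen:
                parent[find(name)] = find(first_seen[variant])
            else:
                first_seen[variant] = name

    groups = defaultdict(list)
    for name in fragments:
        groups[find(name)].append(name)
    return sorted(sorted(group) for group in groups.values())


# Toy input with hypothetical barcodes and variant indices.
reads = [
    ("BX1", {0: 0, 1: 1}),
    ("BX1", {2: 1}),
    ("BX2", {2: 0, 3: 1}),
    ("BX3", {7: 1, 8: 0}),
]
fragments = merge_by_barcode(reads)
components = connected_components(fragments)
print(fragments)
print(components)
```

In this toy input, the fragments for BX1 and BX2 both cover variant 2 and so fall into one group, while BX3 covers no variant in common with the others and forms its own group. Each group would then be handed separately to whatever assembly core is used.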