Hap10: reconstructing accurate and long polyploid haplotypes using linked reads
BACKGROUND: Haplotype information is essential for many genetic and genomic analyses, including genotype-phenotype associations in humans, animals and plants. Haplotype assembly is a method for reconstructing haplotypes from DNA sequencing reads. With the advent of new sequencing technologies, new algo...
Main authors: | Majidian, Sina; Kahaei, Mohammad Hossein; de Ridder, Dick |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7302376/ https://www.ncbi.nlm.nih.gov/pubmed/32552661 http://dx.doi.org/10.1186/s12859-020-03584-5 |
_version_ | 1783547833193857024 |
---|---|
author | Majidian, Sina; Kahaei, Mohammad Hossein; de Ridder, Dick |
author_sort | Majidian, Sina |
collection | PubMed |
description | BACKGROUND: Haplotype information is essential for many genetic and genomic analyses, including genotype-phenotype associations in humans, animals and plants. Haplotype assembly is a method for reconstructing haplotypes from DNA sequencing reads. With the advent of new sequencing technologies, new algorithms are needed to ensure long and accurate haplotypes. While a few linked-read haplotype assembly algorithms are available for diploid genomes, to the best of our knowledge, no algorithms specifically exploiting linked reads have yet been proposed for polyploids. RESULTS: The first haplotyping algorithm designed for linked reads generated from a polyploid genome is presented, built on a typical short-read haplotyping method, SDhaP. First, the haplotype-relevant information is extracted from the input aligned reads and called variants. Next, reads with the same barcode are combined to produce molecule-specific fragments. These fragments are then clustered into strongly connected components, which serve as input to a haplotype assembly core that estimates accurate and long haplotypes. CONCLUSIONS: Hap10 is a novel algorithm for haplotype assembly of polyploid genomes using linked reads. The performance of the algorithm is evaluated in a number of simulation scenarios and its applicability is demonstrated on a real dataset of sweet potato. |
format | Online Article Text |
id | pubmed-7302376 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-73023762020-06-19 Hap10: reconstructing accurate and long polyploid haplotypes using linked reads Majidian, Sina Kahaei, Mohammad Hossein de Ridder, Dick BMC Bioinformatics Methodology Article BACKGROUND: Haplotype information is essential for many genetic and genomic analyses, including genotype-phenotype associations in human, animals and plants. Haplotype assembly is a method for reconstructing haplotypes from DNA sequencing reads. By the advent of new sequencing technologies, new algorithms are needed to ensure long and accurate haplotypes. While a few linked-read haplotype assembly algorithms are available for diploid genomes, to the best of our knowledge, no algorithms have yet been proposed for polyploids specifically exploiting linked reads. RESULTS: The first haplotyping algorithm designed for linked reads generated from a polyploid genome is presented, built on a typical short-read haplotyping method, SDhaP. Using the input aligned reads and called variants, the haplotype-relevant information is extracted. Next, reads with the same barcodes are combined to produce molecule-specific fragments. Then, these fragments are clustered into strongly connected components which are then used as input of a haplotype assembly core in order to estimate accurate and long haplotypes. CONCLUSIONS: Hap10 is a novel algorithm for haplotype assembly of polyploid genomes using linked reads. The performance of the algorithms is evaluated in a number of simulation scenarios and its applicability is demonstrated on a real dataset of sweet potato. 
BioMed Central 2020-06-18 /pmc/articles/PMC7302376/ /pubmed/32552661 http://dx.doi.org/10.1186/s12859-020-03584-5 Text en © The Author(s) 2020 Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Hap10: reconstructing accurate and long polyploid haplotypes using linked reads |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7302376/ https://www.ncbi.nlm.nih.gov/pubmed/32552661 http://dx.doi.org/10.1186/s12859-020-03584-5 |
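
The description above outlines two preprocessing steps: combining reads that share a barcode into fragments, and grouping fragments into connected components before haplotype assembly. The following is a minimal sketch of that idea only, not the Hap10 implementation. The read representation (a barcode plus a mapping from variant index to allele), the function names, and the use of plain connected components of a shared-variant graph are all assumptions made for illustration; the record does not say how Hap10 separates molecules that share a barcode or how it defines strongly connected components.

```python
from collections import defaultdict


def merge_by_barcode(reads):
    """Combine reads sharing a barcode into one fragment {variant_index: allele}.

    Assumption: one barcode corresponds to one molecule, which is a
    simplification of real linked-read data.
    """
    fragments = defaultdict(dict)
    for barcode, alleles in reads:
        for pos, allele in alleles.items():
            # Keep the first allele seen at a position; a real tool would
            # resolve conflicting observations, e.g. using base qualities.
            fragments[barcode].setdefault(pos, allele)
    return dict(fragments)


def connected_components(fragments):
    """Group fragments that are linked through shared variant positions."""
    parent = {barcode: barcode for barcode in fragments}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # variant position -> first fragment covering it
    for barcode, alleles in fragments.items():
        for pos in alleles:
            if pos in first_seen:
                parent[find(barcode)] = find(first_seen[pos])
            else:
                first_seen[pos] = barcode

    groups = defaultdict(set)
    for barcode in fragments:
        groups[find(barcode)].add(barcode)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical toy input: (barcode, {variant_index: allele}).
reads = [
    ("BX1", {0: 1, 1: 0}),
    ("BX1", {3: 1}),
    ("BX2", {3: 0, 4: 1}),
    ("BX3", {7: 1, 8: 1}),
]
fragments = merge_by_barcode(reads)
components = connected_components(fragments)
print(components)
```

In this toy input, the two BX1 reads merge into a single fragment covering variants 0, 1 and 3, which links it to BX2 through variant 3, while BX3 forms its own component. Each component could then be passed independently to an assembly core.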