HAT: haplotype assembly tool using short and error-prone long reads

Bibliographic Details
Main authors: Shirali Hossein Zade, Ramin, Urhan, Aysun, Assis de Souza, Alvaro, Singh, Akash, Abeel, Thomas
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9750119/
https://www.ncbi.nlm.nih.gov/pubmed/36308461
http://dx.doi.org/10.1093/bioinformatics/btac702
author Shirali Hossein Zade, Ramin
Urhan, Aysun
Assis de Souza, Alvaro
Singh, Akash
Abeel, Thomas
collection PubMed
description MOTIVATION: Haplotypes are the set of alleles co-occurring on a single chromosome and inherited together to the next generation. Because a monoploid reference genome loses this co-occurrence information, it has limited use in associating phenotypes with allelic combinations of genotypes. Therefore, methods to reconstruct the complete haplotypes from DNA sequencing data are crucial. Recently, several attempts have been made at haplotype reconstructions, but significant limitations remain. High-quality continuous haplotypes cannot be created reliably, particularly when there are few differences between the homologous chromosomes. RESULTS: Here, we introduce HAT, a haplotype assembly tool that exploits short and long reads along with a reference genome to reconstruct haplotypes. HAT tries to take advantage of the accuracy of short reads and the length of the long reads to reconstruct haplotypes. We tested HAT on the aneuploid yeast strain Saccharomyces pastorianus CBS1483 and multiple simulated polyploid datasets of the same strain, showing that it outperforms existing tools. AVAILABILITY AND IMPLEMENTATION: https://github.com/AbeelLab/hat/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-9750119
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9750119 2022-12-15 Bioinformatics, Original Paper. Oxford University Press 2022-10-29. /pmc/articles/PMC9750119/ /pubmed/36308461 http://dx.doi.org/10.1093/bioinformatics/btac702 Text en © The Author(s) 2022. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title HAT: haplotype assembly tool using short and error-prone long reads
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9750119/
https://www.ncbi.nlm.nih.gov/pubmed/36308461
http://dx.doi.org/10.1093/bioinformatics/btac702