HAT: haplotype assembly tool using short and error-prone long reads
MOTIVATION: Haplotypes are the set of alleles co-occurring on a single chromosome and inherited together to the next generation. Because a monoploid reference genome loses this co-occurrence information, it has limited use in associating phenotypes with allelic combinations of genotypes. Therefore,...
Main authors: | Shirali Hossein Zade, Ramin; Urhan, Aysun; Assis de Souza, Alvaro; Singh, Akash; Abeel, Thomas |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9750119/ https://www.ncbi.nlm.nih.gov/pubmed/36308461 http://dx.doi.org/10.1093/bioinformatics/btac702 |
_version_ | 1784850183209615360 |
---|---|
author | Shirali Hossein Zade, Ramin; Urhan, Aysun; Assis de Souza, Alvaro; Singh, Akash; Abeel, Thomas |
author_facet | Shirali Hossein Zade, Ramin; Urhan, Aysun; Assis de Souza, Alvaro; Singh, Akash; Abeel, Thomas |
author_sort | Shirali Hossein Zade, Ramin |
collection | PubMed |
description | MOTIVATION: Haplotypes are the set of alleles co-occurring on a single chromosome and inherited together to the next generation. Because a monoploid reference genome loses this co-occurrence information, it has limited use in associating phenotypes with allelic combinations of genotypes. Therefore, methods to reconstruct the complete haplotypes from DNA sequencing data are crucial. Recently, several attempts have been made at haplotype reconstructions, but significant limitations remain. High-quality continuous haplotypes cannot be created reliably, particularly when there are few differences between the homologous chromosomes. RESULTS: Here, we introduce HAT, a haplotype assembly tool that exploits short and long reads along with a reference genome to reconstruct haplotypes. HAT tries to take advantage of the accuracy of short reads and the length of the long reads to reconstruct haplotypes. We tested HAT on the aneuploid yeast strain Saccharomyces pastorianus CBS1483 and multiple simulated polyploid datasets of the same strain, showing that it outperforms existing tools. AVAILABILITY AND IMPLEMENTATION: https://github.com/AbeelLab/hat/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-9750119 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9750119 2022-12-15 HAT: haplotype assembly tool using short and error-prone long reads Shirali Hossein Zade, Ramin; Urhan, Aysun; Assis de Souza, Alvaro; Singh, Akash; Abeel, Thomas Bioinformatics Original Paper MOTIVATION: Haplotypes are the set of alleles co-occurring on a single chromosome and inherited together to the next generation. Because a monoploid reference genome loses this co-occurrence information, it has limited use in associating phenotypes with allelic combinations of genotypes. Therefore, methods to reconstruct the complete haplotypes from DNA sequencing data are crucial. Recently, several attempts have been made at haplotype reconstructions, but significant limitations remain. High-quality continuous haplotypes cannot be created reliably, particularly when there are few differences between the homologous chromosomes. RESULTS: Here, we introduce HAT, a haplotype assembly tool that exploits short and long reads along with a reference genome to reconstruct haplotypes. HAT tries to take advantage of the accuracy of short reads and the length of the long reads to reconstruct haplotypes. We tested HAT on the aneuploid yeast strain Saccharomyces pastorianus CBS1483 and multiple simulated polyploid datasets of the same strain, showing that it outperforms existing tools. AVAILABILITY AND IMPLEMENTATION: https://github.com/AbeelLab/hat/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2022-10-29 /pmc/articles/PMC9750119/ /pubmed/36308461 http://dx.doi.org/10.1093/bioinformatics/btac702 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Paper Shirali Hossein Zade, Ramin; Urhan, Aysun; Assis de Souza, Alvaro; Singh, Akash; Abeel, Thomas HAT: haplotype assembly tool using short and error-prone long reads |
title | HAT: haplotype assembly tool using short and error-prone long reads |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9750119/ https://www.ncbi.nlm.nih.gov/pubmed/36308461 http://dx.doi.org/10.1093/bioinformatics/btac702 |