Mabs, a suite of tools for gene-informed genome assembly

Bibliographic Details
Main author: Schelkunov, Mikhail I.
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10548655/
https://www.ncbi.nlm.nih.gov/pubmed/37794322
http://dx.doi.org/10.1186/s12859-023-05499-3
Description
Summary: BACKGROUND: Despite constantly improving genome sequencing methods, error-free eukaryotic genome assembly has not yet been achieved. Among the problems of eukaryotic genome assembly are so-called "haplotypic duplications", which may manifest as alleles being mistakenly assembled as paralogues. Haplotypic duplications are dangerous because they create illusions of gene family expansions and thus may lead scientists to incorrect conclusions about genome evolution and functioning. RESULTS: Here, I present Mabs, a suite of tools that serve as parameter optimizers of the popular genome assemblers Hifiasm and Flye. By optimizing the parameters of Hifiasm and Flye, Mabs tries to create genome assemblies with the genes assembled as accurately as possible. Tests on 6 eukaryotic genomes showed that in 6 out of 6 cases, Mabs created assemblies with more accurately assembled genes than those generated by Hifiasm and Flye when they were run with default parameters. When assemblies of Mabs, Hifiasm and Flye were postprocessed by a popular tool for haplotypic duplication removal, Purge_dups, genes were better assembled by Mabs in 5 out of 6 cases. CONCLUSIONS: Mabs is useful for making high-quality genome assemblies. It is available at https://github.com/shelkmike/Mabs. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-023-05499-3.