
A systematic comparison of human mitochondrial genome assembly tools

BACKGROUND: Mitochondria are the cell organelles that produce most of the chemical energy required to power the cell's biochemical reactions. Despite being a part of a eukaryotic host cell, the mitochondria contain a separate genome whose origin is linked with the endosymbiosis of a prokaryotic...


Bibliographic Details
Main authors: Mahar, Nirmal Singh; Satyam, Rohit; Sundar, Durai; Gupta, Ishaan
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10498642/
https://www.ncbi.nlm.nih.gov/pubmed/37704952
http://dx.doi.org/10.1186/s12859-023-05445-3
_version_ 1785105567016026112
author Mahar, Nirmal Singh
Satyam, Rohit
Sundar, Durai
Gupta, Ishaan
author_facet Mahar, Nirmal Singh
Satyam, Rohit
Sundar, Durai
Gupta, Ishaan
author_sort Mahar, Nirmal Singh
collection PubMed
description BACKGROUND: Mitochondria are the cell organelles that produce most of the chemical energy required to power the cell's biochemical reactions. Despite being a part of a eukaryotic host cell, the mitochondria contain a separate genome whose origin is linked with the endosymbiosis of a prokaryotic cell by the host cell and encode independent genomic information throughout their genomes. Mitochondrial genomes accommodate essential genes and are regularly utilized in biotechnology and phylogenetics. Various assemblers capable of generating complete mitochondrial genomes are being continuously developed. These tools often use whole-genome sequencing data as an input containing reads from the mitochondrial genome. Till now, no published work has explored the systematic comparison of all the available tools for assembling human mitochondrial genomes using short-read sequencing data. This evaluation is required to identify the best tool that can be well-optimized for small-scale projects or even national-level research. RESULTS: In this study, we have tested the mitochondrial genome assemblers for both simulated datasets and whole genome sequencing (WGS) datasets of humans. For the highest computational setting of 16 computational threads with the simulated dataset having 1000X read depth, MitoFlex took the least execution time of 69 s, and IOGA took the longest execution time of 1278 s. NOVOPlasty utilized the least computational memory of approximately 0.098 GB for the same setting, whereas IOGA utilized the highest computational memory of 11.858 GB. In the case of WGS datasets for humans, GetOrganelle and MitoFlex performed the best in capturing the SNPs information with a mean F1-score of 0.919 at the sequencing depth of 10X. MToolBox and NOVOPlasty performed consistently across all sequencing depths with a mean F1 score of 0.897 and 0.890, respectively. 
CONCLUSIONS: Based on the overall performance metrics and consistency in assembly quality for all sequencing data, MToolBox performed the best. However, NOVOPlasty was the second fastest tool in execution time despite being single-threaded, and it utilized the least computational resources among all the assemblers when tested on simulated datasets. Therefore, NOVOPlasty may be more practical when there is a significant sample size and a lack of computational resources. Besides, as long-read sequencing gains popularity, mitochondrial genome assemblers must be developed to use long-read sequencing data. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-023-05445-3.
format Online
Article
Text
id pubmed-10498642
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-10498642 2023-09-14 A systematic comparison of human mitochondrial genome assembly tools Mahar, Nirmal Singh; Satyam, Rohit; Sundar, Durai; Gupta, Ishaan BMC Bioinformatics Research BioMed Central 2023-09-13 /pmc/articles/PMC10498642/ /pubmed/37704952 http://dx.doi.org/10.1186/s12859-023-05445-3 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title A systematic comparison of human mitochondrial genome assembly tools
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10498642/
https://www.ncbi.nlm.nih.gov/pubmed/37704952
http://dx.doi.org/10.1186/s12859-023-05445-3