A systematic comparison of human mitochondrial genome assembly tools
BACKGROUND: Mitochondria are the cell organelles that produce most of the chemical energy required to power the cell's biochemical reactions. Despite being a part of a eukaryotic host cell, the mitochondria contain a separate genome whose origin is linked with the endosymbiosis of a prokaryotic...
Main authors: | Mahar, Nirmal Singh; Satyam, Rohit; Sundar, Durai; Gupta, Ishaan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10498642/ https://www.ncbi.nlm.nih.gov/pubmed/37704952 http://dx.doi.org/10.1186/s12859-023-05445-3 |
_version_ | 1785105567016026112 |
---|---|
author | Mahar, Nirmal Singh Satyam, Rohit Sundar, Durai Gupta, Ishaan |
author_facet | Mahar, Nirmal Singh Satyam, Rohit Sundar, Durai Gupta, Ishaan |
author_sort | Mahar, Nirmal Singh |
collection | PubMed |
description | BACKGROUND: Mitochondria are the cell organelles that produce most of the chemical energy required to power the cell's biochemical reactions. Despite being a part of a eukaryotic host cell, the mitochondria contain a separate genome whose origin is linked with the endosymbiosis of a prokaryotic cell by the host cell and encode independent genomic information throughout their genomes. Mitochondrial genomes accommodate essential genes and are regularly utilized in biotechnology and phylogenetics. Various assemblers capable of generating complete mitochondrial genomes are being continuously developed. These tools often use whole-genome sequencing data as an input containing reads from the mitochondrial genome. Until now, no published work has explored the systematic comparison of all the available tools for assembling human mitochondrial genomes using short-read sequencing data. This evaluation is required to identify the best tool that can be well-optimized for small-scale projects or even national-level research. RESULTS: In this study, we have tested the mitochondrial genome assemblers for both simulated datasets and whole genome sequencing (WGS) datasets of humans. For the highest computational setting of 16 computational threads with the simulated dataset having 1000X read depth, MitoFlex took the least execution time of 69 s, and IOGA took the longest execution time of 1278 s. NOVOPlasty utilized the least computational memory of approximately 0.098 GB for the same setting, whereas IOGA utilized the highest computational memory of 11.858 GB. In the case of WGS datasets for humans, GetOrganelle and MitoFlex performed the best in capturing SNP information with a mean F1-score of 0.919 at the sequencing depth of 10X. MToolBox and NOVOPlasty performed consistently across all sequencing depths with mean F1-scores of 0.897 and 0.890, respectively. CONCLUSIONS: Based on the overall performance metrics and consistency in assembly quality for all sequencing data, MToolBox performed the best. However, NOVOPlasty was the second fastest tool in execution time despite being single-threaded, and it utilized the least computational resources among all the assemblers when tested on simulated datasets. Therefore, NOVOPlasty may be more practical when there is a significant sample size and a lack of computational resources. Besides, as long-read sequencing gains popularity, mitochondrial genome assemblers must be developed to use long-read sequencing data. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-023-05445-3. |
format | Online Article Text |
id | pubmed-10498642 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-104986422023-09-14 A systematic comparison of human mitochondrial genome assembly tools Mahar, Nirmal Singh Satyam, Rohit Sundar, Durai Gupta, Ishaan BMC Bioinformatics Research BACKGROUND: Mitochondria are the cell organelles that produce most of the chemical energy required to power the cell's biochemical reactions. Despite being a part of a eukaryotic host cell, the mitochondria contain a separate genome whose origin is linked with the endosymbiosis of a prokaryotic cell by the host cell and encode independent genomic information throughout their genomes. Mitochondrial genomes accommodate essential genes and are regularly utilized in biotechnology and phylogenetics. Various assemblers capable of generating complete mitochondrial genomes are being continuously developed. These tools often use whole-genome sequencing data as an input containing reads from the mitochondrial genome. Till now, no published work has explored the systematic comparison of all the available tools for assembling human mitochondrial genomes using short-read sequencing data. This evaluation is required to identify the best tool that can be well-optimized for small-scale projects or even national-level research. RESULTS: In this study, we have tested the mitochondrial genome assemblers for both simulated datasets and whole genome sequencing (WGS) datasets of humans. For the highest computational setting of 16 computational threads with the simulated dataset having 1000X read depth, MitoFlex took the least execution time of 69 s, and IOGA took the longest execution time of 1278 s. NOVOPlasty utilized the least computational memory of approximately 0.098 GB for the same setting, whereas IOGA utilized the highest computational memory of 11.858 GB. In the case of WGS datasets for humans, GetOrganelle and MitoFlex performed the best in capturing the SNPs information with a mean F1-score of 0.919 at the sequencing depth of 10X. 
MToolBox and NOVOPlasty performed consistently across all sequencing depths with a mean F1 score of 0.897 and 0.890, respectively. CONCLUSIONS: Based on the overall performance metrics and consistency in assembly quality for all sequencing data, MToolBox performed the best. However, NOVOPlasty was the second fastest tool in execution time despite being single-threaded, and it utilized the least computational resources among all the assemblers when tested on simulated datasets. Therefore, NOVOPlasty may be more practical when there is a significant sample size and a lack of computational resources. Besides, as long-read sequencing gains popularity, mitochondrial genome assemblers must be developed to use long-read sequencing data. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-023-05445-3. BioMed Central 2023-09-13 /pmc/articles/PMC10498642/ /pubmed/37704952 http://dx.doi.org/10.1186/s12859-023-05445-3 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) . 
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/ (https://creativecommons.org/publicdomain/zero/1.0/) ) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Mahar, Nirmal Singh Satyam, Rohit Sundar, Durai Gupta, Ishaan A systematic comparison of human mitochondrial genome assembly tools |
title | A systematic comparison of human mitochondrial genome assembly tools |
title_full | A systematic comparison of human mitochondrial genome assembly tools |
title_fullStr | A systematic comparison of human mitochondrial genome assembly tools |
title_full_unstemmed | A systematic comparison of human mitochondrial genome assembly tools |
title_short | A systematic comparison of human mitochondrial genome assembly tools |
title_sort | systematic comparison of human mitochondrial genome assembly tools |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10498642/ https://www.ncbi.nlm.nih.gov/pubmed/37704952 http://dx.doi.org/10.1186/s12859-023-05445-3 |
work_keys_str_mv | AT maharnirmalsingh asystematiccomparisonofhumanmitochondrialgenomeassemblytools AT satyamrohit asystematiccomparisonofhumanmitochondrialgenomeassemblytools AT sundardurai asystematiccomparisonofhumanmitochondrialgenomeassemblytools AT guptaishaan asystematiccomparisonofhumanmitochondrialgenomeassemblytools AT maharnirmalsingh systematiccomparisonofhumanmitochondrialgenomeassemblytools AT satyamrohit systematiccomparisonofhumanmitochondrialgenomeassemblytools AT sundardurai systematiccomparisonofhumanmitochondrialgenomeassemblytools AT guptaishaan systematiccomparisonofhumanmitochondrialgenomeassemblytools |