MitoHiFi: a python pipeline for mitochondrial genome assembly from PacBio high fidelity reads
BACKGROUND: PacBio high fidelity (HiFi) sequencing reads are both long (15–20 kb) and highly accurate (> Q20). Because of these properties, they have revolutionised genome assembly leading to more accurate and contiguous genomes. In eukaryotes the mitochondrial genome is sequenced alongside the...
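Main authors: | Uliano-Silva, Marcela; Ferreira, João Gabriel R. N.; Krasheninnikova, Ksenia; Formenti, Giulio; Abueg, Linelle; Torrance, James; Myers, Eugene W.; Durbin, Richard; Blaxter, Mark; McCarthy, Shane A. |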
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2023 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10354987/ https://www.ncbi.nlm.nih.gov/pubmed/37464285 http://dx.doi.org/10.1186/s12859-023-05385-y |
_version_ | 1785075042752659456 |
---|---|
author | Uliano-Silva, Marcela; Ferreira, João Gabriel R. N.; Krasheninnikova, Ksenia; Formenti, Giulio; Abueg, Linelle; Torrance, James; Myers, Eugene W.; Durbin, Richard; Blaxter, Mark; McCarthy, Shane A. |
author_facet | Uliano-Silva, Marcela; Ferreira, João Gabriel R. N.; Krasheninnikova, Ksenia; Formenti, Giulio; Abueg, Linelle; Torrance, James; Myers, Eugene W.; Durbin, Richard; Blaxter, Mark; McCarthy, Shane A. |
author_sort | Uliano-Silva, Marcela |
collection | PubMed |
description | BACKGROUND: PacBio high fidelity (HiFi) sequencing reads are both long (15–20 kb) and highly accurate (>Q20). Because of these properties, they have revolutionised genome assembly, leading to more accurate and contiguous genomes. In eukaryotes, the mitochondrial genome is sequenced alongside the nuclear genome, often at very high coverage. A dedicated tool for mitochondrial genome assembly using HiFi reads is still missing. RESULTS: MitoHiFi was developed within the Darwin Tree of Life Project to assemble mitochondrial genomes from the HiFi reads generated for target species. The input for MitoHiFi is either the raw reads or the assembled contigs, and the tool outputs a mitochondrial genome sequence FASTA file along with annotation of protein and RNA genes. Variants arising from heteroplasmy are assembled independently, and nuclear insertions of mitochondrial sequences are identified and not used in organellar genome assembly. MitoHiFi has been used to assemble 374 mitochondrial genomes (368 Metazoa and 6 Fungi species) for the Darwin Tree of Life Project, the Vertebrate Genomes Project and the Aquatic Symbiosis Genome Project. Inspection of 60 mitochondrial genomes assembled with MitoHiFi for species that already have reference sequences in public databases showed the widespread presence of previously unreported repeats. CONCLUSIONS: MitoHiFi is able to assemble mitochondrial genomes from a wide phylogenetic range of taxa from PacBio HiFi data. MitoHiFi is written in Python and is freely available on GitHub (https://github.com/marcelauliano/MitoHiFi). MitoHiFi is available with its dependencies as a Docker container on GitHub (ghcr.io/marcelauliano/mitohifi:master). SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-023-05385-y. |
format | Online Article Text |
id | pubmed-10354987 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-10354987 2023-07-20 MitoHiFi: a python pipeline for mitochondrial genome assembly from PacBio high fidelity reads Uliano-Silva, Marcela; Ferreira, João Gabriel R. N.; Krasheninnikova, Ksenia; Formenti, Giulio; Abueg, Linelle; Torrance, James; Myers, Eugene W.; Durbin, Richard; Blaxter, Mark; McCarthy, Shane A. BMC Bioinformatics Software BACKGROUND: PacBio high fidelity (HiFi) sequencing reads are both long (15–20 kb) and highly accurate (>Q20). Because of these properties, they have revolutionised genome assembly, leading to more accurate and contiguous genomes. In eukaryotes, the mitochondrial genome is sequenced alongside the nuclear genome, often at very high coverage. A dedicated tool for mitochondrial genome assembly using HiFi reads is still missing. RESULTS: MitoHiFi was developed within the Darwin Tree of Life Project to assemble mitochondrial genomes from the HiFi reads generated for target species. The input for MitoHiFi is either the raw reads or the assembled contigs, and the tool outputs a mitochondrial genome sequence FASTA file along with annotation of protein and RNA genes. Variants arising from heteroplasmy are assembled independently, and nuclear insertions of mitochondrial sequences are identified and not used in organellar genome assembly. MitoHiFi has been used to assemble 374 mitochondrial genomes (368 Metazoa and 6 Fungi species) for the Darwin Tree of Life Project, the Vertebrate Genomes Project and the Aquatic Symbiosis Genome Project. Inspection of 60 mitochondrial genomes assembled with MitoHiFi for species that already have reference sequences in public databases showed the widespread presence of previously unreported repeats. CONCLUSIONS: MitoHiFi is able to assemble mitochondrial genomes from a wide phylogenetic range of taxa from PacBio HiFi data. MitoHiFi is written in Python and is freely available on GitHub (https://github.com/marcelauliano/MitoHiFi). MitoHiFi is available with its dependencies as a Docker container on GitHub (ghcr.io/marcelauliano/mitohifi:master). SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-023-05385-y. BioMed Central 2023-07-18 /pmc/articles/PMC10354987/ /pubmed/37464285 http://dx.doi.org/10.1186/s12859-023-05385-y Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Software; Uliano-Silva, Marcela; Ferreira, João Gabriel R. N.; Krasheninnikova, Ksenia; Formenti, Giulio; Abueg, Linelle; Torrance, James; Myers, Eugene W.; Durbin, Richard; Blaxter, Mark; McCarthy, Shane A.; MitoHiFi: a python pipeline for mitochondrial genome assembly from PacBio high fidelity reads |
title | MitoHiFi: a python pipeline for mitochondrial genome assembly from PacBio high fidelity reads |
title_sort | mitohifi: a python pipeline for mitochondrial genome assembly from pacbio high fidelity reads |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10354987/ https://www.ncbi.nlm.nih.gov/pubmed/37464285 http://dx.doi.org/10.1186/s12859-023-05385-y |