
MitoHiFi: a python pipeline for mitochondrial genome assembly from PacBio high fidelity reads

BACKGROUND: PacBio high fidelity (HiFi) sequencing reads are both long (15–20 kb) and highly accurate (> Q20). Because of these properties, they have revolutionised genome assembly, leading to more accurate and contiguous genomes. In eukaryotes, the mitochondrial genome is sequenced alongside the...


Bibliographic Details
Main authors: Uliano-Silva, Marcela, Ferreira, João Gabriel R. N., Krasheninnikova, Ksenia, Formenti, Giulio, Abueg, Linelle, Torrance, James, Myers, Eugene W., Durbin, Richard, Blaxter, Mark, McCarthy, Shane A.
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10354987/
https://www.ncbi.nlm.nih.gov/pubmed/37464285
http://dx.doi.org/10.1186/s12859-023-05385-y
author Uliano-Silva, Marcela
Ferreira, João Gabriel R. N.
Krasheninnikova, Ksenia
Formenti, Giulio
Abueg, Linelle
Torrance, James
Myers, Eugene W.
Durbin, Richard
Blaxter, Mark
McCarthy, Shane A.
collection PubMed
description BACKGROUND: PacBio high fidelity (HiFi) sequencing reads are both long (15–20 kb) and highly accurate (> Q20). Because of these properties, they have revolutionised genome assembly, leading to more accurate and contiguous genomes. In eukaryotes, the mitochondrial genome is sequenced alongside the nuclear genome, often at very high coverage. A dedicated tool for mitochondrial genome assembly using HiFi reads is still missing.
RESULTS: MitoHiFi was developed within the Darwin Tree of Life Project to assemble mitochondrial genomes from the HiFi reads generated for target species. The input for MitoHiFi is either the raw reads or the assembled contigs, and the tool outputs a mitochondrial genome sequence FASTA file along with annotation of protein and RNA genes. Variants arising from heteroplasmy are assembled independently, and nuclear insertions of mitochondrial sequences are identified and excluded from organellar genome assembly. MitoHiFi has been used to assemble 374 mitochondrial genomes (368 Metazoa and 6 Fungi species) for the Darwin Tree of Life Project, the Vertebrate Genomes Project and the Aquatic Symbiosis Genome Project. Inspection of 60 mitochondrial genomes assembled with MitoHiFi for species that already have reference sequences in public databases showed the widespread presence of previously unreported repeats.
CONCLUSIONS: MitoHiFi is able to assemble mitochondrial genomes from a wide phylogenetic range of taxa from PacBio HiFi data. MitoHiFi is written in Python and is freely available on GitHub (https://github.com/marcelauliano/MitoHiFi). MitoHiFi is available with its dependencies as a Docker container on GitHub (ghcr.io/marcelauliano/mitohifi:master).
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-023-05385-y.
format Online
Article
Text
id pubmed-10354987
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-10354987 2023-07-20 BMC Bioinformatics Software BioMed Central 2023-07-18 /pmc/articles/PMC10354987/ /pubmed/37464285 http://dx.doi.org/10.1186/s12859-023-05385-y Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title MitoHiFi: a python pipeline for mitochondrial genome assembly from PacBio high fidelity reads
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10354987/
https://www.ncbi.nlm.nih.gov/pubmed/37464285
http://dx.doi.org/10.1186/s12859-023-05385-y