
orthofisher: a broadly applicable tool for automated gene identification and retrieval

Identification and retrieval of genes of interest from genomic data are an essential step for many bioinformatic applications. We present orthofisher, a command-line tool for automated identification and retrieval of genes with high sequence similarity to a query profile Hidden Markov Model sequence...

Bibliographic Details
Main authors: Steenwyk, Jacob L; Rokas, Antonis
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8496211/
https://www.ncbi.nlm.nih.gov/pubmed/34544141
http://dx.doi.org/10.1093/g3journal/jkab250
_version_ 1784579706893369344
author Steenwyk, Jacob L
Rokas, Antonis
collection PubMed
description Identification and retrieval of genes of interest from genomic data are an essential step for many bioinformatic applications. We present orthofisher, a command-line tool for automated identification and retrieval of genes with high sequence similarity to a query profile Hidden Markov Model sequence alignment across a set of proteomes. Performance assessment of orthofisher revealed high accuracy and precision during single-copy orthologous gene identification. orthofisher may be useful for assessing gene annotation quality, identifying single-copy orthologous genes for phylogenomic analyses, estimating gene copy number, and other evolutionary analyses that rely on identification and retrieval of homologous genes from genomic data. orthofisher comes complete with comprehensive documentation (https://jlsteenwyk.com/orthofisher/), is freely available under the MIT license, and is available for download from GitHub (https://github.com/JLSteenwyk/orthofisher), PyPi (https://pypi.org/project/orthofisher/), and the Anaconda Cloud (https://anaconda.org/jlsteenwyk/orthofisher).
format Online
Article
Text
id pubmed-8496211
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8496211 2021-10-07 orthofisher: a broadly applicable tool for automated gene identification and retrieval Steenwyk, Jacob L Rokas, Antonis G3 (Bethesda) Software and Data Resources Oxford University Press 2021-07-15 /pmc/articles/PMC8496211/ /pubmed/34544141 http://dx.doi.org/10.1093/g3journal/jkab250 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of Genetics Society of America. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title orthofisher: a broadly applicable tool for automated gene identification and retrieval
topic Software and Data Resources
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8496211/
https://www.ncbi.nlm.nih.gov/pubmed/34544141
http://dx.doi.org/10.1093/g3journal/jkab250