orthofisher: a broadly applicable tool for automated gene identification and retrieval
Identification and retrieval of genes of interest from genomic data are an essential step for many bioinformatic applications. We present orthofisher, a command-line tool for automated identification and retrieval of genes with high sequence similarity to a query profile Hidden Markov Model sequence...
Main Authors: | Steenwyk, Jacob L; Rokas, Antonis |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2021 |
Subjects: | Software and Data Resources |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8496211/ https://www.ncbi.nlm.nih.gov/pubmed/34544141 http://dx.doi.org/10.1093/g3journal/jkab250 |
_version_ | 1784579706893369344 |
---|---|
author | Steenwyk, Jacob L; Rokas, Antonis |
author_facet | Steenwyk, Jacob L; Rokas, Antonis |
author_sort | Steenwyk, Jacob L |
collection | PubMed |
description | Identification and retrieval of genes of interest from genomic data are an essential step for many bioinformatic applications. We present orthofisher, a command-line tool for automated identification and retrieval of genes with high sequence similarity to a query profile Hidden Markov Model sequence alignment across a set of proteomes. Performance assessment of orthofisher revealed high accuracy and precision during single-copy orthologous gene identification. orthofisher may be useful for assessing gene annotation quality, identifying single-copy orthologous genes for phylogenomic analyses, estimating gene copy number, and other evolutionary analyses that rely on identification and retrieval of homologous genes from genomic data. orthofisher comes complete with comprehensive documentation (https://jlsteenwyk.com/orthofisher/), is freely available under the MIT license, and is available for download from GitHub (https://github.com/JLSteenwyk/orthofisher), PyPI (https://pypi.org/project/orthofisher/), and the Anaconda Cloud (https://anaconda.org/jlsteenwyk/orthofisher). |
format | Online Article Text |
id | pubmed-8496211 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-8496211 2021-10-07 orthofisher: a broadly applicable tool for automated gene identification and retrieval Steenwyk, Jacob L; Rokas, Antonis G3 (Bethesda) Software and Data Resources Identification and retrieval of genes of interest from genomic data are an essential step for many bioinformatic applications. We present orthofisher, a command-line tool for automated identification and retrieval of genes with high sequence similarity to a query profile Hidden Markov Model sequence alignment across a set of proteomes. Performance assessment of orthofisher revealed high accuracy and precision during single-copy orthologous gene identification. orthofisher may be useful for assessing gene annotation quality, identifying single-copy orthologous genes for phylogenomic analyses, estimating gene copy number, and other evolutionary analyses that rely on identification and retrieval of homologous genes from genomic data. orthofisher comes complete with comprehensive documentation (https://jlsteenwyk.com/orthofisher/), is freely available under the MIT license, and is available for download from GitHub (https://github.com/JLSteenwyk/orthofisher), PyPI (https://pypi.org/project/orthofisher/), and the Anaconda Cloud (https://anaconda.org/jlsteenwyk/orthofisher). Oxford University Press 2021-07-15 /pmc/articles/PMC8496211/ /pubmed/34544141 http://dx.doi.org/10.1093/g3journal/jkab250 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of Genetics Society of America. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Software and Data Resources Steenwyk, Jacob L; Rokas, Antonis orthofisher: a broadly applicable tool for automated gene identification and retrieval |
title | orthofisher: a broadly applicable tool for automated gene identification and retrieval |
title_full | orthofisher: a broadly applicable tool for automated gene identification and retrieval |
title_fullStr | orthofisher: a broadly applicable tool for automated gene identification and retrieval |
title_full_unstemmed | orthofisher: a broadly applicable tool for automated gene identification and retrieval |
title_short | orthofisher: a broadly applicable tool for automated gene identification and retrieval |
title_sort | orthofisher: a broadly applicable tool for automated gene identification and retrieval |
topic | Software and Data Resources |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8496211/ https://www.ncbi.nlm.nih.gov/pubmed/34544141 http://dx.doi.org/10.1093/g3journal/jkab250 |
work_keys_str_mv | AT steenwykjacobl orthofisherabroadlyapplicabletoolforautomatedgeneidentificationandretrieval AT rokasantonis orthofisherabroadlyapplicabletoolforautomatedgeneidentificationandretrieval |