Opfi: A Python package for identifying gene clusters in large genomics and metagenomics data sets
Gene clusters are sets of co-localized, often contiguous genes that together perform specific functions, many of which are relevant to biotechnology. There is a need for software tools that can extract candidate gene clusters from vast amounts of available genomic data. Therefore, we developed Opfi:...
Main authors: | Hill, Alexis M.; Rybarski, James R.; Hu, Kuang; Finkelstein, Ilya J.; Wilke, Claus O. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9017871/ https://www.ncbi.nlm.nih.gov/pubmed/35445164 http://dx.doi.org/10.21105/joss.03678 |
_version_ | 1784688874285432832 |
---|---|
author | Hill, Alexis M. Rybarski, James R. Hu, Kuang Finkelstein, Ilya J. Wilke, Claus O. |
author_facet | Hill, Alexis M. Rybarski, James R. Hu, Kuang Finkelstein, Ilya J. Wilke, Claus O. |
author_sort | Hill, Alexis M. |
collection | PubMed |
description | Gene clusters are sets of co-localized, often contiguous genes that together perform specific functions, many of which are relevant to biotechnology. There is a need for software tools that can extract candidate gene clusters from vast amounts of available genomic data. Therefore, we developed Opfi: a modular pipeline for identification of arbitrary gene clusters in assembled genomic or metagenomic sequences. Opfi contains functions for annotation, deduplication, and visualization of putative gene clusters. It uses a customizable rule-based filtering approach to select candidate systems that adhere to user-defined criteria. Opfi is implemented in Python and is available on the Python Package Index and on Bioconda (Grüning et al., 2018). |
format | Online Article Text |
id | pubmed-9017871 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
record_format | MEDLINE/PubMed |
spelling | pubmed-9017871 2022-04-19 Opfi: A Python package for identifying gene clusters in large genomics and metagenomics data sets Hill, Alexis M. Rybarski, James R. Hu, Kuang Finkelstein, Ilya J. Wilke, Claus O. J Open Source Softw Article Gene clusters are sets of co-localized, often contiguous genes that together perform specific functions, many of which are relevant to biotechnology. There is a need for software tools that can extract candidate gene clusters from vast amounts of available genomic data. Therefore, we developed Opfi: a modular pipeline for identification of arbitrary gene clusters in assembled genomic or metagenomic sequences. Opfi contains functions for annotation, deduplication, and visualization of putative gene clusters. It uses a customizable rule-based filtering approach to select candidate systems that adhere to user-defined criteria. Opfi is implemented in Python and is available on the Python Package Index and on Bioconda (Grüning et al., 2018). 2021 2021-10-27 /pmc/articles/PMC9017871/ /pubmed/35445164 http://dx.doi.org/10.21105/joss.03678 Text en https://creativecommons.org/licenses/by/4.0/ License: Authors of papers retain copyright and release the work under a Creative Commons Attribution 4.0 International License (CC BY 4.0, https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Hill, Alexis M. Rybarski, James R. Hu, Kuang Finkelstein, Ilya J. Wilke, Claus O. Opfi: A Python package for identifying gene clusters in large genomics and metagenomics data sets |
title | Opfi: A Python package for identifying gene clusters in large genomics and metagenomics data sets |
title_full | Opfi: A Python package for identifying gene clusters in large genomics and metagenomics data sets |
title_fullStr | Opfi: A Python package for identifying gene clusters in large genomics and metagenomics data sets |
title_full_unstemmed | Opfi: A Python package for identifying gene clusters in large genomics and metagenomics data sets |
title_short | Opfi: A Python package for identifying gene clusters in large genomics and metagenomics data sets |
title_sort | opfi: a python package for identifying gene clusters in large genomics and metagenomics data sets |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9017871/ https://www.ncbi.nlm.nih.gov/pubmed/35445164 http://dx.doi.org/10.21105/joss.03678 |