
Protocol for using NoBadWordsCombiner to merge and minimize “bad words” from BLAST hits against multiple eukaryotic gene annotation databases

Annotating protein-coding genes can be challenging, especially when searching for the best hits against multiple functional databases. This is partly because of "bad words" appearing as top hits, such as hypothetical or uncharacterized proteins. To help alleviate some of these issues, we d...


Bibliographic Details
Main authors: Zhang, Xi; Hu, Yining; Smith, David Roy
Format: Online Article Text
Language: English
Published: Elsevier 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8521201/
https://www.ncbi.nlm.nih.gov/pubmed/34704076
http://dx.doi.org/10.1016/j.xpro.2021.100888
author Zhang, Xi
Hu, Yining
Smith, David Roy
author_facet Zhang, Xi
Hu, Yining
Smith, David Roy
author_sort Zhang, Xi
collection PubMed
description Annotating protein-coding genes can be challenging, especially when searching for the best hits against multiple functional databases. This is partly because of "bad words" appearing as top hits, such as hypothetical or uncharacterized proteins. To help alleviate some of these issues, we designed a bioinformatics tool called NoBadWordsCombiner, which efficiently merges the hits from various databases, strengthening gene definitions by minimizing functional descriptions containing "bad words." Unlike other available tools, NoBadWordsCombiner is user friendly, but it does require users to have some general bioinformatics skills, including a basic understanding of the BLAST package and dash shell in Linux/Unix environments. For complete details on the use and execution of this protocol, please refer to Zhang et al. (2021a).
format Online
Article
Text
id pubmed-8521201
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-8521201 2021-10-25 Protocol for using NoBadWordsCombiner to merge and minimize “bad words” from BLAST hits against multiple eukaryotic gene annotation databases Zhang, Xi; Hu, Yining; Smith, David Roy STAR Protoc Protocol Annotating protein-coding genes can be challenging, especially when searching for the best hits against multiple functional databases. This is partly because of "bad words" appearing as top hits, such as hypothetical or uncharacterized proteins. To help alleviate some of these issues, we designed a bioinformatics tool called NoBadWordsCombiner, which efficiently merges the hits from various databases, strengthening gene definitions by minimizing functional descriptions containing "bad words." Unlike other available tools, NoBadWordsCombiner is user friendly, but it does require users to have some general bioinformatics skills, including a basic understanding of the BLAST package and dash shell in Linux/Unix environments. For complete details on the use and execution of this protocol, please refer to Zhang et al. (2021a). Elsevier 2021-10-16 /pmc/articles/PMC8521201/ /pubmed/34704076 http://dx.doi.org/10.1016/j.xpro.2021.100888 Text en © 2021 The Author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
spellingShingle Protocol
Zhang, Xi
Hu, Yining
Smith, David Roy
Protocol for using NoBadWordsCombiner to merge and minimize “bad words” from BLAST hits against multiple eukaryotic gene annotation databases
title Protocol for using NoBadWordsCombiner to merge and minimize “bad words” from BLAST hits against multiple eukaryotic gene annotation databases
topic Protocol
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8521201/
https://www.ncbi.nlm.nih.gov/pubmed/34704076
http://dx.doi.org/10.1016/j.xpro.2021.100888