Beyond Blast: Enabling Microbiologists to Better Extract Literature, Taxonomic Distributions and Gene Neighborhood Information for Protein Families

Capturing the published corpus of information on all members of a given protein family should be an essential step in any study focusing on a specific member of that family. This step is often performed only superficially or partially by experimentalists as the most common approaches and tool...

Bibliographic Details
Main authors: Reed, Colbie; Denise, Rémi; Hourihan, Jacob; Babor, Jill; Jaroch, Marshall; Martinelli, Maria; Hutinet, Geoffrey; de Crécy-Lagard, Valérie
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10187207/
https://www.ncbi.nlm.nih.gov/pubmed/37205517
http://dx.doi.org/10.1101/2023.05.03.539116
author Reed, Colbie
Denise, Rémi
Hourihan, Jacob
Babor, Jill
Jaroch, Marshall
Martinelli, Maria
Hutinet, Geoffrey
de Crécy-Lagard, Valérie
author_sort Reed, Colbie
collection PubMed
description Capturing the published corpus of information on all members of a given protein family should be an essential step in any study focusing on a specific member of that family. This step is often performed only superficially or partially by experimentalists, as the most common approaches and tools for pursuing this objective are far from optimal. Using a previously gathered dataset of 284 references mentioning a member of the DUF34 (NIF3/Ngg1-interacting Factor 3) family, we evaluated the productivity of different databases and search tools, and devised a workflow that experimentalists can use to capture the most information in less time. To complement this workflow, web-based platforms allowing for the exploration of member distributions for several protein families across sequenced genomes or for the capture of gene neighborhood information were reviewed for their versatility, completeness and ease of use. Recommendations that can be used by experimentalists, as well as educators, are provided and integrated within a customized, publicly accessible Wiki.
format Online
Article
Text
id pubmed-10187207
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-10187207 2023-05-17 Beyond Blast: Enabling Microbiologists to Better Extract Literature, Taxonomic Distributions and Gene Neighborhood Information for Protein Families Reed, Colbie Denise, Rémi Hourihan, Jacob Babor, Jill Jaroch, Marshall Martinelli, Maria Hutinet, Geoffrey de Crécy-Lagard, Valérie bioRxiv Article Capturing the published corpus of information on all members of a given protein family should be an essential step in any study focusing on a specific member of that family. This step is often performed only superficially or partially by experimentalists, as the most common approaches and tools for pursuing this objective are far from optimal. Using a previously gathered dataset of 284 references mentioning a member of the DUF34 (NIF3/Ngg1-interacting Factor 3) family, we evaluated the productivity of different databases and search tools, and devised a workflow that experimentalists can use to capture the most information in less time. To complement this workflow, web-based platforms allowing for the exploration of member distributions for several protein families across sequenced genomes or for the capture of gene neighborhood information were reviewed for their versatility, completeness and ease of use. Recommendations that can be used by experimentalists, as well as educators, are provided and integrated within a customized, publicly accessible Wiki. Cold Spring Harbor Laboratory 2023-05-03 /pmc/articles/PMC10187207/ /pubmed/37205517 http://dx.doi.org/10.1101/2023.05.03.539116 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator.
title Beyond Blast: Enabling Microbiologists to Better Extract Literature, Taxonomic Distributions and Gene Neighborhood Information for Protein Families
topic Article