Automatic identification of optimal marker genes for phenotypic and taxonomic groups of microorganisms

Finding optimal markers for microorganisms important in the medical, agricultural, environmental or ecological fields is of great importance. Thousands of complete microbial genomes now available allow us, for the first time, to exhaustively identify marker proteins for groups of microbial organisms...

Bibliographic Details
Main authors: Segev, Elad; Pasternak, Zohar; Ben Sasson, Tom; Jurkevitch, Edouard; Gonen, Mira
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5931505/
https://www.ncbi.nlm.nih.gov/pubmed/29718935
http://dx.doi.org/10.1371/journal.pone.0195537
_version_ 1783319648267141120
author Segev, Elad
Pasternak, Zohar
Ben Sasson, Tom
Jurkevitch, Edouard
Gonen, Mira
author_facet Segev, Elad
Pasternak, Zohar
Ben Sasson, Tom
Jurkevitch, Edouard
Gonen, Mira
author_sort Segev, Elad
collection PubMed
description Finding optimal markers for microorganisms important in the medical, agricultural, environmental or ecological fields is of great importance. Thousands of complete microbial genomes now available allow us, for the first time, to exhaustively identify marker proteins for groups of microbial organisms. In this work, we model the biological task as the well-known mathematical “hitting set” problem, solving it based on both greedy and randomized approximation algorithms. We identify unique markers for 17 phenotypic and taxonomic microbial groups, including proteins related to the nitrite reductase enzyme as markers for the non-anammox nitrifying bacteria group, and two transcription regulation proteins, nusG and yhiF, as markers for the Archaea and Escherichia/Shigella taxonomic groups, respectively. Additionally, we identify marker proteins for three subtypes of pathogenic E. coli, which previously had no known optimal markers. Practically, depending on the completeness of the database, this algorithm can be used to identify marker genes for any microbial group; these marker genes may be prime candidates for understanding the genetic basis of the group's phenotype, or may help discover novel functions that are uniquely shared among a group of microbes. We show that our method is both theoretically and practically efficient, while establishing an upper bound on its time complexity and approximation ratio; thus, it promises to remain efficient and permit the identification of marker proteins that are specific to phenotypic or taxonomic groups, even as more and more bacterial genomes are being sequenced.
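The abstract above names the "hitting set" problem and a greedy approximation but does not spell either out. The following is a minimal Python sketch of the textbook greedy heuristic for hitting set, not the authors' implementation. The function name `greedy_hitting_set` and the framing in the usage note are assumptions made for illustration only.

```python
from typing import Hashable, Iterable, List, Set


def greedy_hitting_set(sets: Iterable[Set[Hashable]]) -> List[Hashable]:
    """Return elements that together intersect every input set.

    Textbook greedy heuristic (hypothetical helper, not taken from the paper):
    repeatedly pick the element that occurs in the most sets not yet hit.
    """
    remaining = [set(s) for s in sets]
    if any(not s for s in remaining):
        raise ValueError("an empty set can never be hit")

    chosen: List[Hashable] = []
    while remaining:
        # Count how many still-unhit sets each element appears in.
        counts = {}
        for s in remaining:
            for element in s:
                counts[element] = counts.get(element, 0) + 1
        # Take the most frequent element; sorting first makes ties deterministic.
        best = max(sorted(counts, key=repr), key=counts.get)
        chosen.append(best)
        # Drop every set that the chosen element now hits.
        remaining = [s for s in remaining if best not in s]
    return chosen
```

One possible way to map the marker task onto this sketch (an assumption, not a description of the paper's model) is to build one set per organism outside the group, containing the candidate proteins that organism lacks. A hitting set then selects proteins such that every outside organism is missing at least one of them. For example, `greedy_hitting_set([{"p1", "p2"}, {"p2", "p3"}, {"p3"}])` selects `p2` first, because it occurs in two sets, and then `p3` for the remaining set. The greedy rule is the same one used for set cover, so its result is within a logarithmic factor of the optimum size.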
format Online
Article
Text
id pubmed-5931505
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5931505 2018-05-11 Automatic identification of optimal marker genes for phenotypic and taxonomic groups of microorganisms Segev, Elad Pasternak, Zohar Ben Sasson, Tom Jurkevitch, Edouard Gonen, Mira PLoS One Research Article Finding optimal markers for microorganisms important in the medical, agricultural, environmental or ecological fields is of great importance. Thousands of complete microbial genomes now available allow us, for the first time, to exhaustively identify marker proteins for groups of microbial organisms. In this work, we model the biological task as the well-known mathematical “hitting set” problem, solving it based on both greedy and randomized approximation algorithms. We identify unique markers for 17 phenotypic and taxonomic microbial groups, including proteins related to the nitrite reductase enzyme as markers for the non-anammox nitrifying bacteria group, and two transcription regulation proteins, nusG and yhiF, as markers for the Archaea and Escherichia/Shigella taxonomic groups, respectively. Additionally, we identify marker proteins for three subtypes of pathogenic E. coli, which previously had no known optimal markers. Practically, depending on the completeness of the database, this algorithm can be used to identify marker genes for any microbial group; these marker genes may be prime candidates for understanding the genetic basis of the group's phenotype, or may help discover novel functions that are uniquely shared among a group of microbes. We show that our method is both theoretically and practically efficient, while establishing an upper bound on its time complexity and approximation ratio; thus, it promises to remain efficient and permit the identification of marker proteins that are specific to phenotypic or taxonomic groups, even as more and more bacterial genomes are being sequenced.
Public Library of Science 2018-05-02 /pmc/articles/PMC5931505/ /pubmed/29718935 http://dx.doi.org/10.1371/journal.pone.0195537 Text en © 2018 Segev et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Segev, Elad
Pasternak, Zohar
Ben Sasson, Tom
Jurkevitch, Edouard
Gonen, Mira
Automatic identification of optimal marker genes for phenotypic and taxonomic groups of microorganisms
title Automatic identification of optimal marker genes for phenotypic and taxonomic groups of microorganisms
title_full Automatic identification of optimal marker genes for phenotypic and taxonomic groups of microorganisms
title_fullStr Automatic identification of optimal marker genes for phenotypic and taxonomic groups of microorganisms
title_full_unstemmed Automatic identification of optimal marker genes for phenotypic and taxonomic groups of microorganisms
title_short Automatic identification of optimal marker genes for phenotypic and taxonomic groups of microorganisms
title_sort automatic identification of optimal marker genes for phenotypic and taxonomic groups of microorganisms
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5931505/
https://www.ncbi.nlm.nih.gov/pubmed/29718935
http://dx.doi.org/10.1371/journal.pone.0195537
work_keys_str_mv AT segevelad automaticidentificationofoptimalmarkergenesforphenotypicandtaxonomicgroupsofmicroorganisms
AT pasternakzohar automaticidentificationofoptimalmarkergenesforphenotypicandtaxonomicgroupsofmicroorganisms
AT bensassontom automaticidentificationofoptimalmarkergenesforphenotypicandtaxonomicgroupsofmicroorganisms
AT jurkevitchedouard automaticidentificationofoptimalmarkergenesforphenotypicandtaxonomicgroupsofmicroorganisms
AT gonenmira automaticidentificationofoptimalmarkergenesforphenotypicandtaxonomicgroupsofmicroorganisms