Automatic identification of optimal marker genes for phenotypic and taxonomic groups of microorganisms
Finding optimal markers for microorganisms important in the medical, agricultural, environmental or ecological fields is of great importance. Thousands of complete microbial genomes now available allow us, for the first time, to exhaustively identify marker proteins for groups of microbial organisms...
Main Authors: | Segev, Elad; Pasternak, Zohar; Ben Sasson, Tom; Jurkevitch, Edouard; Gonen, Mira |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2018 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5931505/ https://www.ncbi.nlm.nih.gov/pubmed/29718935 http://dx.doi.org/10.1371/journal.pone.0195537 |
_version_ | 1783319648267141120 |
---|---|
author | Segev, Elad Pasternak, Zohar Ben Sasson, Tom Jurkevitch, Edouard Gonen, Mira |
author_facet | Segev, Elad Pasternak, Zohar Ben Sasson, Tom Jurkevitch, Edouard Gonen, Mira |
author_sort | Segev, Elad |
collection | PubMed |
description | Finding optimal markers for microorganisms important in the medical, agricultural, environmental or ecological fields is of great importance. Thousands of complete microbial genomes now available allow us, for the first time, to exhaustively identify marker proteins for groups of microbial organisms. In this work, we model the biological task as the well-known mathematical “hitting set” problem, solving it based on both greedy and randomized approximation algorithms. We identify unique markers for 17 phenotypic and taxonomic microbial groups, including proteins related to the nitrite reductase enzyme as markers for the non-anammox nitrifying bacteria group, and two transcription regulation proteins, nusG and yhiF, as markers for the Archaea and Escherichia/Shigella taxonomic groups, respectively. Additionally, we identify marker proteins for three subtypes of pathogenic E. coli, which previously had no known optimal markers. Practically, depending on the completeness of the database, this algorithm can be used to identify marker genes for any microbial group. These marker genes may be prime candidates for understanding the genetic basis of the group's phenotype or for helping to discover novel functions that are uniquely shared among a group of microbes. We show that our method is both theoretically and practically efficient, while establishing an upper bound on its time complexity and approximation ratio; thus, it promises to remain efficient and permit the identification of marker proteins that are specific to phenotypic or taxonomic groups, even as more and more bacterial genomes are being sequenced. |
format | Online Article Text |
id | pubmed-5931505 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-5931505 2018-05-11 Automatic identification of optimal marker genes for phenotypic and taxonomic groups of microorganisms Segev, Elad Pasternak, Zohar Ben Sasson, Tom Jurkevitch, Edouard Gonen, Mira PLoS One Research Article Finding optimal markers for microorganisms important in the medical, agricultural, environmental or ecological fields is of great importance. Thousands of complete microbial genomes now available allow us, for the first time, to exhaustively identify marker proteins for groups of microbial organisms. In this work, we model the biological task as the well-known mathematical “hitting set” problem, solving it based on both greedy and randomized approximation algorithms. We identify unique markers for 17 phenotypic and taxonomic microbial groups, including proteins related to the nitrite reductase enzyme as markers for the non-anammox nitrifying bacteria group, and two transcription regulation proteins, nusG and yhiF, as markers for the Archaea and Escherichia/Shigella taxonomic groups, respectively. Additionally, we identify marker proteins for three subtypes of pathogenic E. coli, which previously had no known optimal markers. Practically, depending on the completeness of the database, this algorithm can be used to identify marker genes for any microbial group. These marker genes may be prime candidates for understanding the genetic basis of the group's phenotype or for helping to discover novel functions that are uniquely shared among a group of microbes. We show that our method is both theoretically and practically efficient, while establishing an upper bound on its time complexity and approximation ratio; thus, it promises to remain efficient and permit the identification of marker proteins that are specific to phenotypic or taxonomic groups, even as more and more bacterial genomes are being sequenced. 
Public Library of Science 2018-05-02 /pmc/articles/PMC5931505/ /pubmed/29718935 http://dx.doi.org/10.1371/journal.pone.0195537 Text en © 2018 Segev et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Segev, Elad Pasternak, Zohar Ben Sasson, Tom Jurkevitch, Edouard Gonen, Mira Automatic identification of optimal marker genes for phenotypic and taxonomic groups of microorganisms |
title | Automatic identification of optimal marker genes for phenotypic and taxonomic groups of microorganisms |
title_full | Automatic identification of optimal marker genes for phenotypic and taxonomic groups of microorganisms |
title_fullStr | Automatic identification of optimal marker genes for phenotypic and taxonomic groups of microorganisms |
title_full_unstemmed | Automatic identification of optimal marker genes for phenotypic and taxonomic groups of microorganisms |
title_short | Automatic identification of optimal marker genes for phenotypic and taxonomic groups of microorganisms |
title_sort | automatic identification of optimal marker genes for phenotypic and taxonomic groups of microorganisms |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5931505/ https://www.ncbi.nlm.nih.gov/pubmed/29718935 http://dx.doi.org/10.1371/journal.pone.0195537 |
work_keys_str_mv | AT segevelad automaticidentificationofoptimalmarkergenesforphenotypicandtaxonomicgroupsofmicroorganisms AT pasternakzohar automaticidentificationofoptimalmarkergenesforphenotypicandtaxonomicgroupsofmicroorganisms AT bensassontom automaticidentificationofoptimalmarkergenesforphenotypicandtaxonomicgroupsofmicroorganisms AT jurkevitchedouard automaticidentificationofoptimalmarkergenesforphenotypicandtaxonomicgroupsofmicroorganisms AT gonenmira automaticidentificationofoptimalmarkergenesforphenotypicandtaxonomicgroupsofmicroorganisms |
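
The abstract in the description field frames marker discovery as a hitting-set problem solved with greedy and randomized approximation algorithms. As a rough illustration only, the following Python sketch applies a generic greedy hitting-set heuristic to toy data. It is not the authors' implementation: the function names, the toy genomes, and the assumption that candidate markers are proteins absent from every genome outside the group are all hypothetical choices made for this example.

```python
from typing import Dict, List, Set


def greedy_hitting_set(sets: List[Set[str]]) -> List[str]:
    """Greedy heuristic: repeatedly pick the element that hits the most
    sets that have not been hit yet."""
    if any(not s for s in sets):
        raise ValueError("an empty set can never be hit")
    remaining = list(range(len(sets)))
    chosen: List[str] = []
    while remaining:
        # Count how many still-unhit sets contain each element.
        counts: Dict[str, int] = {}
        for i in remaining:
            for element in sets[i]:
                counts[element] = counts.get(element, 0) + 1
        # Highest count wins; ties are broken alphabetically so the
        # result is deterministic.
        best = min(counts, key=lambda e: (-counts[e], e))
        chosen.append(best)
        remaining = [i for i in remaining if best not in sets[i]]
    return chosen


def marker_proteins(genomes: Dict[str, Set[str]], group: Set[str]) -> List[str]:
    """Hypothetical formulation (an assumption, not taken from the record):
    keep only proteins that never occur outside the group, then find a
    small set of them that hits every genome inside the group."""
    outside: Set[str] = set().union(
        *(proteins for name, proteins in genomes.items() if name not in group)
    )
    candidates = [genomes[name] - outside for name in sorted(group)]
    return greedy_hitting_set(candidates)


# Toy data: protein identifiers and genome names are invented.
toy_genomes = {
    "in_A": {"p1", "p2", "p5"},
    "in_B": {"p1", "p3"},
    "in_C": {"p3", "p4", "p5"},
    "out_X": {"p5", "p6"},
}
toy_group = {"in_A", "in_B", "in_C"}

print(marker_proteins(toy_genomes, toy_group))  # ['p1', 'p3']
```

In the toy data, `p5` is discarded because it also occurs in the out-group genome, and no single remaining protein covers all three in-group genomes, so the heuristic returns two markers. A greedy choice of this kind is a standard approximation for hitting set and set cover; the record itself does not state which bound the authors establish for their method.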