MultiMSOAR 2.0: An Accurate Tool to Identify Ortholog Groups among Multiple Genomes

The identification of orthologous genes shared by multiple genomes plays an important role in evolutionary studies and gene functional analyses. Based on a recently developed accurate tool, called MSOAR 2.0, for ortholog assignment between a pair of closely related genomes based on genome rearrangem...

Bibliographic Details
Main Authors: Shi, Guanqun; Peng, Meng-Chih; Jiang, Tao
Format: Online Article Text
Language: English
Published: Public Library of Science 2011
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3119667/
https://www.ncbi.nlm.nih.gov/pubmed/21712981
http://dx.doi.org/10.1371/journal.pone.0020892
author Shi, Guanqun
Peng, Meng-Chih
Jiang, Tao
collection PubMed
description The identification of orthologous genes shared by multiple genomes plays an important role in evolutionary studies and gene functional analyses. Based on a recently developed accurate tool, called MSOAR 2.0, for ortholog assignment between a pair of closely related genomes based on genome rearrangement, we present a new system, MultiMSOAR 2.0, to identify ortholog groups among multiple genomes. In the system, we construct gene families for all the genomes using sequence similarity search and clustering, run MSOAR 2.0 on all pairs of genomes to obtain the pairwise orthology relationships, and partition each gene family into disjoint sets of orthologous genes (called super ortholog groups, or SOGs) such that each SOG contains at most one gene from each genome. For each such SOG, we label the leaves of the species tree with 1 or 0 to indicate whether or not the SOG contains a gene from the corresponding species. The resulting tree is called a tree of ortholog groups (or TOG). We then label the internal nodes of each TOG based on the parsimony principle and some biological constraints. Ortholog groups are finally identified from each fully labeled TOG. In comparison with the popular tool MultiParanoid on simulated data, MultiMSOAR 2.0 shows significantly higher prediction accuracy. It also outperforms MultiParanoid, the Roundup multi-ortholog repository and the Ensembl ortholog database in real data experiments that use gene symbols for validation. In addition to ortholog group identification, MultiMSOAR 2.0 also provides information about gene births, duplications and losses in evolution, which may be of independent biological interest. Our experiments on simulated data demonstrate that MultiMSOAR 2.0 is able to infer these evolutionary events much more accurately than the well-known software tool Notung. The software MultiMSOAR 2.0 is available to the public for free.
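The leaf-labeling and parsimony steps described in the abstract can be illustrated with a small sketch. It is not the MultiMSOAR 2.0 implementation: it substitutes the textbook Fitch small-parsimony pass for the paper's labeling procedure and ignores the "biological constraints" the abstract mentions, which are not specified in this record. The species names, gene identifiers, tree topology, and the function names `label_leaves` and `fitch_sets` are all hypothetical.

```python
from typing import Dict, Set, Tuple, Union

# Toy species tree as nested pairs: a leaf is a species name,
# an internal node is a (left, right) tuple. Hypothetical topology.
Tree = Union[str, Tuple["Tree", "Tree"]]


def label_leaves(sog: Dict[str, str], species: Set[str]) -> Dict[str, int]:
    """Label each species 1 if the SOG holds a gene from it, else 0."""
    return {s: int(s in sog) for s in species}


def fitch_sets(tree: Tree, leaf_labels: Dict[str, int]) -> Tuple[Set[int], int]:
    """Bottom-up Fitch pass (an assumption, not the paper's exact method).

    Returns the candidate state set for the root of `tree` and the
    minimum number of 0/1 changes needed in the subtree.
    """
    if isinstance(tree, str):
        return {leaf_labels[tree]}, 0
    left_set, left_cost = fitch_sets(tree[0], leaf_labels)
    right_set, right_cost = fitch_sets(tree[1], leaf_labels)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # Children disagree: keep both options and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical SOG with at most one gene per genome; "rat" has no member.
species_tree: Tree = (("human", "chimp"), ("mouse", "rat"))
sog = {"human": "gene_h1", "chimp": "gene_c1", "mouse": "gene_m1"}

labels = label_leaves(sog, {"human", "chimp", "mouse", "rat"})
root_states, changes = fitch_sets(species_tree, labels)
```

In this toy case the single change is most simply read as a loss on the rat lineage, but turning labeled trees into births, duplications and losses is the part of the method that the abstract leaves to the paper itself.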
format Online
Article
Text
id pubmed-3119667
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3119667 2011-06-27 MultiMSOAR 2.0: An Accurate Tool to Identify Ortholog Groups among Multiple Genomes Shi, Guanqun Peng, Meng-Chih Jiang, Tao PLoS One Research Article Public Library of Science 2011-06-21 /pmc/articles/PMC3119667/ /pubmed/21712981 http://dx.doi.org/10.1371/journal.pone.0020892 Text en Shi et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title MultiMSOAR 2.0: An Accurate Tool to Identify Ortholog Groups among Multiple Genomes
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3119667/
https://www.ncbi.nlm.nih.gov/pubmed/21712981
http://dx.doi.org/10.1371/journal.pone.0020892