A Meta-Analysis Based Method for Prioritizing Candidate Genes Involved in a Pre-specific Function
The identification of genes associated with a given biological function in plants remains a challenge, although network-based gene prioritization algorithms have been developed for Arabidopsis thaliana and many non-model plant species. Nevertheless, these network-based gene prioritization algorithms...
Main Authors: | Zhai, Jingjing; Tang, Yunjia; Yuan, Hao; Wang, Longteng; Shang, Haoli; Ma, Chuang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2016 |
Subjects: | Plant Science |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5156684/ https://www.ncbi.nlm.nih.gov/pubmed/28018423 http://dx.doi.org/10.3389/fpls.2016.01914 |
_version_ | 1782481301126774784 |
---|---|
author | Zhai, Jingjing Tang, Yunjia Yuan, Hao Wang, Longteng Shang, Haoli Ma, Chuang |
author_facet | Zhai, Jingjing Tang, Yunjia Yuan, Hao Wang, Longteng Shang, Haoli Ma, Chuang |
author_sort | Zhai, Jingjing |
collection | PubMed |
description | The identification of genes associated with a given biological function in plants remains a challenge, although network-based gene prioritization algorithms have been developed for Arabidopsis thaliana and many non-model plant species. Nevertheless, these network-based gene prioritization algorithms have encountered several problems; one in particular is unsatisfactory prediction accuracy due to limited network coverage, varying link quality, and/or uncertain network connectivity. Thus, a model that integrates complementary biological data may be expected to increase the prediction accuracy of gene prioritization. Toward this goal, we developed a novel gene prioritization method named RafSee to rank candidate genes using a random forest algorithm that integrates sequence, evolutionary, and epigenetic features of plants. Subsequently, we proposed an integrative approach named RAP (Rank Aggregation-based data fusion for gene Prioritization), in which an order statistics-based meta-analysis was used to aggregate the ranks of the network-based gene prioritization method and RafSee, for accurately prioritizing candidate genes involved in a pre-specific biological function. Finally, we showcased the utility of RAP by prioritizing 380 flowering-time genes in Arabidopsis. The “leave-one-out” cross-validation experiment showed that RafSee could work as a complement to a current state-of-the-art network-based gene prioritization system (AraNet v2). Moreover, RAP ranked 53.68% (204/380) of the flowering-time genes higher than AraNet v2, resulting in a 39.46% improvement in terms of the first quartile rank. Further evaluations also showed that RAP was effective in prioritizing genes related to different abiotic stresses. To enhance the usability of RAP for Arabidopsis and non-model plant species, an R package implementing the method is freely available at http://bioinfo.nwafu.edu.cn/software. |
format | Online Article Text |
id | pubmed-5156684 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-51566842016-12-23 A Meta-Analysis Based Method for Prioritizing Candidate Genes Involved in a Pre-specific Function Zhai, Jingjing Tang, Yunjia Yuan, Hao Wang, Longteng Shang, Haoli Ma, Chuang Front Plant Sci Plant Science The identification of genes associated with a given biological function in plants remains a challenge, although network-based gene prioritization algorithms have been developed for Arabidopsis thaliana and many non-model plant species. Nevertheless, these network-based gene prioritization algorithms have encountered several problems; one in particular is unsatisfactory prediction accuracy due to limited network coverage, varying link quality, and/or uncertain network connectivity. Thus, a model that integrates complementary biological data may be expected to increase the prediction accuracy of gene prioritization. Toward this goal, we developed a novel gene prioritization method named RafSee to rank candidate genes using a random forest algorithm that integrates sequence, evolutionary, and epigenetic features of plants. Subsequently, we proposed an integrative approach named RAP (Rank Aggregation-based data fusion for gene Prioritization), in which an order statistics-based meta-analysis was used to aggregate the ranks of the network-based gene prioritization method and RafSee, for accurately prioritizing candidate genes involved in a pre-specific biological function. Finally, we showcased the utility of RAP by prioritizing 380 flowering-time genes in Arabidopsis. The “leave-one-out” cross-validation experiment showed that RafSee could work as a complement to a current state-of-the-art network-based gene prioritization system (AraNet v2). Moreover, RAP ranked 53.68% (204/380) of the flowering-time genes higher than AraNet v2, resulting in a 39.46% improvement in terms of the first quartile rank. Further evaluations also showed that RAP was effective in prioritizing genes related to different abiotic stresses. To enhance the usability of RAP for Arabidopsis and non-model plant species, an R package implementing the method is freely available at http://bioinfo.nwafu.edu.cn/software. Frontiers Media S.A. 2016-12-15 /pmc/articles/PMC5156684/ /pubmed/28018423 http://dx.doi.org/10.3389/fpls.2016.01914 Text en Copyright © 2016 Zhai, Tang, Yuan, Wang, Shang and Ma. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Plant Science Zhai, Jingjing Tang, Yunjia Yuan, Hao Wang, Longteng Shang, Haoli Ma, Chuang A Meta-Analysis Based Method for Prioritizing Candidate Genes Involved in a Pre-specific Function |
title | A Meta-Analysis Based Method for Prioritizing Candidate Genes Involved in a Pre-specific Function |
title_full | A Meta-Analysis Based Method for Prioritizing Candidate Genes Involved in a Pre-specific Function |
title_fullStr | A Meta-Analysis Based Method for Prioritizing Candidate Genes Involved in a Pre-specific Function |
title_full_unstemmed | A Meta-Analysis Based Method for Prioritizing Candidate Genes Involved in a Pre-specific Function |
title_short | A Meta-Analysis Based Method for Prioritizing Candidate Genes Involved in a Pre-specific Function |
title_sort | meta-analysis based method for prioritizing candidate genes involved in a pre-specific function |
topic | Plant Science |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5156684/ https://www.ncbi.nlm.nih.gov/pubmed/28018423 http://dx.doi.org/10.3389/fpls.2016.01914 |
work_keys_str_mv | AT zhaijingjing ametaanalysisbasedmethodforprioritizingcandidategenesinvolvedinaprespecificfunction AT tangyunjia ametaanalysisbasedmethodforprioritizingcandidategenesinvolvedinaprespecificfunction AT yuanhao ametaanalysisbasedmethodforprioritizingcandidategenesinvolvedinaprespecificfunction AT wanglongteng ametaanalysisbasedmethodforprioritizingcandidategenesinvolvedinaprespecificfunction AT shanghaoli ametaanalysisbasedmethodforprioritizingcandidategenesinvolvedinaprespecificfunction AT machuang ametaanalysisbasedmethodforprioritizingcandidategenesinvolvedinaprespecificfunction AT zhaijingjing metaanalysisbasedmethodforprioritizingcandidategenesinvolvedinaprespecificfunction AT tangyunjia metaanalysisbasedmethodforprioritizingcandidategenesinvolvedinaprespecificfunction AT yuanhao metaanalysisbasedmethodforprioritizingcandidategenesinvolvedinaprespecificfunction AT wanglongteng metaanalysisbasedmethodforprioritizingcandidategenesinvolvedinaprespecificfunction AT shanghaoli metaanalysisbasedmethodforprioritizingcandidategenesinvolvedinaprespecificfunction AT machuang metaanalysisbasedmethodforprioritizingcandidategenesinvolvedinaprespecificfunction |
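
The description above says that RAP aggregates the ranks from a network-based method and from RafSee with an order statistics-based meta-analysis, but this record does not give the formula. The following is a minimal Python sketch of one common order-statistic approach, offered as an assumption and not as the authors' implementation: each gene's ranks are converted to rank ratios, and the gene is scored by the probability that N independent uniform draws would all fall at or below the observed sorted ratios. The function names, gene labels, and example ranks are hypothetical.

```python
from math import factorial


def q_statistic(rank_ratios):
    """Joint order-statistic probability for a list of rank ratios in (0, 1].

    Returns P(U_(1) <= r_(1), ..., U_(N) <= r_(N)) for N independent
    Uniform(0, 1) draws, where r_(1) <= ... <= r_(N) are the sorted inputs.
    Smaller values mean the gene ranks consistently near the top.
    Assumption: this is a generic formulation, not taken from the record.
    """
    r = sorted(rank_ratios)
    n = len(r)
    v = [1.0]  # V_0 = 1
    for k in range(1, n + 1):
        x = r[n - k]  # r_(N-k+1) in 1-based notation
        v_k = sum(
            (-1) ** (i - 1) * v[k - i] / factorial(i) * x ** i
            for i in range(1, k + 1)
        )
        v.append(v_k)
    return factorial(n) * v[n]


def aggregate(rankings):
    """Combine several {gene: rank} dicts (rank 1 = best) into one ordering.

    Only genes present in every ranking are scored; ties are broken by name.
    """
    genes = set.intersection(*(set(r) for r in rankings))
    scores = {
        g: q_statistic([r[g] / len(r) for r in rankings]) for g in genes
    }
    return sorted(scores, key=lambda g: (scores[g], g))


# Hypothetical ranks from two methods for four placeholder genes.
network_rank = {"g1": 1, "g2": 2, "g3": 3, "g4": 4}
feature_rank = {"g1": 3, "g2": 1, "g3": 4, "g4": 2}
combined = aggregate([network_rank, feature_rank])
print(combined)
```

For two sources the statistic reduces to 2·r₁·r₂ − r₁², with r₁ ≤ r₂ the sorted rank ratios. A gene therefore needs a good rank in both lists to score well, which is the sense in which a second ranking can complement the first.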