
A Meta-Analysis Based Method for Prioritizing Candidate Genes Involved in a Pre-specific Function

The identification of genes associated with a given biological function in plants remains a challenge, although network-based gene prioritization algorithms have been developed for Arabidopsis thaliana and many non-model plant species. Nevertheless, these network-based gene prioritization algorithms...


Bibliographic Details
Main Authors: Zhai, Jingjing, Tang, Yunjia, Yuan, Hao, Wang, Longteng, Shang, Haoli, Ma, Chuang
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5156684/
https://www.ncbi.nlm.nih.gov/pubmed/28018423
http://dx.doi.org/10.3389/fpls.2016.01914
author Zhai, Jingjing
Tang, Yunjia
Yuan, Hao
Wang, Longteng
Shang, Haoli
Ma, Chuang
collection PubMed
description The identification of genes associated with a given biological function in plants remains a challenge, although network-based gene prioritization algorithms have been developed for Arabidopsis thaliana and many non-model plant species. Nevertheless, these network-based gene prioritization algorithms have encountered several problems; one in particular is unsatisfactory prediction accuracy due to limited network coverage, varying link quality, and/or uncertain network connectivity. Thus, a model that integrates complementary biological data may be expected to increase the prediction accuracy of gene prioritization. Toward this goal, we developed a novel gene prioritization method named RafSee to rank candidate genes using a random forest algorithm that integrates sequence, evolutionary, and epigenetic features of plants. Subsequently, we proposed an integrative approach named RAP (Rank Aggregation-based data fusion for gene Prioritization), in which an order statistics-based meta-analysis was used to aggregate the ranks from the network-based gene prioritization method and RafSee, for accurately prioritizing candidate genes involved in a pre-specific biological function. Finally, we showcased the utility of RAP by prioritizing 380 flowering-time genes in Arabidopsis. The “leave-one-out” cross-validation experiment showed that RafSee could work as a complement to a current state-of-the-art network-based gene prioritization system (AraNet v2). Moreover, RAP ranked 53.68% (204/380) of the flowering-time genes higher than AraNet v2, resulting in a 39.46% improvement in terms of the first quartile rank. Further evaluations also showed that RAP was effective in prioritizing genes related to different abiotic stresses. To enhance the usability of RAP for Arabidopsis and non-model plant species, an R package implementing the method is freely available at http://bioinfo.nwafu.edu.cn/software.
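The rank-aggregation step described in the abstract can be illustrated with a small sketch. The record does not give the exact formula used by RAP, so the code below assumes one common order-statistics formulation (the recursive Q statistic over rank ratios); it is written in Python rather than the R of the published package, and the gene identifiers, the two example rankings, and the function names `q_statistic` and `aggregate` are all hypothetical.

```python
from math import factorial


def q_statistic(rank_ratios):
    """Order-statistics Q value for one gene (assumed formulation).

    rank_ratios: one value in (0, 1] per method, i.e. rank / list length.
    Smaller Q means the gene is ranked consistently near the top.
    """
    r = sorted(rank_ratios)          # r[0] <= r[1] <= ... <= r[n-1]
    n = len(r)
    v = [1.0]                        # V_0 = 1
    for k in range(1, n + 1):
        total = 0.0
        for i in range(1, k + 1):
            # V_k = sum_i (-1)^(i-1) * V_(k-i) / i! * r_(n-k+1)^i
            total += (-1) ** (i - 1) * v[k - i] / factorial(i) * r[n - k] ** i
        v.append(total)
    return factorial(n) * v[n]


def aggregate(rankings):
    """Fuse several ranked gene lists (best first) into one ranking.

    rankings: dict mapping a method name to its ordered list of gene IDs.
    Only genes ranked by every method are scored.
    """
    shared = set.intersection(*(set(order) for order in rankings.values()))
    scores = {}
    for gene in shared:
        ratios = [(order.index(gene) + 1) / len(order)
                  for order in rankings.values()]
        scores[gene] = q_statistic(ratios)
    # Ascending Q; ties broken by gene ID so the output is deterministic.
    return sorted(scores, key=lambda g: (scores[g], g))


if __name__ == "__main__":
    # Hypothetical rankings from a network-based method and a
    # feature-based classifier; not data from the article.
    example = {
        "network": ["G1", "G2", "G3", "G4"],
        "features": ["G3", "G1", "G2", "G4"],
    }
    print(aggregate(example))
```

For two methods, this Q value reduces to 2·r1·r2 − r1², where r1 ≤ r2 are the gene's two rank ratios, so a gene placed near the top by both methods gets a small score and moves up the fused list.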
format Online
Article
Text
id pubmed-5156684
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-5156684 2016-12-23 A Meta-Analysis Based Method for Prioritizing Candidate Genes Involved in a Pre-specific Function Zhai, Jingjing Tang, Yunjia Yuan, Hao Wang, Longteng Shang, Haoli Ma, Chuang Front Plant Sci Plant Science Frontiers Media S.A. 2016-12-15 /pmc/articles/PMC5156684/ /pubmed/28018423 http://dx.doi.org/10.3389/fpls.2016.01914 Text en Copyright © 2016 Zhai, Tang, Yuan, Wang, Shang and Ma. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title A Meta-Analysis Based Method for Prioritizing Candidate Genes Involved in a Pre-specific Function
topic Plant Science
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5156684/
https://www.ncbi.nlm.nih.gov/pubmed/28018423
http://dx.doi.org/10.3389/fpls.2016.01914