RGAugury: a pipeline for genome-wide prediction of resistance gene analogs (RGAs) in plants

BACKGROUND: Resistance gene analogs (RGAs), such as NBS-encoding proteins, receptor-like protein kinases (RLKs) and receptor-like proteins (RLPs), are potential R-genes that contain specific conserved domains and motifs. Thus, RGAs can be predicted based on their conserved structural features using...

Bibliographic Details
Main Authors:	Li, Pingchuan; Quan, Xiande; Jia, Gaofeng; Xiao, Jin; Cloutier, Sylvie; You, Frank M.
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2016
Subjects:	Software
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5093994/ https://www.ncbi.nlm.nih.gov/pubmed/27806688 http://dx.doi.org/10.1186/s12864-016-3197-x

author	Li, Pingchuan Quan, Xiande Jia, Gaofeng Xiao, Jin Cloutier, Sylvie You, Frank M.
collection	PubMed
description	BACKGROUND: Resistance gene analogs (RGAs), such as NBS-encoding proteins, receptor-like protein kinases (RLKs) and receptor-like proteins (RLPs), are potential R-genes that contain specific conserved domains and motifs. Thus, RGAs can be predicted based on their conserved structural features using bioinformatics tools. Computer programs have been developed for the identification of individual domains and motifs from the protein sequences of RGAs, but none offers a systematic assessment of the different types of RGAs. A user-friendly and efficient pipeline is needed for large-scale genome-wide RGA predictions for the growing number of sequenced plant genomes. RESULTS: An integrative pipeline, named RGAugury, was developed to automate RGA prediction. The pipeline first identifies RGA-related protein domains and motifs, namely nucleotide binding site (NB-ARC), leucine-rich repeat (LRR), transmembrane (TM), serine/threonine and tyrosine kinase (STTK), lysin motif (LysM), coiled-coil (CC) and Toll/Interleukin-1 receptor (TIR). RGA candidates are then identified and classified into four major families based on the combinations of these domains and motifs that they contain: NBS-encoding, TM-CC, and membrane-associated RLP and RLK. All time-consuming analyses in the pipeline are parallelized to improve performance. The pipeline was evaluated using the well-annotated Arabidopsis genome: 98.5%, 85.2%, and 100% of the reported NBS-encoding genes, membrane-associated RLPs, and RLKs were validated, respectively. The pipeline was also successfully applied to predict RGAs for 50 sequenced plant genomes. A user-friendly web interface was implemented to ease command-line operations, facilitate visualization and simplify result management for multiple datasets. CONCLUSIONS: RGAugury is an efficient, integrative bioinformatics tool for large-scale genome-wide identification of RGAs. It is freely available at Bitbucket: https://bitbucket.org/yaanlpc/rgaugury.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-016-3197-x) contains supplementary material, which is available to authorized users.
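The classification step described in the abstract, assigning a family from the combination of detected domains, can be illustrated with a small sketch. The decision rules below are assumptions made for illustration only; the record does not state RGAugury's actual rules, and the function name `classify_rga` is hypothetical and not part of the tool.

```python
from typing import Optional, Set


def classify_rga(domains: Set[str]) -> Optional[str]:
    """Return an RGA family label for one protein, or None if no rule matches.

    `domains` holds the domain/motif abbreviations used in the abstract:
    NB-ARC, LRR, TM, STTK, LysM, CC, TIR.
    NOTE: these rules are illustrative assumptions, not RGAugury's own logic.
    """
    # Assumed: an extracellular LRR or LysM domain marks a receptor-like protein.
    has_ectodomain = bool(domains & {"LRR", "LysM"})

    if "NB-ARC" in domains:
        return "NBS-encoding"
    if "TM" in domains and "STTK" in domains and has_ectodomain:
        return "RLK"  # receptor-like kinase: membrane-bound, kinase present
    if "TM" in domains and has_ectodomain:
        return "RLP"  # receptor-like protein: membrane-bound, no kinase
    if "TM" in domains and "CC" in domains:
        return "TM-CC"
    return None


if __name__ == "__main__":
    # Hypothetical domain sets, not real annotation results.
    for doms in ({"TIR", "NB-ARC", "LRR"}, {"TM", "LRR", "STTK"}, {"TM", "CC"}):
        print(sorted(doms), "->", classify_rga(doms))
```

The rules are checked in a fixed order so that each protein receives at most one label; a real pipeline would obtain the domain sets from dedicated domain and motif predictors rather than from hand-written input.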
format	Online Article Text
id	pubmed-5093994
institution	National Center for Biotechnology Information
language	English
publishDate	2016
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-5093994 2016-11-07 RGAugury: a pipeline for genome-wide prediction of resistance gene analogs (RGAs) in plants Li, Pingchuan Quan, Xiande Jia, Gaofeng Xiao, Jin Cloutier, Sylvie You, Frank M. BMC Genomics Software BioMed Central 2016-11-02 /pmc/articles/PMC5093994/ /pubmed/27806688 http://dx.doi.org/10.1186/s12864-016-3197-x Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	RGAugury: a pipeline for genome-wide prediction of resistance gene analogs (RGAs) in plants
topic	Software
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5093994/ https://www.ncbi.nlm.nih.gov/pubmed/27806688 http://dx.doi.org/10.1186/s12864-016-3197-x
