PredCRG: A computational method for recognition of plant circadian genes by employing support vector machine with Laplace kernel
BACKGROUND: Circadian rhythms regulate several physiological and developmental processes of plants. Hence, the identification of genes with the underlying circadian rhythmic features is pivotal. Though computational methods have been developed for the identification of circadian genes, all these met...
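Main Authors: | Meher, Prabina Kumar; Mohapatra, Ansuman; Satpathy, Subhrajit; Sharma, Anuj; Saini, Isha; Pradhan, Sukanta Kumar; Rai, Anil |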
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | Methodology |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8074503/ https://www.ncbi.nlm.nih.gov/pubmed/33902670 http://dx.doi.org/10.1186/s13007-021-00744-3 |
_version_ | 1783684366451343360 |
---|---|
author | Meher, Prabina Kumar; Mohapatra, Ansuman; Satpathy, Subhrajit; Sharma, Anuj; Saini, Isha; Pradhan, Sukanta Kumar; Rai, Anil |
author_sort | Meher, Prabina Kumar |
collection | PubMed |
description | BACKGROUND: Circadian rhythms regulate several physiological and developmental processes of plants. Hence, identifying genes with underlying circadian rhythmic features is pivotal. Although computational methods have been developed for identifying circadian genes, all of them are based on gene expression datasets. We could not find any sequence-based model, which motivated us to develop the present computational method to identify the proteins encoded by circadian genes. RESULTS: A support vector machine (SVM) with seven kernels (linear, polynomial, radial, sigmoid, hyperbolic, Bessel and Laplace) was used for prediction with compositional, transitional and physico-chemical features. The highest accuracy, 62.48%, was achieved with the Laplace kernel under fivefold cross-validation. The developed model further achieved 62.96% accuracy on an independent dataset. The SVM also outperformed other state-of-the-art machine learning algorithms, i.e., Random Forest, Bagging, AdaBoost, XGBoost and LASSO. We also performed proteome-wide identification of circadian proteins in two cereal crops, namely Oryza sativa and Sorghum bicolor, followed by functional annotation of the predicted circadian proteins with Gene Ontology (GO) terms. CONCLUSIONS: To the best of our knowledge, this is the first computational method to identify circadian genes from sequence data. Based on the proposed method, we have developed an R package, PredCRG (https://cran.r-project.org/web/packages/PredCRG/index.html), for the scientific community for proteome-wide identification of circadian genes. The present study supplements the existing computational methods as well as wet-lab experiments for the recognition of circadian genes. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13007-021-00744-3. |
format | Online Article Text |
id | pubmed-8074503 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8074503 2021-04-26 PredCRG: A computational method for recognition of plant circadian genes by employing support vector machine with Laplace kernel Meher, Prabina Kumar; Mohapatra, Ansuman; Satpathy, Subhrajit; Sharma, Anuj; Saini, Isha; Pradhan, Sukanta Kumar; Rai, Anil Plant Methods Methodology BioMed Central 2021-04-26 /pmc/articles/PMC8074503/ /pubmed/33902670 http://dx.doi.org/10.1186/s13007-021-00744-3 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | PredCRG: A computational method for recognition of plant circadian genes by employing support vector machine with Laplace kernel |
topic | Methodology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8074503/ https://www.ncbi.nlm.nih.gov/pubmed/33902670 http://dx.doi.org/10.1186/s13007-021-00744-3 |
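The description in this record mentions an SVM with a Laplace kernel applied to compositional features of protein sequences. The following is a minimal Python sketch of those two ideas only, not the PredCRG implementation (which is an R package). It assumes the Laplace kernel takes the form k(x, y) = exp(-sigma * ||x - y||) with the Euclidean norm, and it uses plain amino acid composition as a stand-in for the compositional features; the record specifies neither. The function names, the `sigma` default and the example sequences are hypothetical.

```python
import math
from collections import Counter

# The 20 standard amino acids, in a fixed order so feature vectors line up.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(seq):
    """Return the fraction of each standard amino acid in a protein sequence."""
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def laplace_kernel(x, y, sigma=1.0):
    """Assumed form: k(x, y) = exp(-sigma * ||x - y||), Euclidean norm."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return math.exp(-sigma * dist)


def gram_matrix(vectors, sigma=1.0):
    """Pairwise kernel values; an SVM solver would consume a matrix like this."""
    return [[laplace_kernel(u, v, sigma) for v in vectors] for u in vectors]


if __name__ == "__main__":
    # Made-up sequences, for illustration only.
    seqs = ["MKTAYIAKQR", "MKTAYIAKQK", "GGGGSGGGGS"]
    feats = [aa_composition(s) for s in seqs]
    for row in gram_matrix(feats, sigma=1.0):
        print(" ".join(f"{v:.3f}" for v in row))
```

A precomputed Gram matrix of this kind can be passed to SVM implementations that accept custom kernels. Some libraries define the Laplacian kernel with the L1 distance instead, so the norm should be checked against whichever implementation is used.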