iLearnPlus: a comprehensive and automated machine-learning platform for nucleic acid and protein sequence analysis, prediction and visualization
Sequence-based analysis and prediction are fundamental bioinformatic tasks that facilitate understanding of the sequence(-structure)-function paradigm for DNAs, RNAs and proteins. Rapid accumulation of sequences requires equally pervasive development of new predictive models, which depends on the av...
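Main Authors: | Chen, Zhen; Zhao, Pei; Li, Chen; Li, Fuyi; Xiang, Dongxu; Chen, Yong-Zi; Akutsu, Tatsuya; Daly, Roger J; Webb, Geoffrey I; Zhao, Quanzhi; Kurgan, Lukasz; Song, Jiangning |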
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2021 |
Subjects: | Methods Online |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8191785/ https://www.ncbi.nlm.nih.gov/pubmed/33660783 http://dx.doi.org/10.1093/nar/gkab122 |
_version_ | 1783705929043148800 |
---|---|
author | Chen, Zhen Zhao, Pei Li, Chen Li, Fuyi Xiang, Dongxu Chen, Yong-Zi Akutsu, Tatsuya Daly, Roger J Webb, Geoffrey I Zhao, Quanzhi Kurgan, Lukasz Song, Jiangning |
author_sort | Chen, Zhen |
collection | PubMed |
description | Sequence-based analysis and prediction are fundamental bioinformatic tasks that facilitate understanding of the sequence(-structure)-function paradigm for DNAs, RNAs and proteins. Rapid accumulation of sequences requires equally pervasive development of new predictive models, which depends on the availability of effective tools that support these efforts. We introduce iLearnPlus, the first machine-learning platform with graphical- and web-based interfaces for the construction of machine-learning pipelines for analysis and predictions using nucleic acid and protein sequences. iLearnPlus provides a comprehensive set of algorithms and automates sequence-based feature extraction and analysis, construction and deployment of models, assessment of predictive performance, statistical analysis, and data visualization; all without programming. iLearnPlus includes a wide range of feature sets which encode information from the input sequences and over twenty machine-learning algorithms that cover several deep-learning approaches, outnumbering the current solutions by a wide margin. Our solution caters to experienced bioinformaticians, given the broad range of options, and biologists with no programming background, given the point-and-click interface and easy-to-follow design process. We showcase iLearnPlus with two case studies concerning prediction of long noncoding RNAs (lncRNAs) from RNA transcripts and prediction of crotonylation sites in protein chains. iLearnPlus is an open-source platform available at https://github.com/Superzchen/iLearnPlus/ with the webserver at http://ilearnplus.erc.monash.edu/. |
format | Online Article Text |
id | pubmed-8191785 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-8191785 2021-06-11 iLearnPlus: a comprehensive and automated machine-learning platform for nucleic acid and protein sequence analysis, prediction and visualization Chen, Zhen Zhao, Pei Li, Chen Li, Fuyi Xiang, Dongxu Chen, Yong-Zi Akutsu, Tatsuya Daly, Roger J Webb, Geoffrey I Zhao, Quanzhi Kurgan, Lukasz Song, Jiangning Nucleic Acids Res Methods Online 
Oxford University Press 2021-02-28 /pmc/articles/PMC8191785/ /pubmed/33660783 http://dx.doi.org/10.1093/nar/gkab122 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of Nucleic Acids Research. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Methods Online Chen, Zhen Zhao, Pei Li, Chen Li, Fuyi Xiang, Dongxu Chen, Yong-Zi Akutsu, Tatsuya Daly, Roger J Webb, Geoffrey I Zhao, Quanzhi Kurgan, Lukasz Song, Jiangning iLearnPlus: a comprehensive and automated machine-learning platform for nucleic acid and protein sequence analysis, prediction and visualization |
title |
iLearnPlus: a comprehensive and automated machine-learning platform for nucleic acid and protein sequence analysis, prediction and visualization |
title_sort | ilearnplus: a comprehensive and automated machine-learning platform for nucleic acid and protein sequence analysis, prediction and visualization |
topic | Methods Online |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8191785/ https://www.ncbi.nlm.nih.gov/pubmed/33660783 http://dx.doi.org/10.1093/nar/gkab122 |