iLearnPlus: a comprehensive and automated machine-learning platform for nucleic acid and protein sequence analysis, prediction and visualization

Bibliographic Details
Main authors: Chen, Zhen; Zhao, Pei; Li, Chen; Li, Fuyi; Xiang, Dongxu; Chen, Yong-Zi; Akutsu, Tatsuya; Daly, Roger J; Webb, Geoffrey I; Zhao, Quanzhi; Kurgan, Lukasz; Song, Jiangning
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects: Methods Online
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8191785/
https://www.ncbi.nlm.nih.gov/pubmed/33660783
http://dx.doi.org/10.1093/nar/gkab122
author Chen, Zhen
Zhao, Pei
Li, Chen
Li, Fuyi
Xiang, Dongxu
Chen, Yong-Zi
Akutsu, Tatsuya
Daly, Roger J
Webb, Geoffrey I
Zhao, Quanzhi
Kurgan, Lukasz
Song, Jiangning
author_sort Chen, Zhen
collection PubMed
description Sequence-based analysis and prediction are fundamental bioinformatic tasks that facilitate understanding of the sequence(-structure)-function paradigm for DNAs, RNAs and proteins. Rapid accumulation of sequences requires equally pervasive development of new predictive models, which depends on the availability of effective tools that support these efforts. We introduce iLearnPlus, the first machine-learning platform with graphical- and web-based interfaces for the construction of machine-learning pipelines for analysis and predictions using nucleic acid and protein sequences. iLearnPlus provides a comprehensive set of algorithms and automates sequence-based feature extraction and analysis, construction and deployment of models, assessment of predictive performance, statistical analysis, and data visualization; all without programming. iLearnPlus includes a wide range of feature sets which encode information from the input sequences and over twenty machine-learning algorithms that cover several deep-learning approaches, outnumbering the current solutions by a wide margin. Our solution caters to experienced bioinformaticians, given the broad range of options, and biologists with no programming background, given the point-and-click interface and easy-to-follow design process. We showcase iLearnPlus with two case studies concerning prediction of long noncoding RNAs (lncRNAs) from RNA transcripts and prediction of crotonylation sites in protein chains. iLearnPlus is an open-source platform available at https://github.com/Superzchen/iLearnPlus/ with the webserver at http://ilearnplus.erc.monash.edu/.
format Online
Article
Text
id pubmed-8191785
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8191785 2021-06-11 Nucleic Acids Res Methods Online Oxford University Press 2021-02-28 /pmc/articles/PMC8191785/ /pubmed/33660783 http://dx.doi.org/10.1093/nar/gkab122 Text en
© The Author(s) 2021. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title iLearnPlus: a comprehensive and automated machine-learning platform for nucleic acid and protein sequence analysis, prediction and visualization
topic Methods Online