Predicting Long non-coding RNAs through feature ensemble learning
BACKGROUND: Many transcripts have been generated due to the development of sequencing technologies, and lncRNA is an important type of transcript. Predicting lncRNAs from transcripts is a challenging and important task. Traditional experimental lncRNA prediction methods are time-consuming and labor-...
Main authors: | Xu, Yanzhen; Zhao, Xiaohan; Liu, Shuai; Zhang, Wen |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7745355/ https://www.ncbi.nlm.nih.gov/pubmed/33334320 http://dx.doi.org/10.1186/s12864-020-07237-y |
_version_ | 1783624589871415296 |
---|---|
author | Xu, Yanzhen; Zhao, Xiaohan; Liu, Shuai; Zhang, Wen |
author_sort | Xu, Yanzhen |
collection | PubMed |
description | BACKGROUND: Many transcripts have been generated due to the development of sequencing technologies, and lncRNA is an important type of transcript. Predicting lncRNAs from transcripts is a challenging and important task. Traditional experimental lncRNA prediction methods are time-consuming and labor-intensive. Efficient computational methods for lncRNA prediction are in demand. RESULTS: In this paper, we propose two lncRNA prediction methods based on feature ensemble learning strategies named LncPred-IEL and LncPred-ANEL. Specifically, we encode sequences into six different types of features including transcript-specified features and general sequence-derived features. Then we consider two feature ensemble strategies to utilize and integrate the information in different feature types, the iterative ensemble learning (IEL) and the attention network ensemble learning (ANEL). IEL employs a supervised iterative way to ensemble base predictors built on six different types of features. ANEL introduces an attention mechanism-based deep learning model to ensemble features by adaptively learning the weight of individual feature types. Experiments demonstrate that both LncPred-IEL and LncPred-ANEL can effectively separate lncRNAs and other transcripts in feature space. Moreover, comparison experiments demonstrate that LncPred-IEL and LncPred-ANEL outperform several state-of-the-art methods when evaluated by 5-fold cross-validation. Both methods have good performances in cross-species lncRNA prediction. CONCLUSIONS: LncPred-IEL and LncPred-ANEL are promising lncRNA prediction tools that can effectively utilize and integrate the information in different types of features. |
format | Online Article Text |
id | pubmed-7745355 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-7745355 2020-12-18 Predicting Long non-coding RNAs through feature ensemble learning Xu, Yanzhen; Zhao, Xiaohan; Liu, Shuai; Zhang, Wen BMC Genomics Research
BioMed Central 2020-12-17 /pmc/articles/PMC7745355/ /pubmed/33334320 http://dx.doi.org/10.1186/s12864-020-07237-y Text en © The Author(s) 2020 Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Predicting Long non-coding RNAs through feature ensemble learning |
topic | Research |
work_keys_str_mv | AT xuyanzhen predictinglongnoncodingrnasthroughfeatureensemblelearning AT zhaoxiaohan predictinglongnoncodingrnasthroughfeatureensemblelearning AT liushuai predictinglongnoncodingrnasthroughfeatureensemblelearning AT zhangwen predictinglongnoncodingrnasthroughfeatureensemblelearning |
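
The description above mentions sequence-derived features and an attention-based ensemble that weights individual feature types. The following is a minimal sketch of that general idea, not the authors' implementation: this record does not specify the six feature types or the network architecture. The function names (`kmer_frequencies`, `attention_ensemble`), the choice of k-mer frequencies as an example feature, and the fixed attention scores are all assumptions made for illustration. In ANEL the weights are learned; here they are passed in by hand.

```python
import math
from itertools import product


def kmer_frequencies(sequence, k=2):
    """Return normalized k-mer frequencies over the ACGT alphabet.

    Hypothetical stand-in for one "general sequence-derived" feature type.
    """
    sequence = sequence.upper().replace("U", "T")
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        if window in counts:  # skip windows containing N or other symbols
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    denom = sum(exps)
    return [e / denom for e in exps]


def attention_ensemble(base_probabilities, attention_scores):
    """Combine per-feature-type probabilities using softmax weights.

    base_probabilities: one lncRNA probability per feature type
    attention_scores:   one unnormalized score per feature type
                        (assumed given here; a real model would learn them)
    """
    if len(base_probabilities) != len(attention_scores):
        raise ValueError("one attention score is needed per feature type")
    weights = softmax(attention_scores)
    combined = sum(w * p for w, p in zip(weights, base_probabilities))
    return combined, weights


if __name__ == "__main__":
    features = kmer_frequencies("AUGGCUAGCUAA", k=2)
    print(len(features))  # 16 dinucleotide frequencies

    # Hypothetical outputs of six base predictors, one per feature type.
    base = [0.9, 0.6, 0.7, 0.4, 0.8, 0.5]
    probability, weights = attention_ensemble(base, [0.0] * 6)
    print(probability, weights)
```

With equal attention scores the ensemble reduces to a plain average of the base predictions; raising one score shifts weight toward that feature type, which is the behavior an attention mechanism would adjust during training.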