Computational prediction of disease related lncRNAs using machine learning
Long non-coding RNAs (lncRNAs), once considered transcriptional noise, are now a focus of current research. LncRNAs play a major role in regulating biological processes such as imprinting, cell differentiation, and splicing. Mutations in lncRNAs are involved in var...
Main authors: | Khalid, Razia; Naveed, Hammad; Khalid, Zoya |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9842610/ https://www.ncbi.nlm.nih.gov/pubmed/36646775 http://dx.doi.org/10.1038/s41598-023-27680-7 |
_version_ | 1784870173891624960 |
---|---|
author | Khalid, Razia Naveed, Hammad Khalid, Zoya |
author_facet | Khalid, Razia Naveed, Hammad Khalid, Zoya |
author_sort | Khalid, Razia |
collection | PubMed |
description | Long non-coding RNAs (lncRNAs), once considered transcriptional noise, are now a focus of current research. LncRNAs play a major role in regulating biological processes such as imprinting, cell differentiation, and splicing. Mutations in lncRNAs are involved in various complex diseases. Identifying lncRNA-disease associations has gained considerable attention, as predicting them efficiently will lead to better disease treatment. In this study, we developed a machine learning model that predicts disease-related lncRNAs by combining sequence-based and structure-based features. SVM and Random Forest classifiers were trained on these features. We compared our method with the state of the art and obtained the highest F1 score, 76%, with the SVM classifier. Moreover, this study overcomes two serious limitations of the reported method: the lack of redundancy checking and the implementation of oversampling for balancing the positive and negative classes. Our method achieves improved performance among the machine learning models reported for lncRNA-disease associations. Combining multiple features, specifically lncRNA sequence mutation, contributes significantly to disease-related lncRNA prediction. |
format | Online Article Text |
id | pubmed-9842610 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-9842610 2023-01-18 Computational prediction of disease related lncRNAs using machine learning Khalid, Razia Naveed, Hammad Khalid, Zoya Sci Rep Article Nature Publishing Group UK 2023-01-16 /pmc/articles/PMC9842610/ /pubmed/36646775 http://dx.doi.org/10.1038/s41598-023-27680-7 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Khalid, Razia Naveed, Hammad Khalid, Zoya Computational prediction of disease related lncRNAs using machine learning |
title | Computational prediction of disease related lncRNAs using machine learning |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9842610/ https://www.ncbi.nlm.nih.gov/pubmed/36646775 http://dx.doi.org/10.1038/s41598-023-27680-7 |
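
The description above reports results as an F1 score, which is the harmonic mean of precision and recall. As a reminder of how that metric is computed, here is a minimal sketch in plain Python. The helper name `precision_recall_f1` and the toy labels are assumptions made up for illustration; they are not data or code from the article.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the chosen positive label."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels, invented for illustration: 1 = disease-related, 0 = not.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Because F1 ignores true negatives, it is a common choice when the positive and negative classes are imbalanced, which is the setting the description mentions.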