Integrating multiple microarray dataset analysis and machine learning methods to reveal the key genes and regulatory mechanisms underlying human intervertebral disc degeneration
Intervertebral disc degeneration (IDD), a major cause of lower back pain, has multiple contributing factors including genetics, environment, age, and loading history. Bioinformatics analysis has been extensively used to identify diagnostic biomarkers and therapeutic targets for IDD diagnosis and tre...
Main authors: | Chang, Hongze; Yang, Xiaolong; You, Kemin; Jiang, Mingwei; Cai, Feng; Zhang, Yan; Liu, Liang; Liu, Hui; Liu, Xiaodong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2020 |
Subjects: | Bioinformatics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7566771/ https://www.ncbi.nlm.nih.gov/pubmed/33083145 http://dx.doi.org/10.7717/peerj.10120 |
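
The record's description field states that the SVM classifier built from the four feature genes was evaluated with a ROC curve. As a minimal sketch of that evaluation step, the area under the ROC curve can be computed from classifier scores using the rank-based (Mann-Whitney) formulation. The function name `roc_auc` and the labels and scores below are hypothetical illustrations, not data or code from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: sequence of 0/1 (1 = degenerated sample, by assumption)
    scores: classifier decision values, higher = more likely class 1
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    # Count positive/negative pairs ranked correctly; ties count one half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels and scores, for illustration only.
example_labels = [1, 1, 1, 0, 0, 0]
example_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
example_auc = roc_auc(example_labels, example_scores)
```

An AUC near 1.0 would indicate that the scores separate the two classes well, while 0.5 corresponds to chance-level ranking.
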
_version_ | 1783596193313456128 |
---|---|
author | Chang, Hongze Yang, Xiaolong You, Kemin Jiang, Mingwei Cai, Feng Zhang, Yan Liu, Liang Liu, Hui Liu, Xiaodong |
author_facet | Chang, Hongze Yang, Xiaolong You, Kemin Jiang, Mingwei Cai, Feng Zhang, Yan Liu, Liang Liu, Hui Liu, Xiaodong |
author_sort | Chang, Hongze |
collection | PubMed |
description | Intervertebral disc degeneration (IDD), a major cause of lower back pain, has multiple contributing factors including genetics, environment, age, and loading history. Bioinformatics analysis has been extensively used to identify diagnostic biomarkers and therapeutic targets for IDD diagnosis and treatment. However, multiple microarray dataset analysis and machine learning methods have not been integrated. In this study, we downloaded the mRNA, microRNA (miRNA), long noncoding RNA (lncRNA), and circular RNA (circRNA) expression profiles (GSE34095, GSE15227, GSE63492, GSE116726, GSE56081, and GSE67566) associated with IDD from the GEO database. Using differential expression analysis and recursive feature elimination, we extracted four optimal feature genes. We then used a support vector machine (SVM) to build a classification model with the four optimal feature genes. The ROC curve was used to evaluate the model’s performance, and the expression profiles (GSE63492, GSE116726, GSE56081, and GSE67566) were used to construct a competitive endogenous RNA (ceRNA) regulatory network and explore the underlying mechanisms of the feature genes. We found that three miRNAs (hsa-miR-4728-5p, hsa-miR-5196-5p, and hsa-miR-185-5p) and three circRNAs (hsa_circRNA_100723, hsa_circRNA_104471, and hsa_circRNA_100750) were important regulators with more interactions than the other RNAs across the whole network. The expression level analysis of the three datasets revealed that BCAS4 and SCRG1 were key genes involved in IDD development. Ultimately, our study proposes a novel approach to determining reliable and effective targets in IDD diagnosis and treatment. |
format | Online Article Text |
id | pubmed-7566771 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-75667712020-10-19 Integrating multiple microarray dataset analysis and machine learning methods to reveal the key genes and regulatory mechanisms underlying human intervertebral disc degeneration Chang, Hongze Yang, Xiaolong You, Kemin Jiang, Mingwei Cai, Feng Zhang, Yan Liu, Liang Liu, Hui Liu, Xiaodong PeerJ Bioinformatics Intervertebral disc degeneration (IDD), a major cause of lower back pain, has multiple contributing factors including genetics, environment, age, and loading history. Bioinformatics analysis has been extensively used to identify diagnostic biomarkers and therapeutic targets for IDD diagnosis and treatment. However, multiple microarray dataset analysis and machine learning methods have not been integrated. In this study, we downloaded the mRNA, microRNA (miRNA), long noncoding RNA (lncRNA), and circular RNA (circRNA) expression profiles (GSE34095, GSE15227, GSE63492, GSE116726, GSE56081, and GSE67566) associated with IDD from the GEO database. Using differential expression analysis and recursive feature elimination, we extracted four optimal feature genes. We then used a support vector machine (SVM) to build a classification model with the four optimal feature genes. The ROC curve was used to evaluate the model’s performance, and the expression profiles (GSE63492, GSE116726, GSE56081, and GSE67566) were used to construct a competitive endogenous RNA (ceRNA) regulatory network and explore the underlying mechanisms of the feature genes. We found that three miRNAs (hsa-miR-4728-5p, hsa-miR-5196-5p, and hsa-miR-185-5p) and three circRNAs (hsa_circRNA_100723, hsa_circRNA_104471, and hsa_circRNA_100750) were important regulators with more interactions than the other RNAs across the whole network. The expression level analysis of the three datasets revealed that BCAS4 and SCRG1 were key genes involved in IDD development. Ultimately, our study proposes a novel approach to determining reliable and effective targets in IDD diagnosis and treatment. PeerJ Inc. 2020-10-13 /pmc/articles/PMC7566771/ /pubmed/33083145 http://dx.doi.org/10.7717/peerj.10120 Text en ©2020 Chang et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited. |
spellingShingle | Bioinformatics Chang, Hongze Yang, Xiaolong You, Kemin Jiang, Mingwei Cai, Feng Zhang, Yan Liu, Liang Liu, Hui Liu, Xiaodong Integrating multiple microarray dataset analysis and machine learning methods to reveal the key genes and regulatory mechanisms underlying human intervertebral disc degeneration |
title | Integrating multiple microarray dataset analysis and machine learning methods to reveal the key genes and regulatory mechanisms underlying human intervertebral disc degeneration |
title_full | Integrating multiple microarray dataset analysis and machine learning methods to reveal the key genes and regulatory mechanisms underlying human intervertebral disc degeneration |
title_fullStr | Integrating multiple microarray dataset analysis and machine learning methods to reveal the key genes and regulatory mechanisms underlying human intervertebral disc degeneration |
title_full_unstemmed | Integrating multiple microarray dataset analysis and machine learning methods to reveal the key genes and regulatory mechanisms underlying human intervertebral disc degeneration |
title_short | Integrating multiple microarray dataset analysis and machine learning methods to reveal the key genes and regulatory mechanisms underlying human intervertebral disc degeneration |
title_sort | integrating multiple microarray dataset analysis and machine learning methods to reveal the key genes and regulatory mechanisms underlying human intervertebral disc degeneration |
topic | Bioinformatics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7566771/ https://www.ncbi.nlm.nih.gov/pubmed/33083145 http://dx.doi.org/10.7717/peerj.10120 |
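
The description above singles out miRNAs and circRNAs that have "more interactions than the other RNAs across the whole network". One simple way to express that idea is to rank nodes of the ceRNA network by degree (number of distinct interaction partners). The sketch below assumes an undirected edge list; the function name `rank_by_degree` and all node names are hypothetical placeholders, not interactions reported in the study, and the authors may have used a different tool or centrality measure.

```python
from collections import Counter


def rank_by_degree(edges):
    """Return (node, degree) pairs sorted by degree, highest first.

    edges: iterable of (regulator, target) pairs in an undirected network.
    Duplicate pairs are counted once and self-loops are ignored.
    """
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degree = Counter()
    for edge in unique_edges:
        for node in edge:
            degree[node] += 1
    # Sort by descending degree, then by name so the order is deterministic.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical circRNA-miRNA and miRNA-mRNA pairs, for illustration only.
toy_edges = [
    ("circ_A", "miR_1"),
    ("circ_A", "miR_2"),
    ("circ_B", "miR_1"),
    ("miR_1", "gene_X"),
    ("miR_1", "gene_Y"),
    ("miR_2", "gene_X"),
]
toy_ranking = rank_by_degree(toy_edges)
```

In this toy network `miR_1` has the most partners and would be flagged as the hub; on real data the edge list would come from predicted or validated circRNA-miRNA and miRNA-mRNA pairs.
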