
Integrating multiple microarray dataset analysis and machine learning methods to reveal the key genes and regulatory mechanisms underlying human intervertebral disc degeneration

Intervertebral disc degeneration (IDD), a major cause of lower back pain, has multiple contributing factors including genetics, environment, age, and loading history. Bioinformatics analysis has been extensively used to identify diagnostic biomarkers and therapeutic targets for IDD diagnosis and tre...


Bibliographic Details
Main authors: Chang, Hongze, Yang, Xiaolong, You, Kemin, Jiang, Mingwei, Cai, Feng, Zhang, Yan, Liu, Liang, Liu, Hui, Liu, Xiaodong
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7566771/
https://www.ncbi.nlm.nih.gov/pubmed/33083145
http://dx.doi.org/10.7717/peerj.10120
_version_ 1783596193313456128
author Chang, Hongze
Yang, Xiaolong
You, Kemin
Jiang, Mingwei
Cai, Feng
Zhang, Yan
Liu, Liang
Liu, Hui
Liu, Xiaodong
author_facet Chang, Hongze
Yang, Xiaolong
You, Kemin
Jiang, Mingwei
Cai, Feng
Zhang, Yan
Liu, Liang
Liu, Hui
Liu, Xiaodong
author_sort Chang, Hongze
collection PubMed
description Intervertebral disc degeneration (IDD), a major cause of lower back pain, has multiple contributing factors including genetics, environment, age, and loading history. Bioinformatics analysis has been extensively used to identify diagnostic biomarkers and therapeutic targets for IDD diagnosis and treatment. However, multiple microarray dataset analysis and machine learning methods have not been integrated. In this study, we downloaded the mRNA, microRNA (miRNA), long noncoding RNA (lncRNA), and circular RNA (circRNA) expression profiles (GSE34095, GSE15227, GSE63492, GSE116726, GSE56081, and GSE67566) associated with IDD from the GEO database. Using differential expression analysis and recursive feature elimination, we extracted four optimal feature genes. We then used the support vector machine (SVM) to make a classification model with the four optimal feature genes. The ROC curve was used to evaluate the model’s performance, and the expression profiles (GSE63492, GSE116726, GSE56081, and GSE67566) were used to construct a competitive endogenous RNA (ceRNA) regulatory network and explore the underlying mechanisms of the feature genes. We found that three miRNAs (hsa-miR-4728-5p, hsa-miR-5196-5p, and hsa-miR-185-5p) and three circRNAs (hsa_circRNA_100723, hsa_circRNA_104471, and hsa_circRNA_100750) were important regulators with more interactions than the other RNAs across the whole network. The expression level analysis of the three datasets revealed that BCAS4 and SCRG1 were key genes involved in IDD development. Ultimately, our study proposes a novel approach to determining reliable and effective targets in IDD diagnosis and treatment.
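The description above says the SVM classifier was evaluated with a ROC curve. As a rough illustration of that evaluation step only, the sketch below computes the area under the ROC curve from classifier scores using the rank-sum identity. It is not the authors' code: the function name `roc_auc`, the labels, and the scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: 1 for the positive class, 0 for the negative class.
    scores: classifier outputs, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Count positive/negative pairs ranked correctly; a tie counts one half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical data (assumption): 1 = degenerated disc sample, 0 = control.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two classes. In practice a library routine would normally be used instead of this hand-rolled version.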
format Online
Article
Text
id pubmed-7566771
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-75667712020-10-19 Integrating multiple microarray dataset analysis and machine learning methods to reveal the key genes and regulatory mechanisms underlying human intervertebral disc degeneration Chang, Hongze Yang, Xiaolong You, Kemin Jiang, Mingwei Cai, Feng Zhang, Yan Liu, Liang Liu, Hui Liu, Xiaodong PeerJ Bioinformatics Intervertebral disc degeneration (IDD), a major cause of lower back pain, has multiple contributing factors including genetics, environment, age, and loading history. Bioinformatics analysis has been extensively used to identify diagnostic biomarkers and therapeutic targets for IDD diagnosis and treatment. However, multiple microarray dataset analysis and machine learning methods have not been integrated. In this study, we downloaded the mRNA, microRNA (miRNA), long noncoding RNA (lncRNA), and circular RNA (circRNA) expression profiles (GSE34095, GSE15227, GSE63492, GSE116726, GSE56081, and GSE67566) associated with IDD from the GEO database. Using differential expression analysis and recursive feature elimination, we extracted four optimal feature genes. We then used the support vector machine (SVM) to make a classification model with the four optimal feature genes. The ROC curve was used to evaluate the model’s performance, and the expression profiles (GSE63492, GSE116726, GSE56081, and GSE67566) were used to construct a competitive endogenous RNA (ceRNA) regulatory network and explore the underlying mechanisms of the feature genes. We found that three miRNAs (hsa-miR-4728-5p, hsa-miR-5196-5p, and hsa-miR-185-5p) and three circRNAs (hsa_circRNA_100723, hsa_circRNA_104471, and hsa_circRNA_100750) were important regulators with more interactions than the other RNAs across the whole network. The expression level analysis of the three datasets revealed that BCAS4 and SCRG1 were key genes involved in IDD development. Ultimately, our study proposes a novel approach to determining reliable and effective targets in IDD diagnosis and treatment. PeerJ Inc. 2020-10-13 /pmc/articles/PMC7566771/ /pubmed/33083145 http://dx.doi.org/10.7717/peerj.10120 Text en ©2020 Chang et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
spellingShingle Bioinformatics
Chang, Hongze
Yang, Xiaolong
You, Kemin
Jiang, Mingwei
Cai, Feng
Zhang, Yan
Liu, Liang
Liu, Hui
Liu, Xiaodong
Integrating multiple microarray dataset analysis and machine learning methods to reveal the key genes and regulatory mechanisms underlying human intervertebral disc degeneration
title Integrating multiple microarray dataset analysis and machine learning methods to reveal the key genes and regulatory mechanisms underlying human intervertebral disc degeneration
title_full Integrating multiple microarray dataset analysis and machine learning methods to reveal the key genes and regulatory mechanisms underlying human intervertebral disc degeneration
title_fullStr Integrating multiple microarray dataset analysis and machine learning methods to reveal the key genes and regulatory mechanisms underlying human intervertebral disc degeneration
title_full_unstemmed Integrating multiple microarray dataset analysis and machine learning methods to reveal the key genes and regulatory mechanisms underlying human intervertebral disc degeneration
title_short Integrating multiple microarray dataset analysis and machine learning methods to reveal the key genes and regulatory mechanisms underlying human intervertebral disc degeneration
title_sort integrating multiple microarray dataset analysis and machine learning methods to reveal the key genes and regulatory mechanisms underlying human intervertebral disc degeneration
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7566771/
https://www.ncbi.nlm.nih.gov/pubmed/33083145
http://dx.doi.org/10.7717/peerj.10120
work_keys_str_mv AT changhongze integratingmultiplemicroarraydatasetanalysisandmachinelearningmethodstorevealthekeygenesandregulatorymechanismsunderlyinghumanintervertebraldiscdegeneration
AT yangxiaolong integratingmultiplemicroarraydatasetanalysisandmachinelearningmethodstorevealthekeygenesandregulatorymechanismsunderlyinghumanintervertebraldiscdegeneration
AT youkemin integratingmultiplemicroarraydatasetanalysisandmachinelearningmethodstorevealthekeygenesandregulatorymechanismsunderlyinghumanintervertebraldiscdegeneration
AT jiangmingwei integratingmultiplemicroarraydatasetanalysisandmachinelearningmethodstorevealthekeygenesandregulatorymechanismsunderlyinghumanintervertebraldiscdegeneration
AT caifeng integratingmultiplemicroarraydatasetanalysisandmachinelearningmethodstorevealthekeygenesandregulatorymechanismsunderlyinghumanintervertebraldiscdegeneration
AT zhangyan integratingmultiplemicroarraydatasetanalysisandmachinelearningmethodstorevealthekeygenesandregulatorymechanismsunderlyinghumanintervertebraldiscdegeneration
AT liuliang integratingmultiplemicroarraydatasetanalysisandmachinelearningmethodstorevealthekeygenesandregulatorymechanismsunderlyinghumanintervertebraldiscdegeneration
AT liuhui integratingmultiplemicroarraydatasetanalysisandmachinelearningmethodstorevealthekeygenesandregulatorymechanismsunderlyinghumanintervertebraldiscdegeneration
AT liuxiaodong integratingmultiplemicroarraydatasetanalysisandmachinelearningmethodstorevealthekeygenesandregulatorymechanismsunderlyinghumanintervertebraldiscdegeneration