Machine learning analysis of gene expression profile reveals a novel diagnostic signature for osteoporosis

Bibliographic Details
Main Authors: Chen, Xinlei; Liu, Guangping; Wang, Shuxiang; Zhang, Haiyang; Xue, Peng
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7958453/
https://www.ncbi.nlm.nih.gov/pubmed/33722258
http://dx.doi.org/10.1186/s13018-021-02329-1
Description
Summary: BACKGROUND: Osteoporosis (OP) is increasingly prevalent as the world population ages, and efficient diagnostic signatures for clinical application are urgently needed. METHOD: We downloaded the mRNA profiles of 90 peripheral blood samples with or without OP from the GEO database (accession number: GSE152073). Weighted gene co-expression network analysis (WGCNA) was used to reveal the correlation among genes in all samples. GO term and KEGG pathway enrichment analyses were performed with the clusterProfiler R package. The STRING database was used to screen interaction pairs among proteins. The protein–protein interaction (PPI) network was visualized in Cytoscape, and the key genes were screened using the cytoHubba plug-in. A diagnostic model based on these key genes was constructed, and 5-fold cross-validation was applied to evaluate its reliability. RESULTS: A gene module consisting of 176 genes predicted to be associated with the occurrence of OP was identified. A total of 16 significantly enriched GO terms and 1 significantly enriched KEGG pathway were obtained from the 176 genes. The top 50 key genes in the PPI network were identified. Stepwise regression analysis then narrowed these 50 key genes to 22, of which 9 were further selected by multivariate regression analysis at a significance threshold of P < 0.01. The diagnostic model established on the optimal 9 key genes efficiently separated the normal samples from the OP samples. CONCLUSION: A diagnostic model based on nine key genes could reliably separate OP patients from healthy subjects, providing novel insights for diagnostic research on OP. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13018-021-02329-1.
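
The summary states that the diagnostic model was evaluated by 5-fold cross-validation but does not name the classifier or give the procedure. The following is a minimal sketch of that evaluation step only, under several assumptions: logistic regression stands in for the unspecified model, the data are synthetic random values (not GSE152073), and the sizes of 90 samples and 9 features merely mirror the counts in the summary. The names `k_fold_indices`, `fit_logistic`, and `cross_validate` are hypothetical helpers, not taken from the paper.

```python
import math
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k near-equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Fit logistic regression by full-batch gradient descent (assumed model)."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(p):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(model, xi):
    """Return the predicted class (0 or 1) for one sample."""
    w, b = model
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0


def cross_validate(X, y, k=5, seed=0):
    """Train on k-1 folds, score accuracy on the held-out fold, repeat k times."""
    accuracies = []
    for held_out in k_fold_indices(len(X), k, seed):
        test = set(held_out)
        train = [i for i in range(len(X)) if i not in test]
        model = fit_logistic([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return accuracies


# Synthetic stand-in data: 90 samples, 9 features, two well-separated classes.
rng = random.Random(42)
y = [i % 2 for i in range(90)]
X = [[rng.gauss(1.0 if label else -1.0, 1.0) for _ in range(9)] for label in y]

accuracies = cross_validate(X, y, k=5)
mean_accuracy = sum(accuracies) / len(accuracies)
print("per-fold accuracy:", accuracies)
print("mean accuracy:", mean_accuracy)
```

Each sample is held out exactly once, so the mean of the per-fold accuracies estimates how the model would perform on unseen samples. Note that in a real analysis the gene-selection steps would also need to be repeated inside each training fold to avoid an optimistic estimate; the summary does not say whether this was done.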