Machine learning identifies girls with central precocious puberty based on multisource data

OBJECTIVE: The study aimed to develop simplified diagnostic models for identifying girls with central precocious puberty (CPP), without the expensive and cumbersome gonadotropin-releasing hormone (GnRH) stimulation test, which is the gold standard for CPP diagnosis. MATERIALS AND METHODS: Female pat...

Bibliographic Details
Main Authors: Pan, Liyan; Liu, Guangjian; Mao, Xiaojian; Liang, Huiying
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects: Research and Applications
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7886559/
https://www.ncbi.nlm.nih.gov/pubmed/33623892
http://dx.doi.org/10.1093/jamiaopen/ooaa063
_version_ 1783651821099679744
author Pan, Liyan
Liu, Guangjian
Mao, Xiaojian
Liang, Huiying
collection PubMed
description OBJECTIVE: The study aimed to develop simplified diagnostic models for identifying girls with central precocious puberty (CPP) without the expensive and cumbersome gonadotropin-releasing hormone (GnRH) stimulation test, which is the gold standard for CPP diagnosis. MATERIALS AND METHODS: Female patients who had secondary sexual characteristics before 8 years old and had taken a GnRH analog (GnRHa) stimulation test at a medical center in Guangzhou, China were enrolled. Data from clinical visits, laboratory tests, and medical imaging examinations were collected. We first extracted features from unstructured data such as clinical reports and medical images. Then, models based on each single data source or on multisource data were developed with an Extreme Gradient Boosting (XGBoost) classifier to classify patients as CPP or non-CPP. RESULTS: The model based on multisource data performed best, with an area under the curve (AUC) of 0.88 and a Youden index of 0.64. The performance of single-source models based on data from basal laboratory tests and the feature importance of each variable showed that the basal hormone test had the highest diagnostic value for a CPP diagnosis. CONCLUSION: We developed three simplified models that use easily accessed clinical data before the GnRH stimulation test to identify girls who are at high risk of CPP. These models are tailored to the needs of patients in different clinical settings. Machine learning technologies and multisource data fusion can help to make a better diagnosis than traditional methods.
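The abstract reports two summary metrics, AUC and the Youden index, without defining them. The sketch below shows how both are conventionally computed from binary labels and classifier scores, using the standard definition J = sensitivity + specificity - 1. It is not the authors' code: the function names and the toy labels and scores are assumptions made up for illustration, and no XGBoost model is trained here.

```python
from typing import List, Tuple


def roc_points(labels: List[int], scores: List[float]) -> List[Tuple[float, float, float]]:
    """Return (threshold, sensitivity, specificity) for each distinct score used as a cutoff."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores)):
        # Predict positive when the score is at or above the cutoff.
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
        points.append((threshold, tp / positives, tn / negatives))
    return points


def youden_index(labels: List[int], scores: List[float]) -> Tuple[float, float]:
    """Return (best J, cutoff), where J = sensitivity + specificity - 1."""
    best = max(roc_points(labels, scores), key=lambda p: p[1] + p[2] - 1)
    return best[1] + best[2] - 1, best[0]


def auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not from the study: 1 = CPP, 0 = non-CPP.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

j, cutoff = youden_index(toy_labels, toy_scores)
print(f"AUC = {auc(toy_labels, toy_scores):.3f}, Youden J = {j:.2f} at cutoff {cutoff}")
```

On this toy data the AUC is 0.875 and the largest Youden index is 0.75, reached at a cutoff of 0.4. The pairwise AUC loop is quadratic in the number of samples, which is fine for illustration but would normally be replaced by a library routine on real data.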
format Online
Article
Text
id pubmed-7886559
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7886559 2021-02-22 Machine learning identifies girls with central precocious puberty based on multisource data Pan, Liyan Liu, Guangjian Mao, Xiaojian Liang, Huiying JAMIA Open Research and Applications Oxford University Press 2020-12-05 /pmc/articles/PMC7886559/ /pubmed/33623892 http://dx.doi.org/10.1093/jamiaopen/ooaa063 Text en © The Author(s) 2020. Published by Oxford University Press on behalf of the American Medical Informatics Association. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Machine learning identifies girls with central precocious puberty based on multisource data
topic Research and Applications