Feature sensitivity criterion-based sampling strategy from the Optimization based on Phylogram Analysis (Fs-OPA) and Cox regression applied to mental disorder datasets
Digital datasets in several health care facilities, such as hospitals and prehospital services, have accumulated data from thousands of patients for more than a decade. In general, there is no local team with enough experts with the different skills required to analyze them in their entirety. The integrat...
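Main authors: | Gholi Zadeh Kharrat, Fatemeh; Shydeo Brandão Miyoshi, Newton; Cobre, Juliana; Mazzoncini De Azevedo-Marques, João; Mazzoncini de Azevedo-Marques, Paulo; Cláudio Botazzo Delbem, Alexandre |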
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7329087/ https://www.ncbi.nlm.nih.gov/pubmed/32609749 http://dx.doi.org/10.1371/journal.pone.0235147 |
_version_ | 1783552846929592320 |
---|---|
author | Gholi Zadeh Kharrat, Fatemeh; Shydeo Brandão Miyoshi, Newton; Cobre, Juliana; Mazzoncini De Azevedo-Marques, João; Mazzoncini de Azevedo-Marques, Paulo; Cláudio Botazzo Delbem, Alexandre |
author_sort | Gholi Zadeh Kharrat, Fatemeh |
collection | PubMed |
description | Digital datasets in several health care facilities, such as hospitals and prehospital services, have accumulated data from thousands of patients for more than a decade. In general, there is no local team with enough experts with the different skills required to analyze them in their entirety. Integrating those abilities usually demands a relatively long period and is costly. Considering that scenario, this paper proposes a new Feature Sensitivity technique that can automatically deal with a large dataset. It uses a criterion-based sampling strategy from the Optimization based on Phylogram Analysis. Called FS-opa, the new approach seems suitable for dealing with any type of raw data from health centers and for manipulating their entire datasets. In addition, FS-opa can find the principal features for the construction of inference models without depending on expert knowledge of the problem domain. The selected features can be combined with the usual statistical or machine learning methods to perform predictions. The new method can mine entire datasets from scratch. FS-opa was evaluated using a relatively large dataset from electronic health records of mental disorder prehospital services in Brazil. Cox's approach was integrated with FS-opa to generate survival analysis models related to the length of stay (LOS) in hospitals, assuming that LOS is a relevant aspect that can benefit estimates of the efficiency of hospitals and the quality of patient treatments. Since FS-opa can work with raw datasets, no knowledge from the problem domain was used to obtain the preliminary prediction models found. Results show that FS-opa succeeded in performing a feature sensitivity analysis using only the raw data available. In this way, FS-opa can find the principal features without the bias of an inference model, since the proposed method does not use one. Moreover, the experiments show that FS-opa can provide models with a useful trade-off between representativeness and parsimony. This can benefit further analyses by experts, since they can focus on the aspects that benefit problem modeling. |
format | Online Article Text |
id | pubmed-7329087 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-7329087 2020-07-14 PLoS One Research Article Public Library of Science 2020-07-01 /pmc/articles/PMC7329087/ /pubmed/32609749 http://dx.doi.org/10.1371/journal.pone.0235147 Text en © 2020 Gholi Zadeh Kharrat et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
title | Feature sensitivity criterion-based sampling strategy from the Optimization based on Phylogram Analysis (Fs-OPA) and Cox regression applied to mental disorder datasets |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7329087/ https://www.ncbi.nlm.nih.gov/pubmed/32609749 http://dx.doi.org/10.1371/journal.pone.0235147 |