Prediction of absenteeism in public schools teachers with machine learning
OBJECTIVE: To predict the risk of absence from work due to morbidities of teachers working in early childhood education in the municipal public schools, using machine learning algorithms. METHODS: This is a cross-sectional study using secondary, public and anonymous data from the Relação Anual de In...
Main authors: | Fernandes, Fernando Timoteo, Chiavegatto, Alexandre Dias Porto |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Faculdade de Saúde Pública da Universidade de São Paulo 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8225323/ https://www.ncbi.nlm.nih.gov/pubmed/34133618 http://dx.doi.org/10.11606/s1518-8787.2021055002677 |
_version_ | 1783712067647176704 |
---|---|
author | Fernandes, Fernando Timoteo Chiavegatto, Alexandre Dias Porto |
author_facet | Fernandes, Fernando Timoteo Chiavegatto, Alexandre Dias Porto |
author_sort | Fernandes, Fernando Timoteo |
collection | PubMed |
description | OBJECTIVE: To predict the risk of absence from work due to morbidities among teachers working in early childhood education in municipal public schools, using machine learning algorithms. METHODS: This is a cross-sectional study using secondary, public, and anonymous data from the Relação Anual de Informações Sociais, selecting early childhood education teachers who worked in the municipal public schools of the state of São Paulo between 2014 and 2018 (n = 174,294). Data on the average number of students per class and the number of inhabitants in the municipality were also linked. The data were split into training and test sets: records from 2014 to 2016 (n = 103,357) were used to train five predictive models, and records from 2017 to 2018 (n = 70,937) to test their performance on new data. The predictive performance of the algorithms was evaluated using the area under the ROC curve (AUROC). RESULTS: All five algorithms tested showed an area under the curve above 0.76. The algorithm with the best predictive performance (artificial neural networks) achieved an area under the curve of 0.79, with accuracy of 71.52%, sensitivity of 72.86%, specificity of 70.52%, and kappa of 0.427 in the test data. CONCLUSION: It is possible to predict cases of sickness absence among public school teachers with machine learning using public data. The best algorithm achieved a higher area under the curve than the reference model (logistic regression). The algorithms can contribute to more accurate predictions in the public health and worker health areas, making it possible to monitor and help prevent the absence of these workers due to morbidity. |
format | Online Article Text |
id | pubmed-8225323 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Faculdade de Saúde Pública da Universidade de São Paulo |
record_format | MEDLINE/PubMed |
spelling | pubmed-82253232021-06-25 Prediction of absenteeism in public schools teachers with machine learning Fernandes, Fernando Timoteo Chiavegatto, Alexandre Dias Porto Rev Saude Publica Original Article OBJECTIVE: To predict the risk of absence from work due to morbidities of teachers working in early childhood education in the municipal public schools, using machine learning algorithms. METHODS: This is a cross-sectional study using secondary, public and anonymous data from the Relação Anual de Informações Sociais, selecting early childhood education teachers who worked in the municipal public schools of the state of São Paulo between 2014 and 2018 (n = 174,294). Data on the average number of students per class and number of inhabitants in the municipality were also linked. The data were separated into training and testing, using records from 2014 to 2016 (n = 103,357) to train five predictive models, and data from 2017 to 2018 (n = 70,937) to test their performance in new data. The predictive performance of the algorithms was evaluated using the value of the area under the ROC curve (AUROC). RESULTS: All five algorithms tested showed an area under the curve above 0.76. The algorithm with the best predictive performance (artificial neural networks) achieved 0.79 of area under the curve, with accuracy of 71.52%, sensitivity of 72.86%, specificity of 70.52%, and kappa of 0.427 in the test data. CONCLUSION: It is possible to predict cases of sickness absence in teachers of public schools with machine learning using public data. The best algorithm showed a better result of the area under the curve when compared with the reference model (logistic regression). The algorithms can contribute to more assertive predictions in the public health and worker health areas, allowing to monitor and help prevent the absence of these workers due to morbidity. 
Faculdade de Saúde Pública da Universidade de São Paulo 2021-06-07 /pmc/articles/PMC8225323/ /pubmed/34133618 http://dx.doi.org/10.11606/s1518-8787.2021055002677 Text en https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Article Fernandes, Fernando Timoteo Chiavegatto, Alexandre Dias Porto Prediction of absenteeism in public schools teachers with machine learning |
title | Prediction of absenteeism in public schools teachers with machine learning |
title_full | Prediction of absenteeism in public schools teachers with machine learning |
title_fullStr | Prediction of absenteeism in public schools teachers with machine learning |
title_full_unstemmed | Prediction of absenteeism in public schools teachers with machine learning |
title_short | Prediction of absenteeism in public schools teachers with machine learning |
title_sort | prediction of absenteeism in public schools teachers with machine learning |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8225323/ https://www.ncbi.nlm.nih.gov/pubmed/34133618 http://dx.doi.org/10.11606/s1518-8787.2021055002677 |
work_keys_str_mv | AT fernandesfernandotimoteo predictionofabsenteeisminpublicschoolsteacherswithmachinelearning AT chiavegattoalexandrediasporto predictionofabsenteeisminpublicschoolsteacherswithmachinelearning |
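
The description field above reports AUROC, accuracy, sensitivity, specificity, and kappa for the test data. As a minimal sketch of how such figures are typically computed, the Python below evaluates them on a tiny made-up set of labels and scores. The data, the 0.5 decision threshold, and the function names are assumptions for illustration only; they are not taken from the article and do not reproduce its models or results.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def summary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, and Cohen's kappa from a 2x2 table."""
    n = tp + fp + tn + fn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    # Agreement expected by chance, from the marginal totals of the table.
    p_yes = ((tp + fn) / n) * ((tp + fp) / n)
    p_no = ((tn + fp) / n) * ((tn + fn) / n)
    expected = p_yes + p_no
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = absence due to morbidity, 0 = no absence.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
threshold = 0.5  # assumed cut-off; the record does not state the one used
y_pred = [1 if s >= threshold else 0 for s in scores]

metrics = summary_metrics(*confusion_counts(y_true, y_pred))
auc = auroc(y_true, scores)
print(metrics, auc)
```

AUROC depends only on how the scores rank cases, while the other four metrics depend on the chosen threshold, which is why the two are computed separately here.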