Explaining machine learning based diagnosis of COVID-19 from routine blood tests with decision trees and criteria graphs
The sudden outbreak of coronavirus disease 2019 (COVID-19) revealed the need for fast and reliable automatic tools to help health teams. This paper aims to present understandable solutions based on Machine Learning (ML) techniques to deal with COVID-19 screening in routine blood tests. We tested dif...
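Main authors: | Alves, Marcos Antonio; Castro, Giulia Zanon; Oliveira, Bruno Alberto Soares; Ferreira, Leonardo Augusto; Ramírez, Jaime Arturo; Silva, Rodrigo; Guimarães, Frederico Gadelha |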
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier Ltd. 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7962588/ https://www.ncbi.nlm.nih.gov/pubmed/33812263 http://dx.doi.org/10.1016/j.compbiomed.2021.104335 |
_version_ | 1783665495573004288 |
---|---|
author | Alves, Marcos Antonio; Castro, Giulia Zanon; Oliveira, Bruno Alberto Soares; Ferreira, Leonardo Augusto; Ramírez, Jaime Arturo; Silva, Rodrigo; Guimarães, Frederico Gadelha |
author_facet | Alves, Marcos Antonio; Castro, Giulia Zanon; Oliveira, Bruno Alberto Soares; Ferreira, Leonardo Augusto; Ramírez, Jaime Arturo; Silva, Rodrigo; Guimarães, Frederico Gadelha |
author_sort | Alves, Marcos Antonio |
collection | PubMed |
description | The sudden outbreak of coronavirus disease 2019 (COVID-19) revealed the need for fast and reliable automatic tools to help health teams. This paper aims to present understandable solutions based on Machine Learning (ML) techniques to deal with COVID-19 screening in routine blood tests. We tested different ML classifiers on a public dataset from the Hospital Albert Einstein, São Paulo, Brazil. After cleaning and pre-processing, the data comprise 608 patients, of whom 84 are positive for COVID-19 as confirmed by RT-PCR. To understand the model decisions, we introduce (i) a Decision Tree Explainer (DTX) for local explanation and (ii) a Criteria Graph to aggregate these explanations and portray a global picture of the results. The Random Forest (RF) classifier achieved the best results (accuracy 0.88, F1-score 0.76, sensitivity 0.66, specificity 0.91, and AUROC 0.86). By using DTX and the Criteria Graph for cases confirmed by the RF, it was possible to find patterns among the individuals that can help clinicians understand the interconnection among the blood parameters, either globally or on a case-by-case basis. The results are in accordance with the literature, and the proposed methodology may be embedded in an electronic health record system. |
format | Online Article Text |
id | pubmed-7962588 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Elsevier Ltd. |
record_format | MEDLINE/PubMed |
spelling | pubmed-7962588 2021-03-16 Explaining machine learning based diagnosis of COVID-19 from routine blood tests with decision trees and criteria graphs Alves, Marcos Antonio; Castro, Giulia Zanon; Oliveira, Bruno Alberto Soares; Ferreira, Leonardo Augusto; Ramírez, Jaime Arturo; Silva, Rodrigo; Guimarães, Frederico Gadelha Comput Biol Med Article The sudden outbreak of coronavirus disease 2019 (COVID-19) revealed the need for fast and reliable automatic tools to help health teams. This paper aims to present understandable solutions based on Machine Learning (ML) techniques to deal with COVID-19 screening in routine blood tests. We tested different ML classifiers on a public dataset from the Hospital Albert Einstein, São Paulo, Brazil. After cleaning and pre-processing, the data comprise 608 patients, of whom 84 are positive for COVID-19 as confirmed by RT-PCR. To understand the model decisions, we introduce (i) a Decision Tree Explainer (DTX) for local explanation and (ii) a Criteria Graph to aggregate these explanations and portray a global picture of the results. The Random Forest (RF) classifier achieved the best results (accuracy 0.88, F1-score 0.76, sensitivity 0.66, specificity 0.91, and AUROC 0.86). By using DTX and the Criteria Graph for cases confirmed by the RF, it was possible to find patterns among the individuals that can help clinicians understand the interconnection among the blood parameters, either globally or on a case-by-case basis. The results are in accordance with the literature, and the proposed methodology may be embedded in an electronic health record system. Elsevier Ltd. 2021-05 2021-03-16 /pmc/articles/PMC7962588/ /pubmed/33812263 http://dx.doi.org/10.1016/j.compbiomed.2021.104335 Text en © 2021 Elsevier Ltd. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active. |
spellingShingle | Article Alves, Marcos Antonio; Castro, Giulia Zanon; Oliveira, Bruno Alberto Soares; Ferreira, Leonardo Augusto; Ramírez, Jaime Arturo; Silva, Rodrigo; Guimarães, Frederico Gadelha Explaining machine learning based diagnosis of COVID-19 from routine blood tests with decision trees and criteria graphs |
title | Explaining machine learning based diagnosis of COVID-19 from routine blood tests with decision trees and criteria graphs |
title_sort | explaining machine learning based diagnosis of covid-19 from routine blood tests with decision trees and criteria graphs |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7962588/ https://www.ncbi.nlm.nih.gov/pubmed/33812263 http://dx.doi.org/10.1016/j.compbiomed.2021.104335 |
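
The description field quotes accuracy, F1-score, sensitivity, and specificity for the Random Forest classifier. All four can be derived from the counts of a binary confusion matrix, and the following is a minimal sketch of that arithmetic. The names `ConfusionCounts` and `screening_metrics` are hypothetical, the counts in the usage lines are made up for illustration and are not the paper's results, and the record does not say how the authors averaged the F1-score, so the positive-class F1 shown here is an assumption. AUROC is left out because it needs per-patient scores rather than counts.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical container)."""
    tp: int  # positives correctly flagged
    fp: int  # negatives wrongly flagged
    tn: int  # negatives correctly cleared
    fn: int  # positives missed


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator != 0 else math.nan


def screening_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute accuracy, sensitivity, specificity, precision and positive-class F1."""
    total = c.tp + c.fp + c.tn + c.fn
    sensitivity = _ratio(c.tp, c.tp + c.fn)   # recall on the positive class
    specificity = _ratio(c.tn, c.tn + c.fp)   # recall on the negative class
    precision = _ratio(c.tp, c.tp + c.fp)
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Illustrative counts only; they are NOT taken from the paper.
example = ConfusionCounts(tp=8, fp=5, tn=45, fn=2)
for name, value in screening_metrics(example).items():
    print(f"{name}: {value:.3f}")
```

With an imbalanced sample such as the 84 positives out of 608 patients mentioned in the description, accuracy is dominated by the negative class, which is why sensitivity and specificity are reported alongside it.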