Explaining machine learning based diagnosis of COVID-19 from routine blood tests with decision trees and criteria graphs

The sudden outbreak of coronavirus disease 2019 (COVID-19) revealed the need for fast and reliable automatic tools to help health teams. This paper aims to present understandable solutions based on Machine Learning (ML) techniques to deal with COVID-19 screening in routine blood tests. We tested dif...

Bibliographic Details
Main authors: Alves, Marcos Antonio, Castro, Giulia Zanon, Oliveira, Bruno Alberto Soares, Ferreira, Leonardo Augusto, Ramírez, Jaime Arturo, Silva, Rodrigo, Guimarães, Frederico Gadelha
Format: Online Article Text
Language: English
Published: Elsevier Ltd. 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7962588/
https://www.ncbi.nlm.nih.gov/pubmed/33812263
http://dx.doi.org/10.1016/j.compbiomed.2021.104335
author Alves, Marcos Antonio
Castro, Giulia Zanon
Oliveira, Bruno Alberto Soares
Ferreira, Leonardo Augusto
Ramírez, Jaime Arturo
Silva, Rodrigo
Guimarães, Frederico Gadelha
collection PubMed
description The sudden outbreak of coronavirus disease 2019 (COVID-19) revealed the need for fast and reliable automatic tools to help health teams. This paper aims to present understandable solutions based on Machine Learning (ML) techniques to deal with COVID-19 screening in routine blood tests. We tested different ML classifiers on a public dataset from the Hospital Albert Einstein, São Paulo, Brazil. After cleaning and pre-processing, the data comprise 608 patients, of whom 84 are positive for COVID-19 as confirmed by RT-PCR. To understand the model decisions, we introduce (i) a Decision Tree Explainer (DTX) for local explanation and (ii) a Criteria Graph to aggregate these explanations and portray a global picture of the results. The Random Forest (RF) classifier achieved the best results (accuracy 0.88, F1-score 0.76, sensitivity 0.66, specificity 0.91, and AUROC 0.86). By using DTX and the Criteria Graph for cases confirmed by the RF, it was possible to find patterns among individuals that can help clinicians understand the interconnection among the blood parameters, either globally or on a case-by-case basis. The results are in accordance with the literature, and the proposed methodology may be embedded in an electronic health record system.
format Online
Article
Text
id pubmed-7962588
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Elsevier Ltd.
record_format MEDLINE/PubMed
spelling pubmed-7962588 2021-03-16 Comput Biol Med Article Elsevier Ltd. 2021-05 2021-03-16 /pmc/articles/PMC7962588/ /pubmed/33812263 http://dx.doi.org/10.1016/j.compbiomed.2021.104335 Text en © 2021 Elsevier Ltd. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre, including this research content, immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database, with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
title Explaining machine learning based diagnosis of COVID-19 from routine blood tests with decision trees and criteria graphs
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7962588/
https://www.ncbi.nlm.nih.gov/pubmed/33812263
http://dx.doi.org/10.1016/j.compbiomed.2021.104335