Mucopolysaccharidosis type II detection by Naïve Bayes Classifier: An example of patient classification for a rare disease using electronic medical records from the Canadian Primary Care Sentinel Surveillance Network

Identifying patients with rare diseases associated with common symptoms is challenging. Hunter syndrome, or Mucopolysaccharidosis type II, is a progressive rare disease caused by a deficiency in the activity of the lysosomal enzyme iduronate 2-sulphatase. It is inherited in an X-linked manner result...

Bibliographic Details
Main authors: Ehsani-Moghaddam, Behrouz, Queenan, John A., MacKenzie, Jennifer, Birtwhistle, Richard V.
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6300265/
https://www.ncbi.nlm.nih.gov/pubmed/30566525
http://dx.doi.org/10.1371/journal.pone.0209018
author Ehsani-Moghaddam, Behrouz
Queenan, John A.
MacKenzie, Jennifer
Birtwhistle, Richard V.
collection PubMed
description Identifying patients with rare diseases associated with common symptoms is challenging. Hunter syndrome, or Mucopolysaccharidosis type II, is a progressive rare disease caused by a deficiency in the activity of the lysosomal enzyme iduronate 2-sulphatase. It is inherited in an X-linked manner, resulting in males being significantly affected. Expression in females varies, with the majority being unaffected, although symptoms may emerge over time. We developed a Naïve Bayes classification (NBC) algorithm utilizing the clinical diagnosis and symptoms of patients contained within their de-identified and unstructured electronic medical records (EMR) extracted by the Canadian Primary Care Sentinel Surveillance Network (CPCSSN). To do so, we created a training dataset using published results in the scientific literature and from all MPS II symptoms, and applied the training dataset and its independent features to compute the conditional posterior probabilities of having MPS II disease as a categorical dependent variable for 506497 male patients. The classifier identified 125 patients with the highest likelihood of having the disease, and 18 features were selected as necessary for forecasting. Next, a Recursive Backward Feature Elimination algorithm was employed to select the optimal input features of the NBC model, using k-fold cross-validation with 3 replicates. The accuracy of the final model was estimated by the validation set approach and bootstrap resampling. We also investigated whether the NBC is as accurate as three other Bayesian networks. The Naïve Bayes Classifier appears to be an efficient algorithm for assisting physicians with the diagnosis of Hunter syndrome, allowing optimal patient management.
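The posterior computation mentioned in the description can be illustrated with a minimal sketch of a Bernoulli Naïve Bayes classifier over binary symptom flags. This is not the authors' implementation: the feature names, the toy training rows, and the Laplace smoothing constant `alpha` are all assumptions made purely for illustration, and the numbers carry no clinical meaning.

```python
import math

# Hypothetical symptom flags (illustrative names only, not the study's 18 features).
FEATURES = ["hernia", "hearing_loss", "joint_stiffness"]


def train_bernoulli_nb(rows, labels, alpha=1.0):
    """Estimate class priors and P(feature = 1 | class) with Laplace smoothing."""
    classes = sorted(set(labels))
    n_features = len(rows[0])
    priors, likelihoods = {}, {}
    for c in classes:
        members = [r for r, y in zip(rows, labels) if y == c]
        priors[c] = len(members) / len(rows)
        likelihoods[c] = [
            (sum(r[j] for r in members) + alpha) / (len(members) + 2 * alpha)
            for j in range(n_features)
        ]
    return priors, likelihoods


def posterior(x, priors, likelihoods):
    """Return P(class | x) for a binary feature vector x via Bayes' rule."""
    log_joint = {}
    for c, prior in priors.items():
        total = math.log(prior)
        for xj, p in zip(x, likelihoods[c]):
            # Naive assumption: features are independent given the class.
            total += math.log(p if xj else 1.0 - p)
        log_joint[c] = total
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_joint.values())
    unnorm = {c: math.exp(v - peak) for c, v in log_joint.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}


# Made-up toy training set: label 1 = "disease-like", label 0 = "other".
rows = [[1, 1, 1], [1, 0, 1], [0, 1, 1],
        [0, 0, 0], [1, 0, 0], [0, 0, 0], [0, 1, 0], [0, 0, 0]]
labels = [1, 1, 1, 0, 0, 0, 0, 0]

priors, likelihoods = train_bernoulli_nb(rows, labels)
post_all = posterior([1, 1, 1], priors, likelihoods)
post_none = posterior([0, 0, 0], priors, likelihoods)
```

Ranking patients by the posterior of the disease class, as in `post_all[1]`, is one plausible way to shortlist the records with the highest likelihood; how the study actually ranked its 125 patients is not stated in this record.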
format Online
Article
Text
id pubmed-6300265
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6300265 2018-12-28 Mucopolysaccharidosis type II detection by Naïve Bayes Classifier: An example of patient classification for a rare disease using electronic medical records from the Canadian Primary Care Sentinel Surveillance Network Ehsani-Moghaddam, Behrouz Queenan, John A. MacKenzie, Jennifer Birtwhistle, Richard V. PLoS One Research Article Identifying patients with rare diseases associated with common symptoms is challenging. Hunter syndrome, or Mucopolysaccharidosis type II, is a progressive rare disease caused by a deficiency in the activity of the lysosomal enzyme iduronate 2-sulphatase. It is inherited in an X-linked manner, resulting in males being significantly affected. Expression in females varies, with the majority being unaffected, although symptoms may emerge over time. We developed a Naïve Bayes classification (NBC) algorithm utilizing the clinical diagnosis and symptoms of patients contained within their de-identified and unstructured electronic medical records (EMR) extracted by the Canadian Primary Care Sentinel Surveillance Network (CPCSSN). To do so, we created a training dataset using published results in the scientific literature and from all MPS II symptoms, and applied the training dataset and its independent features to compute the conditional posterior probabilities of having MPS II disease as a categorical dependent variable for 506497 male patients. The classifier identified 125 patients with the highest likelihood of having the disease, and 18 features were selected as necessary for forecasting. Next, a Recursive Backward Feature Elimination algorithm was employed to select the optimal input features of the NBC model, using k-fold cross-validation with 3 replicates. The accuracy of the final model was estimated by the validation set approach and bootstrap resampling. We also investigated whether the NBC is as accurate as three other Bayesian networks. The Naïve Bayes Classifier appears to be an efficient algorithm for assisting physicians with the diagnosis of Hunter syndrome, allowing optimal patient management. Public Library of Science 2018-12-19 /pmc/articles/PMC6300265/ /pubmed/30566525 http://dx.doi.org/10.1371/journal.pone.0209018 Text en © 2018 Ehsani-Moghaddam et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Mucopolysaccharidosis type II detection by Naïve Bayes Classifier: An example of patient classification for a rare disease using electronic medical records from the Canadian Primary Care Sentinel Surveillance Network
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6300265/
https://www.ncbi.nlm.nih.gov/pubmed/30566525
http://dx.doi.org/10.1371/journal.pone.0209018