
What variables are important in predicting bovine viral diarrhea virus? A random forest approach

Bovine viral diarrhea virus (BVDV) causes one of the most economically important diseases in cattle, and the virus is found worldwide. A better understanding of the disease-associated factors is a crucial step towards the definition of strategies for control and eradication. In this study we trained...


Bibliographic Details
Main authors: Machado, Gustavo; Mendoza, Mariana Recamonde; Corbellini, Luis Gustavo
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4513962/
https://www.ncbi.nlm.nih.gov/pubmed/26208851
http://dx.doi.org/10.1186/s13567-015-0219-7
author Machado, Gustavo
Mendoza, Mariana Recamonde
Corbellini, Luis Gustavo
collection PubMed
description Bovine viral diarrhea virus (BVDV) causes one of the most economically important diseases in cattle, and the virus is found worldwide. A better understanding of the disease-associated factors is a crucial step towards the definition of strategies for control and eradication. In this study we trained a random forest (RF) prediction model and performed variable importance analysis to identify factors associated with BVDV occurrence. In addition, we assessed the influence of feature selection on RF performance and evaluated its predictive power relative to other popular classifiers and to logistic regression. We found that the RF classification model resulted in an average error rate of 32.03% for the negative class (negative for BVDV) and 36.78% for the positive class (positive for BVDV). The RF model presented an area under the ROC curve equal to 0.702. Variable importance analysis revealed that important predictors of BVDV occurrence were: a) who inseminates the animals, b) number of neighboring farms that have cattle and c) rectal palpation performed routinely. Our results suggest that the use of machine learning algorithms, especially RF, is a promising methodology for the analysis of cross-sectional studies, presenting a satisfactory predictive power and the ability to identify predictors that represent potential risk factors for BVDV investigation. We examined classical predictors and found some new and hard-to-control practices that may lead to the spread of this disease within and among farms, mainly regarding poor or neglected reproduction management, which should be considered for disease control and eradication. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13567-015-0219-7) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4513962
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4513962 2015-07-25 What variables are important in predicting bovine viral diarrhea virus? A random forest approach Machado, Gustavo Mendoza, Mariana Recamonde Corbellini, Luis Gustavo Vet Res Research Article BioMed Central 2015-07-24 2015 /pmc/articles/PMC4513962/ /pubmed/26208851 http://dx.doi.org/10.1186/s13567-015-0219-7 Text en © Machado et al. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title What variables are important in predicting bovine viral diarrhea virus? A random forest approach
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4513962/
https://www.ncbi.nlm.nih.gov/pubmed/26208851
http://dx.doi.org/10.1186/s13567-015-0219-7