Collaborative Automation Reliably Remediating Erroneous Conclusion Threats (CARRECT)
OBJECTIVE: The objective of the CARRECT software is to make cutting edge statistical methods for reducing bias in epidemiological studies easy to use and useful for both novice and expert users. INTRODUCTION: Analyses produced by epidemiologists and public health practitioners are susceptible to bia...
Main authors: | Lansey, Jonathan C.; Picciano, Paul; Yohai, Ian; Grant, Fred; Gern, Robert |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | University of Illinois at Chicago Library, 2013 |
Subjects: | ISDS 2012 Conference Abstracts |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3692841/ |
_version_ | 1782274668458147840 |
---|---|
author | Lansey, Jonathan C. Picciano, Paul Yohai, Ian Grant, Fred Gern, Robert |
author_facet | Lansey, Jonathan C. Picciano, Paul Yohai, Ian Grant, Fred Gern, Robert |
author_sort | Lansey, Jonathan C. |
collection | PubMed |
description | OBJECTIVE: The objective of the CARRECT software is to make cutting-edge statistical methods for reducing bias in epidemiological studies easy to use and useful for both novice and expert users. INTRODUCTION: Analyses produced by epidemiologists and public health practitioners are susceptible to bias from a number of sources, including missing data, confounding variables, and statistical model selection. Understanding and applying the multitude of tests, corrections, and selection rules often requires a great deal of expertise, and these tasks can be time-consuming and burdensome. To address this challenge, Aptima began development of CARRECT, the Collaborative Automation Reliably Remediating Erroneous Conclusion Threats system. When complete, CARRECT will provide an expert system that can be embedded in an analyst’s workflow. CARRECT will support statistical bias reduction and improved analyses and decision making by engaging the user in a collaborative process in which the technology is transparent to the analyst. METHODS: Older approaches to imputing missing data, including mean imputation and single imputation regression methods, have steadily given way to a class of methods known as “multiple imputation” (hereafter “MI”; Rubin 1987). Rather than making the restrictive assumption that the data are missing completely at random (MCAR), MI typically assumes the data are missing at random (MAR). There are two key innovations behind MI. First, the observed values can be useful in predicting the missing cells, and thus specifying a joint distribution of the data is the first step in implementing the models. Second, single imputation methods will likely fail because they ignore both the inherent uncertainty in the missing values and the estimation uncertainty associated with generating the parameters in the imputation procedure itself. By contrast, drawing the missing values multiple times, thereby generating m complete datasets along with the estimated parameters of the model, properly accounts for both types of uncertainty (Rubin 1987; King et al. 2001). As a result, MI will lead to valid standard errors and confidence intervals along with unbiased point estimates. To compute the joint distribution, CARRECT uses a bootstrapping-based algorithm that gives essentially the same answers as the standard Bayesian Markov Chain Monte Carlo (MCMC) or Expectation Maximization (EM) approaches, is usually considerably faster than existing approaches, and can handle many more variables. RESULTS: Tests were conducted on one of the proposed methods with an epidemiological dataset from the Integrated Health Interview Series (IHIS), producing verifiably unbiased results despite high missingness rates. In addition, mockups (Figure 1) were created of an intuitive data wizard that guides the user through the analysis process by analyzing key features of a given dataset. The mockups also show prompts for the user to provide additional substantive knowledge to improve the handling of imperfect datasets, as well as the selection of the most appropriate algorithms and models. CONCLUSIONS: Our approach and program were designed to make bias mitigation accessible to far more than the statistical elite. We hope that it will have a wide impact on reducing bias in epidemiological studies and provide more accurate information to policymakers. |
format | Online Article Text |
id | pubmed-3692841 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | University of Illinois at Chicago Library |
record_format | MEDLINE/PubMed |
spelling | pubmed-3692841 2013-06-26 Collaborative Automation Reliably Remediating Erroneous Conclusion Threats (CARRECT) Lansey, Jonathan C. Picciano, Paul Yohai, Ian Grant, Fred Gern, Robert Online J Public Health Inform ISDS 2012 Conference Abstracts University of Illinois at Chicago Library 2013-04-04 /pmc/articles/PMC3692841/ Text en ©2013 the author(s) http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/ojphi/about/submissions#copyrightNotice This is an Open Access article. Authors own copyright of their articles appearing in the Online Journal of Public Health Informatics. Readers may copy articles without permission of the copyright owner(s), as long as the author and OJPHI are acknowledged in the copy and the copy is used for educational, not-for-profit purposes. |
spellingShingle | ISDS 2012 Conference Abstracts Lansey, Jonathan C. Picciano, Paul Yohai, Ian Grant, Fred Gern, Robert Collaborative Automation Reliably Remediating Erroneous Conclusion Threats (CARRECT) |
title | Collaborative Automation Reliably Remediating Erroneous Conclusion Threats (CARRECT) |
title_full | Collaborative Automation Reliably Remediating Erroneous Conclusion Threats (CARRECT) |
title_fullStr | Collaborative Automation Reliably Remediating Erroneous Conclusion Threats (CARRECT) |
title_full_unstemmed | Collaborative Automation Reliably Remediating Erroneous Conclusion Threats (CARRECT) |
title_short | Collaborative Automation Reliably Remediating Erroneous Conclusion Threats (CARRECT) |
title_sort | collaborative automation reliably remediating erroneous conclusion threats (carrect) |
topic | ISDS 2012 Conference Abstracts |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3692841/ |
work_keys_str_mv | AT lanseyjonathanc collaborativeautomationreliablyremediatingerroneousconclusionthreatscarrect AT piccianopaul collaborativeautomationreliablyremediatingerroneousconclusionthreatscarrect AT yohaiian collaborativeautomationreliablyremediatingerroneousconclusionthreatscarrect AT grantfred collaborativeautomationreliablyremediatingerroneousconclusionthreatscarrect AT gernrobert collaborativeautomationreliablyremediatingerroneousconclusionthreatscarrect |
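
The METHODS text in the description field says that drawing the missing values m times accounts for both the uncertainty in the missing values and the uncertainty in the imputation model's parameters. The sketch below illustrates that idea on toy data. It is not CARRECT's implementation: the bootstrap-then-regress step is a deliberately simplified stand-in for the bootstrapping-based algorithm the abstract mentions, the pooling step follows the standard combining rules from Rubin (1987), and every function name and the toy dataset are hypothetical.

```python
import random
import statistics


def pool_rubin(estimates, variances):
    """Combine m per-dataset estimates and their squared standard errors."""
    m = len(estimates)
    q_bar = statistics.fmean(estimates)   # pooled point estimate
    u_bar = statistics.fmean(variances)   # within-imputation variance
    b = statistics.variance(estimates)    # between-imputation variance
    total = u_bar + (1 + 1 / m) * b       # total variance of q_bar
    return q_bar, total


def fit_line(pairs):
    """Least-squares intercept, slope and residual sd for (x, y) pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in pairs) / sxx
    intercept = my - slope * mx
    resid = [y - (intercept + slope * x) for x, y in pairs]
    sd = (sum(r * r for r in resid) / (len(pairs) - 2)) ** 0.5
    return intercept, slope, sd


def impute_once(rows, rng):
    """Build one completed dataset from rows of (x, y) where y may be None."""
    observed = [(x, y) for x, y in rows if y is not None]
    # Refitting on a bootstrap resample varies the model parameters from one
    # imputation to the next (estimation uncertainty).
    boot = [rng.choice(observed) for _ in observed]
    a, b, sd = fit_line(boot)
    # The Gaussian noise term reflects the inherent uncertainty in each
    # missing value.
    return [(x, y if y is not None else a + b * x + rng.gauss(0, sd))
            for x, y in rows]


def mean_of_y_via_mi(rows, m=20, seed=0):
    """Estimate the mean of y and its total variance from m imputations."""
    rng = random.Random(seed)
    estimates, variances = [], []
    for _ in range(m):
        ys = [y for _, y in impute_once(rows, rng)]
        estimates.append(statistics.fmean(ys))
        variances.append(statistics.variance(ys) / len(ys))
    return pool_rubin(estimates, variances)


def make_toy_rows(n=200, seed=1):
    """Toy MAR data: y = 2 + 3x + noise, and y is missing more often when x is large."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        x = i / 20
        y = 2 + 3 * x + rng.gauss(0, 1)
        if x > 5 and rng.random() < 0.5:
            y = None  # missingness depends only on the observed x
        rows.append((x, y))
    return rows
```

Calling `mean_of_y_via_mi(make_toy_rows())` returns the pooled mean of y and its total variance. Because the toy missingness depends only on the fully observed x, the imputation model can recover the high-y values that a complete-case mean would underrepresent.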