Extending Classification Algorithms to Case-Control Studies

Classification is a common technique applied to ’omics data to build predictive models and identify potential markers of biomedical outcomes. Despite the prevalence of case-control studies, the number of classification methods available to analyze data generated by such studies is extremely limited....

Bibliographic Details
Main authors: Stanfill, Bryan, Reehl, Sarah, Bramer, Lisa, Nakayasu, Ernesto S, Rich, Stephen S, Metz, Thomas O, Rewers, Marian, Webb-Robertson, Bobbie-Jo
Format: Online Article Text
Language: English
Published: SAGE Publications 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6630079/
https://www.ncbi.nlm.nih.gov/pubmed/31320812
http://dx.doi.org/10.1177/1179597219858954
_version_ 1783435224162500608
author Stanfill, Bryan
Reehl, Sarah
Bramer, Lisa
Nakayasu, Ernesto S
Rich, Stephen S
Metz, Thomas O
Rewers, Marian
Webb-Robertson, Bobbie-Jo
author_facet Stanfill, Bryan
Reehl, Sarah
Bramer, Lisa
Nakayasu, Ernesto S
Rich, Stephen S
Metz, Thomas O
Rewers, Marian
Webb-Robertson, Bobbie-Jo
author_sort Stanfill, Bryan
collection PubMed
description Classification is a common technique applied to ’omics data to build predictive models and identify potential markers of biomedical outcomes. Despite the prevalence of case-control studies, the number of classification methods available to analyze data generated by such studies is extremely limited. Conditional logistic regression is the most commonly used technique, but the associated modeling assumptions limit its ability to identify a large class of sufficiently complicated ’omic signatures. We propose a data preprocessing step which generalizes and makes any linear or nonlinear classification algorithm, even those typically not appropriate for matched design data, available to be used to model case-control data and identify relevant biomarkers in these study designs. We demonstrate on simulated case-control data that both the classification and variable selection accuracy of each method is improved after applying this processing step and that the proposed methods are comparable to or outperform existing variable selection methods. Finally, we demonstrate the impact of conditional classification algorithms on a large cohort study of children with islet autoimmunity.
format Online
Article
Text
id pubmed-6630079
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher SAGE Publications
record_format MEDLINE/PubMed
spelling pubmed-66300792019-07-18 Extending Classification Algorithms to Case-Control Studies Stanfill, Bryan Reehl, Sarah Bramer, Lisa Nakayasu, Ernesto S Rich, Stephen S Metz, Thomas O Rewers, Marian Webb-Robertson, Bobbie-Jo Biomed Eng Comput Biol Technical Advances Classification is a common technique applied to ’omics data to build predictive models and identify potential markers of biomedical outcomes. Despite the prevalence of case-control studies, the number of classification methods available to analyze data generated by such studies is extremely limited. Conditional logistic regression is the most commonly used technique, but the associated modeling assumptions limit its ability to identify a large class of sufficiently complicated ’omic signatures. We propose a data preprocessing step which generalizes and makes any linear or nonlinear classification algorithm, even those typically not appropriate for matched design data, available to be used to model case-control data and identify relevant biomarkers in these study designs. We demonstrate on simulated case-control data that both the classification and variable selection accuracy of each method is improved after applying this processing step and that the proposed methods are comparable to or outperform existing variable selection methods. Finally, we demonstrate the impact of conditional classification algorithms on a large cohort study of children with islet autoimmunity. 
SAGE Publications 2019-07-15 /pmc/articles/PMC6630079/ /pubmed/31320812 http://dx.doi.org/10.1177/1179597219858954 Text en © The Author(s) 2019 http://www.creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (http://www.creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).
spellingShingle Technical Advances
Stanfill, Bryan
Reehl, Sarah
Bramer, Lisa
Nakayasu, Ernesto S
Rich, Stephen S
Metz, Thomas O
Rewers, Marian
Webb-Robertson, Bobbie-Jo
Extending Classification Algorithms to Case-Control Studies
title Extending Classification Algorithms to Case-Control Studies
title_sort extending classification algorithms to case-control studies
topic Technical Advances
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6630079/
https://www.ncbi.nlm.nih.gov/pubmed/31320812
http://dx.doi.org/10.1177/1179597219858954