MDR-ER: Balancing Functions for Adjusting the Ratio in Risk Classes and Classification Errors for Imbalanced Cases and Controls Using Multifactor-Dimensionality Reduction
BACKGROUND: Determining the complex relationship between diseases, polymorphisms in human genes and environmental factors is challenging. Multifactor dimensionality reduction (MDR) has proven capable of effectively detecting statistical patterns of epistasis. However, MDR has its weakness in accurat...
Main Authors: | Yang, Cheng-Hong; Lin, Yu-Da; Chuang, Li-Yeh; Chen, Jin-Bor; Chang, Hsueh-Wei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Public Library of Science
2013
|
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3827354/ https://www.ncbi.nlm.nih.gov/pubmed/24236125 http://dx.doi.org/10.1371/journal.pone.0079387 |
_version_ | 1782478227522977792 |
---|---|
author | Yang, Cheng-Hong; Lin, Yu-Da; Chuang, Li-Yeh; Chen, Jin-Bor; Chang, Hsueh-Wei |
author_sort | Yang, Cheng-Hong |
collection | PubMed |
description | BACKGROUND: Determining the complex relationship between diseases, polymorphisms in human genes and environmental factors is challenging. Multifactor dimensionality reduction (MDR) has proven capable of effectively detecting statistical patterns of epistasis. However, MDR is weak at accurately assigning multi-locus genotypes to either high-risk or low-risk groups, and generally does not provide accurate error rates when the case and control data sets are imbalanced. Consequently, classification error rates and odds ratios (OR) may take surprising values because the true positive (TP) value is often small. METHODOLOGY/PRINCIPAL FINDINGS: To address this problem, we introduce a classifier function, based on the ratio between the percentage of cases in the case data and the percentage of controls in the control data, that improves MDR (MDR-ER) so that multi-locus genotypes are classified correctly into high-risk and low-risk groups. In this study, a real data set with different ratios of cases to controls (1∶4) was obtained from the mitochondrial D-loop of chronic dialysis patients in order to test MDR-ER. The TP and TN values were collected from all tests to analyze to what degree MDR-ER performed better than MDR. CONCLUSIONS/SIGNIFICANCE: Results showed that MDR-ER can be successfully used to detect the complex associations in imbalanced data sets. |
format | Online Article Text |
id | pubmed-3827354 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3827354 2013-11-14 PLoS One Research Article Public Library of Science 2013-11-13 /pmc/articles/PMC3827354/ /pubmed/24236125 http://dx.doi.org/10.1371/journal.pone.0079387 Text en © 2013 Yang et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
title | MDR-ER: Balancing Functions for Adjusting the Ratio in Risk Classes and Classification Errors for Imbalanced Cases and Controls Using Multifactor-Dimensionality Reduction |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3827354/ https://www.ncbi.nlm.nih.gov/pubmed/24236125 http://dx.doi.org/10.1371/journal.pone.0079387 |
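The abstract describes the MDR-ER classifier only in words: each multi-locus genotype is assigned to the high-risk or low-risk group by comparing the percentage of cases in the case data with the percentage of controls in the control data. The Python sketch below illustrates that idea next to a plain count comparison. It is not the authors' implementation. The function names, the tie rule (a tie counts as high-risk), the cutoff of 1 for both rules, and the toy genotype counts are all assumptions made for illustration.

```python
from collections import Counter


def classify_cells(genotypes, labels, adjusted=True):
    """Label each multi-locus genotype "H" (high-risk) or "L" (low-risk).

    genotypes: list of hashable genotype combinations, one per subject.
    labels:    list of 1 (case) or 0 (control), one per subject.
    adjusted:  True  -> compare the share of all cases with the share of
                        all controls (the ratio idea from the abstract);
               False -> compare raw case and control counts.
    """
    cases = Counter(g for g, y in zip(genotypes, labels) if y == 1)
    controls = Counter(g for g, y in zip(genotypes, labels) if y == 0)
    n_cases = sum(cases.values())
    n_controls = sum(controls.values())

    risk = {}
    for g in set(cases) | set(controls):
        if adjusted:
            # cases[g]/n_cases >= controls[g]/n_controls, cross-multiplied
            # so that no division by zero can occur.
            lhs, rhs = cases[g] * n_controls, controls[g] * n_cases
        else:
            lhs, rhs = cases[g], controls[g]
        risk[g] = "H" if lhs >= rhs else "L"  # assumed tie rule: high-risk
    return risk


def confusion(genotypes, labels, risk):
    """Return (TP, FP, TN, FN), treating high-risk as 'predicted case'."""
    tp = fp = tn = fn = 0
    for g, y in zip(genotypes, labels):
        predicted_case = risk[g] == "H"
        if y == 1 and predicted_case:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_case:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


# Hypothetical toy data with a 1:4 case-to-control ratio (10 cases, 40 controls).
cell_a = ("AA", "BB")  # 6 cases, 12 controls
cell_b = ("Aa", "Bb")  # 4 cases, 28 controls
genotypes = [cell_a] * 6 + [cell_b] * 4 + [cell_a] * 12 + [cell_b] * 28
labels = [1] * 10 + [0] * 40

plain = classify_cells(genotypes, labels, adjusted=False)
balanced = classify_cells(genotypes, labels, adjusted=True)
plain_counts = confusion(genotypes, labels, plain)
balanced_counts = confusion(genotypes, labels, balanced)
```

With these toy counts, the raw comparison labels both genotypes low-risk, so the true positive count is 0. This matches the small-TP problem the abstract attributes to imbalanced data. The share-based comparison labels the first genotype high-risk, because it holds 60% of the cases but only 30% of the controls.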