An introduction to new robust linear and monotonic correlation coefficients
BACKGROUND: The most common measure of association between two continuous variables is the Pearson correlation (Maronna et al., Robust Statistics, 2019)....
Main authors: | Tabatabai, Mohammad; Bailey, Stephanie; Bursac, Zoran; Tabatabai, Habib; Wilus, Derek; Singh, Karan P. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central, 2021 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8011137/ https://www.ncbi.nlm.nih.gov/pubmed/33789571 http://dx.doi.org/10.1186/s12859-021-04098-4 |
_version_ | 1783673187500818432 |
---|---|
author | Tabatabai, Mohammad Bailey, Stephanie Bursac, Zoran Tabatabai, Habib Wilus, Derek Singh, Karan P. |
author_facet | Tabatabai, Mohammad Bailey, Stephanie Bursac, Zoran Tabatabai, Habib Wilus, Derek Singh, Karan P. |
author_sort | Tabatabai, Mohammad |
collection | PubMed |
description | BACKGROUND: The most common measure of association between two continuous variables is the Pearson correlation (Maronna et al., Robust Statistics, 2019). When outliers are present, Pearson does not accurately measure association and robust measures are needed. This article introduces three new robust measures of correlation: Taba (T), TabWil (TW), and TabWil rank (TWR). The correlation estimators T and TW measure a linear association between two continuous or ordinal variables, whereas TWR measures a monotonic association. The robustness of these proposed measures in comparison with Pearson (P), Spearman (S), Quadrant (Q), Median (M), and Minimum Covariance Determinant (MCD) is examined through simulation. Taba distance is used to analyze genes, and statistical tests were used to identify those genes most significantly associated with Williams Syndrome (WS). RESULTS: Based on the root mean square error (RMSE) and bias, the three proposed correlation measures are highly competitive when compared to classical measures such as P and S as well as robust measures such as Q, M, and MCD. Our findings indicate TBL2 was the most significant gene among patients diagnosed with WS and had the most significant reduction in gene expression level when compared with control (P value = 6.37E-05). CONCLUSIONS: Overall, when the distribution is bivariate Log-Normal or bivariate Weibull, TWR performs best in terms of bias and T performs best with respect to RMSE. Under the Normal distribution, MCD performs well with respect to bias and RMSE, but the TW, TWR, T, S, and P correlations were in close proximity. The identification of TBL2 may serve as a diagnostic tool for WS patients. A Taba R package has been developed and is available for use to perform all necessary computations for the proposed methods. 
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04098-4. |
format | Online Article Text |
id | pubmed-8011137 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-80111372021-03-31 An introduction to new robust linear and monotonic correlation coefficients Tabatabai, Mohammad Bailey, Stephanie Bursac, Zoran Tabatabai, Habib Wilus, Derek Singh, Karan P. BMC Bioinformatics Methodology Article BACKGROUND: The most common measure of association between two continuous variables is the Pearson correlation (Maronna et al., Robust Statistics, 2019). When outliers are present, Pearson does not accurately measure association and robust measures are needed. This article introduces three new robust measures of correlation: Taba (T), TabWil (TW), and TabWil rank (TWR). The correlation estimators T and TW measure a linear association between two continuous or ordinal variables, whereas TWR measures a monotonic association. The robustness of these proposed measures in comparison with Pearson (P), Spearman (S), Quadrant (Q), Median (M), and Minimum Covariance Determinant (MCD) is examined through simulation. Taba distance is used to analyze genes, and statistical tests were used to identify those genes most significantly associated with Williams Syndrome (WS). RESULTS: Based on the root mean square error (RMSE) and bias, the three proposed correlation measures are highly competitive when compared to classical measures such as P and S as well as robust measures such as Q, M, and MCD. Our findings indicate TBL2 was the most significant gene among patients diagnosed with WS and had the most significant reduction in gene expression level when compared with control (P value = 6.37E-05). CONCLUSIONS: Overall, when the distribution is bivariate Log-Normal or bivariate Weibull, TWR performs best in terms of bias and T performs best with respect to RMSE. Under the Normal distribution, MCD performs well with respect to bias and RMSE, but the TW, TWR, T, S, and P correlations were in close proximity. 
The identification of TBL2 may serve as a diagnostic tool for WS patients. A Taba R package has been developed and is available for use to perform all necessary computations for the proposed methods. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04098-4. BioMed Central 2021-03-31 /pmc/articles/PMC8011137/ /pubmed/33789571 http://dx.doi.org/10.1186/s12859-021-04098-4 Text en © The Author(s) 2021 Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Methodology Article Tabatabai, Mohammad Bailey, Stephanie Bursac, Zoran Tabatabai, Habib Wilus, Derek Singh, Karan P. An introduction to new robust linear and monotonic correlation coefficients |
title | An introduction to new robust linear and monotonic correlation coefficients |
title_full | An introduction to new robust linear and monotonic correlation coefficients |
title_fullStr | An introduction to new robust linear and monotonic correlation coefficients |
title_full_unstemmed | An introduction to new robust linear and monotonic correlation coefficients |
title_short | An introduction to new robust linear and monotonic correlation coefficients |
title_sort | introduction to new robust linear and monotonic correlation coefficients |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8011137/ https://www.ncbi.nlm.nih.gov/pubmed/33789571 http://dx.doi.org/10.1186/s12859-021-04098-4 |
work_keys_str_mv | AT tabatabaimohammad anintroductiontonewrobustlinearandmonotoniccorrelationcoefficients AT baileystephanie anintroductiontonewrobustlinearandmonotoniccorrelationcoefficients AT bursaczoran anintroductiontonewrobustlinearandmonotoniccorrelationcoefficients AT tabatabaihabib anintroductiontonewrobustlinearandmonotoniccorrelationcoefficients AT wilusderek anintroductiontonewrobustlinearandmonotoniccorrelationcoefficients AT singhkaranp anintroductiontonewrobustlinearandmonotoniccorrelationcoefficients AT tabatabaimohammad introductiontonewrobustlinearandmonotoniccorrelationcoefficients AT baileystephanie introductiontonewrobustlinearandmonotoniccorrelationcoefficients AT bursaczoran introductiontonewrobustlinearandmonotoniccorrelationcoefficients AT tabatabaihabib introductiontonewrobustlinearandmonotoniccorrelationcoefficients AT wilusderek introductiontonewrobustlinearandmonotoniccorrelationcoefficients AT singhkaranp introductiontonewrobustlinearandmonotoniccorrelationcoefficients |
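
The abstract's remark that Pearson correlation breaks down when outliers are present can be illustrated with a small sketch. The code below compares three of the reference measures named in the record (Pearson, Spearman, and Quadrant) on a perfectly linear toy sample before and after one value is replaced by a gross outlier. The data, the function names, and the textbook form of the quadrant correlation (mean sign of the product of median-centred values) are assumptions made for illustration; this is not the Taba, TabWil, or TabWil rank estimator, and it does not use the Taba R package.

```python
import math
import statistics


def pearson(x, y):
    """Classical Pearson product-moment correlation."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def quadrant(x, y):
    """Quadrant correlation (textbook form, assumed here):
    mean sign of the product of median-centred coordinates."""
    mx, my = statistics.median(x), statistics.median(y)
    return sum(sign((a - mx) * (b - my)) for a, b in zip(x, y)) / len(x)


# Hypothetical toy data: an exact line y = 2x + 1 ...
x = [float(i) for i in range(1, 11)]
y = [2.0 * v + 1.0 for v in x]
# ... and the same sample with the last response replaced by an outlier.
y_out = y[:-1] + [-100.0]

for name, fn in (("Pearson", pearson), ("Spearman", spearman), ("Quadrant", quadrant)):
    print(f"{name:9s} clean={fn(x, y):+.3f}  with outlier={fn(x, y_out):+.3f}")
```

On this toy sample a single contaminated point is enough to flip the sign of the Pearson estimate, while the rank-based and sign-based measures stay positive. That is the kind of behaviour the simulation study described in the abstract quantifies through bias and RMSE.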