An introduction to new robust linear and monotonic correlation coefficients

BACKGROUND: The most common measure of association between two continuous variables is the Pearson correlation (Maronna et al. in Safari an OMC. Robust statistics, 2019. https://login.proxy.bib.uottawa.ca/login?url=https://learning.oreilly.com/library/view/-/9781119214687/?ar&orpq&email=^u)....

Bibliographic Details
Main authors: Tabatabai, Mohammad, Bailey, Stephanie, Bursac, Zoran, Tabatabai, Habib, Wilus, Derek, Singh, Karan P.
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8011137/
https://www.ncbi.nlm.nih.gov/pubmed/33789571
http://dx.doi.org/10.1186/s12859-021-04098-4
author Tabatabai, Mohammad
Bailey, Stephanie
Bursac, Zoran
Tabatabai, Habib
Wilus, Derek
Singh, Karan P.
author_sort Tabatabai, Mohammad
collection PubMed
description BACKGROUND: The most common measure of association between two continuous variables is the Pearson correlation (Maronna et al. in Safari an OMC. Robust statistics, 2019. https://login.proxy.bib.uottawa.ca/login?url=https://learning.oreilly.com/library/view/-/9781119214687/?ar&orpq&email=^u). When outliers are present, Pearson does not accurately measure association and robust measures are needed. This article introduces three new robust measures of correlation: Taba (T), TabWil (TW), and TabWil rank (TWR). The correlation estimators T and TW measure a linear association between two continuous or ordinal variables, whereas TWR measures a monotonic association. The robustness of these proposed measures in comparison with Pearson (P), Spearman (S), Quadrant (Q), Median (M), and Minimum Covariance Determinant (MCD) is examined through simulation. Taba distance is used to analyze genes, and statistical tests are used to identify the genes most significantly associated with Williams Syndrome (WS). RESULTS: Based on the root mean square error (RMSE) and bias, the three proposed correlation measures are highly competitive when compared to classical measures such as P and S as well as robust measures such as Q, M, and MCD. Our findings indicate TBL2 was the most significant gene among patients diagnosed with WS and had the most significant reduction in gene expression level when compared with control (P value = 6.37E-05). CONCLUSIONS: Overall, when the distribution is bivariate Log-Normal or bivariate Weibull, TWR performs best in terms of bias and T performs best with respect to RMSE. Under the Normal distribution, MCD performs well with respect to bias and RMSE, but the TW, TWR, T, S, and P correlations were in close proximity. The identification of TBL2 may serve as a diagnostic tool for WS patients. A Taba R package has been developed and is available for use to perform all necessary computations for the proposed methods.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04098-4.
format Online
Article
Text
id pubmed-8011137
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8011137 2021-03-31 An introduction to new robust linear and monotonic correlation coefficients Tabatabai, Mohammad Bailey, Stephanie Bursac, Zoran Tabatabai, Habib Wilus, Derek Singh, Karan P. BMC Bioinformatics Methodology Article BioMed Central 2021-03-31 /pmc/articles/PMC8011137/ /pubmed/33789571 http://dx.doi.org/10.1186/s12859-021-04098-4 Text en
© The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title An introduction to new robust linear and monotonic correlation coefficients
topic Methodology Article
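
The abstract's statement that Pearson does not accurately measure association when outliers are present can be illustrated with a small simulation. The sketch below is not the authors' code and does not implement T, TW, or TWR, whose definitions are not given in this record. It only compares three of the reference measures named in the abstract (P, S, and Q, the latter using the usual sign-about-the-median definition) on bivariate normal data before and after contamination. The sample size, true correlation of 0.8, seed, and the choice to replace 5% of the points with the value (8, -8) are all assumptions made for illustration.

```python
import math
import random
from statistics import median


def pearson(x, y):
    """Pearson product-moment correlation (P)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(v):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def spearman(x, y):
    """Spearman correlation (S): Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def sign(t):
    return (t > 0) - (t < 0)


def quadrant(x, y):
    """Quadrant correlation (Q): mean sign product about the medians."""
    mx, my = median(x), median(y)
    return sum(sign(a - mx) * sign(b - my) for a, b in zip(x, y)) / len(x)


# Assumed simulation settings (illustrative only, not from the article).
RHO, N, N_OUT = 0.8, 200, 10
rng = random.Random(0)

x, y = [], []
for _ in range(N):
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    x.append(z1)
    y.append(RHO * z1 + math.sqrt(1 - RHO ** 2) * z2)

# Contaminate: replace the last N_OUT points with identical outliers.
x_out = x[:-N_OUT] + [8.0] * N_OUT
y_out = y[:-N_OUT] + [-8.0] * N_OUT

estimators = {"P": pearson, "S": spearman, "Q": quadrant}
clean = {k: f(x, y) for k, f in estimators.items()}
contaminated = {k: f(x_out, y_out) for k, f in estimators.items()}

for k in estimators:
    print(f"{k}: clean={clean[k]:.3f} contaminated={contaminated[k]:.3f}")
```

Under these assumptions the contaminated Pearson estimate moves much further from the clean value than the rank-based and sign-based estimates do. The article's own comparison uses RMSE and bias over repeated simulations, which this single-sample sketch does not attempt.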