
Quantitative Analysis of a Weak Correlation between Complicated Data on the Basis of Principal Component Analysis

The mining of weak correlation information between two data matrices with high complexity is a very challenging task. A new method named principal component analysis-based multiconfidence ellipse analysis (PCA/MCEA) was proposed in this study, which first applied a confidence ellipse to describe the...


Bibliographic Details
Main Authors: Pang, Tao, Zhang, Haitao, Wen, Liliang, Tang, Jun, Zhou, Bing, Yang, Qianxu, Li, Yong, Wang, Jiajun, Chen, Aiming, Zeng, Zhongda
Format: Online Article Text
Language: English
Published: Hindawi 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7843181/
https://www.ncbi.nlm.nih.gov/pubmed/33542846
http://dx.doi.org/10.1155/2021/8874827
_version_ 1783644095414009856
author Pang, Tao
Zhang, Haitao
Wen, Liliang
Tang, Jun
Zhou, Bing
Yang, Qianxu
Li, Yong
Wang, Jiajun
Chen, Aiming
Zeng, Zhongda
author_facet Pang, Tao
Zhang, Haitao
Wen, Liliang
Tang, Jun
Zhou, Bing
Yang, Qianxu
Li, Yong
Wang, Jiajun
Chen, Aiming
Zeng, Zhongda
author_sort Pang, Tao
collection PubMed
description Mining weak correlation information between two highly complex data matrices is a very challenging task. This study proposes a new method, principal component analysis-based multiconfidence ellipse analysis (PCA/MCEA). The method first applies confidence ellipses to describe the difference and correlation of such information among different categories of objects/samples, on the basis of a PCA of a single targeted data matrix. This gives the number of objects contained in the overlapping and nonoverlapping areas of the ellipses obtained from the PCA runs. Then, a quantitative evaluation index of the correlation between data matrices is defined by comparing the PCA results of more than one data matrix. The similarity and difference between data matrices are further quantified through comprehensive analysis of the outcomes. Complicated tobacco agriculture data, which include rich features of climate, altitude, and chemical composition of tobacco leaves, were used as an example to illustrate the strategy of the proposed method. The number of objects in these data reached 171,516, with 14, 4, and 5 descriptors of climate, altitude, and chemicals, respectively. On the basis of the new method, the complex but weak relationship between these independent and dependent variables was studied. Three widely used conventional methods were applied for comparison. The results showed the power of the new method to discover weak correlations between complicated data.
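The first step described above (PCA of one data matrix, one confidence ellipse per category, then counting the objects in the overlapping and nonoverlapping areas) can be sketched roughly as follows. This is a minimal illustration and not the authors' PCA/MCEA implementation: the abstract does not define the evaluation index, so none is computed here. The function names, the 95% confidence level, the use of the first two principal components, the autoscaling step, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

# Assumption: 95% ellipse in a 2-D score plot, i.e. the 0.95 quantile
# of a chi-square distribution with 2 degrees of freedom.
CHI2_95_2DOF = 5.991


def pca_scores(X, n_components=2):
    """Project autoscaled data onto its first principal components via SVD."""
    Xc = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T


def inside_ellipse(scores, group_scores, threshold=CHI2_95_2DOF):
    """True where a score lies inside the confidence ellipse of one group."""
    mean = group_scores.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(group_scores, rowvar=False))
    diff = scores - mean
    # Squared Mahalanobis distance of every object to the group centre.
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return d2 <= threshold


def overlap_counts(scores, labels, a, b):
    """Count objects of categories a and b by ellipse membership."""
    in_a = inside_ellipse(scores, scores[labels == a])
    in_b = inside_ellipse(scores, scores[labels == b])
    pair = (labels == a) | (labels == b)
    return {
        "overlap": int(np.sum(pair & in_a & in_b)),
        "only_a": int(np.sum(pair & in_a & ~in_b)),
        "only_b": int(np.sum(pair & ~in_a & in_b)),
        "outside": int(np.sum(pair & ~in_a & ~in_b)),
    }


# Synthetic stand-in for one data matrix with two categories of objects.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (200, 5)), rng.normal(0.8, 1.0, (200, 5))])
labels = np.array([0] * 200 + [1] * 200)
counts = overlap_counts(pca_scores(X), labels, 0, 1)
print(counts)
```

Running the same counting on a second data matrix for the same objects would give a second set of counts to compare, which is the kind of comparison the abstract describes. How those counts are combined into an index is left open here.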
format Online
Article
Text
id pubmed-7843181
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-78431812021-02-03 Quantitative Analysis of a Weak Correlation between Complicated Data on the Basis of Principal Component Analysis Pang, Tao Zhang, Haitao Wen, Liliang Tang, Jun Zhou, Bing Yang, Qianxu Li, Yong Wang, Jiajun Chen, Aiming Zeng, Zhongda J Anal Methods Chem Research Article Mining weak correlation information between two highly complex data matrices is a very challenging task. This study proposes a new method, principal component analysis-based multiconfidence ellipse analysis (PCA/MCEA). The method first applies confidence ellipses to describe the difference and correlation of such information among different categories of objects/samples, on the basis of a PCA of a single targeted data matrix. This gives the number of objects contained in the overlapping and nonoverlapping areas of the ellipses obtained from the PCA runs. Then, a quantitative evaluation index of the correlation between data matrices is defined by comparing the PCA results of more than one data matrix. The similarity and difference between data matrices are further quantified through comprehensive analysis of the outcomes. Complicated tobacco agriculture data, which include rich features of climate, altitude, and chemical composition of tobacco leaves, were used as an example to illustrate the strategy of the proposed method. The number of objects in these data reached 171,516, with 14, 4, and 5 descriptors of climate, altitude, and chemicals, respectively. On the basis of the new method, the complex but weak relationship between these independent and dependent variables was studied. Three widely used conventional methods were applied for comparison. The results showed the power of the new method to discover weak correlations between complicated data. Hindawi 2021-01-20 /pmc/articles/PMC7843181/ /pubmed/33542846 http://dx.doi.org/10.1155/2021/8874827 Text en Copyright © 2021 Tao Pang et al.
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Pang, Tao
Zhang, Haitao
Wen, Liliang
Tang, Jun
Zhou, Bing
Yang, Qianxu
Li, Yong
Wang, Jiajun
Chen, Aiming
Zeng, Zhongda
Quantitative Analysis of a Weak Correlation between Complicated Data on the Basis of Principal Component Analysis
title Quantitative Analysis of a Weak Correlation between Complicated Data on the Basis of Principal Component Analysis
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7843181/
https://www.ncbi.nlm.nih.gov/pubmed/33542846
http://dx.doi.org/10.1155/2021/8874827