
On Information Rank Deficiency in Phenotypic Covariance Matrices

This article investigates a form of rank deficiency in phenotypic covariance matrices derived from geometric morphometric data, and its impact on measures of phenotypic integration. We first define a type of rank deficiency based on information theory, then demonstrate that this deficiency impairs th...


Bibliographic Details
Main authors: O’Keefe, F Robin, Meachen, Julie A, Polly, P David
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9203068/
https://www.ncbi.nlm.nih.gov/pubmed/34735008
http://dx.doi.org/10.1093/sysbio/syab088
collection PubMed
description This article investigates a form of rank deficiency in phenotypic covariance matrices derived from geometric morphometric data, and its impact on measures of phenotypic integration. We first define a type of rank deficiency based on information theory, then demonstrate that this deficiency impairs the performance of phenotypic integration metrics in a model system. Lastly, we propose methods to correct for this information rank deficiency. Our first goal is to establish how the rank of a typical geometric morphometric covariance matrix relates to the information entropy of its eigenvalue spectrum. This requires clear definitions of matrix rank, of which we define three: the full matrix rank (equal to the number of input variables), the mathematical rank (the number of nonzero eigenvalues), and the information rank or “effective rank” (equal to the number of nonredundant eigenvalues). We demonstrate that effective rank deficiency arises from a combination of methodological factors (Generalized Procrustes analysis, use of the correlation matrix, and insufficient sample size) as well as phenotypic covariance. Our second goal is to document, using dire wolf jaws, how differences in effective rank deficiency bias two metrics used to measure phenotypic integration. The eigenvalue variance characterizes the integration change incorrectly, and the standardized generalized variance lacks the sensitivity needed to detect subtle changes in integration. Both metrics are impacted by the inclusion of many small, but nonzero, eigenvalues arising from a lack of information in the covariance matrix, a problem that usually becomes more pronounced as the number of landmarks increases. We propose a new metric for phenotypic integration that combines the standardized generalized variance with information entropy. This metric is equivalent to the standardized generalized variance but is calculated only from those eigenvalues that carry nonredundant information; that is, it is the standardized generalized variance scaled to the effective rank of the eigenvalue spectrum. We demonstrate that this metric successfully detects the shift of integration in our dire wolf sample. Our third goal is to generalize the new metric to compare data sets with different sample sizes and numbers of variables. We develop a standardization for matrix information based on data permutation, then demonstrate that Smilodon jaws are more integrated than dire wolf jaws. Finally, we describe how our information entropy-based measure allows phenotypic integration to be compared in dense semilandmark data sets without bias, allowing characterization of the information content of any given shape, a quantity we term “latent dispersion”. [Canis dirus; Dire wolf; effective dispersion; effective rank; geometric morphometrics; information entropy; latent dispersion; modularity and integration; phenotypic integration; relative dispersion.]
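The three ranks and the integration metrics named in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything here is an assumption: effective rank is taken as the exponential of the Shannon entropy of the normalized eigenvalue spectrum (a common definition), eigenvalue variance as the plain variance of the eigenvalues, and the standardized generalized variance as the geometric mean of the eigenvalues. The truncation argument `k` is only one possible reading of "calculated only from those eigenvalues that carry nonredundant information" and may differ from the authors' method. All function names, the `tol` threshold, and the simulated data are hypothetical.

```python
import numpy as np


def shannon_entropy(eigvals):
    """Shannon entropy (in nats) of an eigenvalue spectrum normalized to sum to 1."""
    vals = np.asarray(eigvals, dtype=float)
    p = vals[vals > 0] / vals.sum()  # zero eigenvalues contribute nothing
    return float(-np.sum(p * np.log(p)))


def three_ranks(eigvals, tol=1e-10):
    """Return (full rank, mathematical rank, effective rank) for a spectrum.

    ASSUMPTION: effective rank = exp(entropy); the abstract gives no formula.
    `tol` is a hypothetical relative threshold for calling an eigenvalue nonzero.
    """
    vals = np.asarray(eigvals, dtype=float)
    full_rank = int(vals.size)                                # number of variables
    mathematical_rank = int(np.sum(vals > tol * vals.max()))  # nonzero eigenvalues
    effective_rank = float(np.exp(shannon_entropy(vals)))     # entropy-based rank
    return full_rank, mathematical_rank, effective_rank


def eigenvalue_variance(eigvals):
    """Population variance of the eigenvalues (assumed form of the metric)."""
    return float(np.var(np.asarray(eigvals, dtype=float)))


def standardized_generalized_variance(eigvals, k=None):
    """Geometric mean of the k largest eigenvalues (all of them if k is None).

    ASSUMPTION: passing k = rounded effective rank is an illustrative reading
    of the entropy-scaled metric, not the published definition.
    """
    vals = np.sort(np.asarray(eigvals, dtype=float))[::-1]  # descending order
    k = vals.size if k is None else k
    return float(np.exp(np.mean(np.log(vals[:k]))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical data: 20 specimens, 30 variables, so sample size limits the rank.
    X = rng.normal(size=(20, 30))
    eigvals = np.clip(np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)), 0.0, None)
    full, math_rank, eff = three_ranks(eigvals)
    k = max(1, int(round(eff)))
    print("ranks (full, mathematical, effective):", full, math_rank, round(eff, 2))
    print("eigenvalue variance:", eigenvalue_variance(eigvals))
    print("SGV over the leading k eigenvalues:", standardized_generalized_variance(eigvals, k=k))
```

With all eigenvalues included, the geometric mean collapses to zero as soon as any eigenvalue is zero, which is one way to see why many small eigenvalues dominate the untruncated metric.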
format Online
Article
Text
id pubmed-9203068
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9203068 2022-06-21 On Information Rank Deficiency in Phenotypic Covariance Matrices O’Keefe, F Robin Meachen, Julie A Polly, P David Syst Biol Regular Articles Oxford University Press 2021-11-04 /pmc/articles/PMC9203068/ /pubmed/34735008 http://dx.doi.org/10.1093/sysbio/syab088 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of the Society of Systematic Biologists. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
topic Regular Articles