New Method Application for Marker-Trait Association Studies in Plants: Partial Least Square Regression Aids Detection of Simultaneous Correlations

In this work, we investigated the suitability of performing partial least square regression (PLSR) on genotype-phenotype datasets to identify marker-trait associations. We utilized data collected on a cotton (Gossypium hirsutum L.) recombinant inbred line (RIL) mapping population that was evaluated...

Bibliographic Details
Main authors: Bodah, Eliane Thaines; Weir, Bruce
Format: Online Article Text
Language: English
Published: 2017
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6195366/
https://www.ncbi.nlm.nih.gov/pubmed/30345411
http://dx.doi.org/10.19080/ARTOAJ.2017.12.555864
author Bodah, Eliane Thaines
Weir, Bruce
collection PubMed
description In this work, we investigated the suitability of performing partial least square regression (PLSR) on genotype-phenotype datasets to identify marker-trait associations. We utilized data collected on a cotton (Gossypium hirsutum L.) recombinant inbred line (RIL) mapping population that was evaluated under contrasting irrigation treatments, well-watered and water-limited conditions, in a hot, arid environment in 2012. Two phenotypic datasets were used in combination with the genetic data, which consisted of 841 marker loci assigned to 117 linkage groups. The first dataset contained canopy traits that were gathered using a mobile, high-throughput phenotyping platform and included canopy temperature (CT), normalized difference vegetation index (NDVI), and canopy height (CHT), with leaf area index (LAI) being derived from NDVI and CHT measurements. The second phenotypic dataset consisted of 14 elemental concentration measurements corresponding to the following elements: P, K, Ca, Mn, Fe, Zn, Ni, Cu, As, Co, Rb, Mo, S, and Mg. To conduct the PLSR analyses we used the “pls” and “plsdepot” packages available in R statistical software version 3.2.4. The PLSR biplot from the analysis of the first dataset showed that three (LAI, NDVI, and CHT) out of the four canopy traits were highly correlated, and by using multivariate analysis of variance (MANOVA), we detected 22 significant (p<0.01) marker-trait associations for the four traits. In contrast to the canopy trait analysis, our PLSR biplot for the second dataset showed varying correlations for each of the 14 traits. Because of the lack of distinct trait similarities, MANOVA was not an ideal option to test for marker-trait associations, so we implemented a jackknife resampling technique. Jackknife resampling failed to detect significant marker effects for several of the 14 elemental concentration traits. Thus, our future work aims to test other resampling techniques such as bootstrapping for traits that do not exhibit high correlation. Overall, PLSR was a very informative way to comprehend data structure, displaying correlations within markers, within traits, and between markers and traits in one biplot. Further studies are still needed to leverage detection of additional variance in correlated datasets and to prevent spurious results. To the best of our knowledge, this is the first time PLSR has been reported in such a context.
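The jackknife test of marker effects mentioned in the description can be illustrated with a minimal sketch. This is not the authors' analysis, which was carried out with R packages: the code below is plain Python, fits only a single-component PLS model for one trait (PLS1), and runs on simulated 0/1 marker data. The function names (`pls1_coefficients`, `jackknife_marker_test`, `simulate`) and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def center(values):
    """Subtract the mean from a list of numbers."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def pls1_coefficients(X, y):
    """Regression coefficients of a one-component PLS1 model.

    X is a list of rows (lines x markers); y is a list of trait values.
    """
    n_markers = len(X[0])
    # Center each marker column and the trait.
    cols = [center([row[j] for row in X]) for j in range(n_markers)]
    yc = center(y)
    # Weight vector: direction of maximal covariance between markers and trait.
    w = [sum(a * b for a, b in zip(col, yc)) for col in cols]
    norm = math.sqrt(sum(v * v for v in w))
    if norm == 0.0:
        return [0.0] * n_markers
    w = [v / norm for v in w]
    # Scores: projection of each line onto the weight vector.
    t = [sum(cols[j][i] * w[j] for j in range(n_markers)) for i in range(len(y))]
    # Regress the trait on the scores, then map back to marker coefficients.
    q = sum(a * b for a, b in zip(t, yc)) / sum(v * v for v in t)
    return [q * v for v in w]


def jackknife_marker_test(X, y):
    """Leave-one-out jackknife standard error and t-like statistic per marker."""
    n = len(y)
    full = pls1_coefficients(X, y)
    # Refit the model n times, each time leaving out one line.
    loo = [
        pls1_coefficients(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        for i in range(n)
    ]
    results = []
    for j, b in enumerate(full):
        reps = [coef[j] for coef in loo]
        mean = sum(reps) / n
        se = math.sqrt((n - 1) / n * sum((r - mean) ** 2 for r in reps))
        t_stat = b / se if se > 0 else float("nan")
        results.append((b, se, t_stat))
    return results


def simulate(n_lines=80, n_markers=5, effect=2.0, seed=1):
    """Simulated data (an assumption): only marker 0 affects the trait."""
    rng = random.Random(seed)
    X = [[rng.choice([0, 1]) for _ in range(n_markers)] for _ in range(n_lines)]
    y = [effect * row[0] + rng.gauss(0.0, 0.5) for row in X]
    return X, y


X, y = simulate()
for j, (b, se, t_stat) in enumerate(jackknife_marker_test(X, y)):
    print(f"marker {j}: coef={b:.3f} se={se:.3f} t={t_stat:.2f}")
```

In this sketch a marker whose coefficient is large relative to its jackknife standard error is a candidate marker-trait association. A real analysis would also choose the number of PLS components by cross-validation and correct for testing many markers.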
format Online
Article
Text
id pubmed-6195366
institution National Center for Biotechnology Information
language English
publishDate 2017
record_format MEDLINE/PubMed
spelling pubmed-6195366 2018-10-19 Agric Res Technol Article 2017-12-15 2017-12 /pmc/articles/PMC6195366/ /pubmed/30345411 http://dx.doi.org/10.19080/ARTOAJ.2017.12.555864 Text en http://creativecommons.org/licenses/by/4.0/ This work is licensed under Creative Commons Attribution 4.0 License
title New Method Application for Marker-Trait Association Studies in Plants: Partial Least Square Regression Aids Detection of Simultaneous Correlations
topic Article