Application of Multiple Imputation for Missing Values in Three-Way Three-Mode Multi-Environment Trial Data
It is a common occurrence in plant breeding programs to observe missing values in three-way three-mode multi-environment trial (MET) data. We proposed modifications of models for estimating missing observations for these data arrays, and developed a novel approach in terms of hierarchical clustering...
Main authors: | Tian, Ting; McLachlan, Geoffrey J.; Dieters, Mark J.; Basford, Kaye E. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2015 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4686903/ https://www.ncbi.nlm.nih.gov/pubmed/26689369 http://dx.doi.org/10.1371/journal.pone.0144370 |
_version_ | 1782406522013220864 |
---|---|
author | Tian, Ting McLachlan, Geoffrey J. Dieters, Mark J. Basford, Kaye E. |
collection | PubMed |
description | Missing values are common in three-way three-mode multi-environment trial (MET) data from plant breeding programs. We proposed modifications of models for estimating missing observations in these data arrays, and developed a novel approach based on hierarchical clustering. Multiple imputation (MI) was used in four ways: multiple agglomerative hierarchical clustering, a normal distribution model, a normal regression model, and predictive mean matching. The latter three models used both Bayesian and non-Bayesian analysis, while the first approach used a clustering procedure with randomly selected attributes and assigned real values from the nearest neighbour to the unit with missing observations. Different proportions of data entries in six complete datasets were randomly selected to be missing, and the MI methods were compared on the efficiency and accuracy with which they estimated those values. The results indicated that the models using Bayesian analysis estimated the missing values slightly more accurately than those using non-Bayesian analysis, but were more time-consuming. However, the novel approach of multiple agglomerative hierarchical clustering showed the best overall performance. |
format | Online Article Text |
id | pubmed-4686903 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-4686903 2016-01-07 Application of Multiple Imputation for Missing Values in Three-Way Three-Mode Multi-Environment Trial Data Tian, Ting McLachlan, Geoffrey J. Dieters, Mark J. Basford, Kaye E. PLoS One Research Article Public Library of Science 2015-12-21 /pmc/articles/PMC4686903/ /pubmed/26689369 http://dx.doi.org/10.1371/journal.pone.0144370 Text en © 2015 Tian et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
title | Application of Multiple Imputation for Missing Values in Three-Way Three-Mode Multi-Environment Trial Data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4686903/ https://www.ncbi.nlm.nih.gov/pubmed/26689369 http://dx.doi.org/10.1371/journal.pone.0144370 |
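
The evaluation protocol summarised in the description (delete a random proportion of entries from a complete three-way array, impute them, and score the estimates against the known values) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' method: the array layout (genotypes × environments × attributes), the function names `mask_at_random`, `nearest_neighbour_impute`, and `rmse`, and the use of root-mean-square error as the accuracy measure are all hypothetical choices, and a single nearest-neighbour donor step stands in for the multiple agglomerative hierarchical clustering procedure of the article.

```python
import numpy as np


def mask_at_random(data, proportion, rng):
    """Return a copy of `data` with a random `proportion` of entries set to NaN,
    together with the boolean mask of deleted positions."""
    n_missing = int(round(proportion * data.size))
    flat_mask = np.zeros(data.size, dtype=bool)
    flat_mask[rng.choice(data.size, size=n_missing, replace=False)] = True
    mask = flat_mask.reshape(data.shape)
    masked = data.astype(float).copy()
    masked[mask] = np.nan
    return masked, mask


def nearest_neighbour_impute(masked):
    """Fill each missing entry with the observed value of the closest genotype.

    Assumption: axis 0 indexes genotypes. The array is unfolded to
    genotype x (environment * attribute), and distance is the mean squared
    difference over the columns both genotypes have observed.
    """
    n_geno = masked.shape[0]
    flat = masked.reshape(n_geno, -1)
    filled = flat.copy()
    for i in range(n_geno):
        missing_cols = np.flatnonzero(np.isnan(flat[i]))
        if missing_cols.size == 0:
            continue
        diff = flat - flat[i]                  # NaN wherever either value is missing
        shared = ~np.isnan(diff)
        counts = shared.sum(axis=1)
        sq = np.where(shared, diff ** 2, 0.0).sum(axis=1)
        dist = np.where(counts > 0, sq / np.maximum(counts, 1), np.inf)
        dist[i] = np.inf                       # a genotype cannot donate to itself
        order = np.argsort(dist)
        for col in missing_cols:
            for donor in order:
                if np.isfinite(dist[donor]) and not np.isnan(flat[donor, col]):
                    filled[i, col] = flat[donor, col]
                    break                      # entries with no donor stay NaN
    return filled.reshape(masked.shape)


def rmse(original, imputed, mask):
    """Root-mean-square error over the artificially deleted entries only."""
    return float(np.sqrt(np.mean((original[mask] - imputed[mask]) ** 2)))
```

A typical use would call `mask_at_random` on a complete dataset with a seeded `np.random.default_rng`, pass the result to `nearest_neighbour_impute`, and report `rmse` for each missing proportion; repeating the run with different seeds gives a rough picture of how accuracy varies with the deletion pattern.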