Application of Multiple Imputation for Missing Values in Three-Way Three-Mode Multi-Environment Trial Data

It is a common occurrence in plant breeding programs to observe missing values in three-way three-mode multi-environment trial (MET) data. We proposed modifications of models for estimating missing observations for these data arrays, and developed a novel approach in terms of hierarchical clustering...

Bibliographic Details
Main authors: Tian, Ting; McLachlan, Geoffrey J.; Dieters, Mark J.; Basford, Kaye E.
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4686903/
https://www.ncbi.nlm.nih.gov/pubmed/26689369
http://dx.doi.org/10.1371/journal.pone.0144370
author Tian, Ting
McLachlan, Geoffrey J.
Dieters, Mark J.
Basford, Kaye E.
collection PubMed
description It is a common occurrence in plant breeding programs to observe missing values in three-way three-mode multi-environment trial (MET) data. We proposed modifications of models for estimating missing observations for these data arrays, and developed a novel approach in terms of hierarchical clustering. Multiple imputation (MI) was used in four ways: multiple agglomerative hierarchical clustering, normal distribution model, normal regression model, and predictive mean match. The latter three models used both Bayesian analysis and non-Bayesian analysis, while the first approach used a clustering procedure with randomly selected attributes and assigned real values from the nearest neighbour to the one with missing observations. Different proportions of data entries in six complete datasets were randomly selected to be missing, and the MI methods were compared based on the efficiency and accuracy of estimating those values. The results indicated that the models using Bayesian analysis had slightly higher estimation accuracy than those using non-Bayesian analysis, but were more time-consuming. However, the novel approach of multiple agglomerative hierarchical clustering demonstrated the best overall performance.
format Online
Article
Text
id pubmed-4686903
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4686903 2016-01-07 PLoS One Research Article Public Library of Science 2015-12-21 /pmc/articles/PMC4686903/ /pubmed/26689369 http://dx.doi.org/10.1371/journal.pone.0144370 Text en © 2015 Tian et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Application of Multiple Imputation for Missing Values in Three-Way Three-Mode Multi-Environment Trial Data
topic Research Article