Influence of Outliers on Accuracy Estimation in Genomic Prediction in Plant Breeding


Bibliographic Details
Main authors: Estaghvirou, Sidi Boubacar Ould; Ogutu, Joseph O.; Piepho, Hans-Peter
Format: Online Article Text
Language: English
Published: Genetics Society of America 2014
Subjects: Genomic Selection
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4267928/
https://www.ncbi.nlm.nih.gov/pubmed/25273862
http://dx.doi.org/10.1534/g3.114.011957
author Estaghvirou, Sidi Boubacar Ould
Ogutu, Joseph O.
Piepho, Hans-Peter
collection PubMed
description Outliers often pose problems in analyses of data in plant breeding, but their influence on the performance of methods for estimating predictive accuracy in genomic prediction studies has not yet been evaluated. Here, we evaluate the influence of outliers on the performance of methods for accuracy estimation in genomic prediction studies using simulation. We simulated 1000 datasets for each of 10 scenarios to evaluate the influence of outliers on the performance of seven methods for estimating accuracy. These scenarios are defined by the number of genotypes, marker effect variance, and magnitude of outliers. To mimic outliers, we added to one observation in each simulated dataset, in turn, 5-, 8-, and 10-times the error SD used to simulate small and large phenotypic datasets. The effect of outliers on accuracy estimation was evaluated by comparing deviations in the estimated and true accuracies for datasets with and without outliers. Outliers adversely influenced accuracy estimation, more so at small values of genetic variance or number of genotypes. A method for estimating heritability and predictive accuracy in plant breeding and another used to estimate accuracy in animal breeding were the most accurate and resistant to outliers across all scenarios and are therefore preferable for accuracy estimation in genomic prediction studies. The performances of the other five methods that use cross-validation were less consistent and varied widely across scenarios. The computing time for the methods increased as the size of outliers and sample size increased and the genetic variance decreased.
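The outlier-injection step described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the authors' simulation code and not any of the seven accuracy-estimation methods they compare. The function names (`simulate_dataset`, `add_outlier`, `ridge_accuracy`), the marker coding, the default sizes, and the use of ridge regression with a known shrinkage parameter are all assumptions made for this example. The "true accuracy" here is taken to be the correlation between predicted and simulated genetic values, which can only be computed because the data are simulated.

```python
import numpy as np


def simulate_dataset(n_genotypes=200, n_markers=500, marker_var=0.01,
                     error_sd=1.0, seed=0):
    """Simulate markers, true genetic values and phenotypes (assumed design)."""
    rng = np.random.default_rng(seed)
    # Biallelic markers coded -1/+1 (an assumption for this sketch).
    Z = rng.choice([-1.0, 1.0], size=(n_genotypes, n_markers))
    u = rng.normal(0.0, np.sqrt(marker_var), size=n_markers)  # marker effects
    g = Z @ u                                                 # true genetic values
    y = g + rng.normal(0.0, error_sd, size=n_genotypes)       # phenotypes
    return Z, g, y


def add_outlier(y, index, k, error_sd):
    """Return a copy of y with k * error_sd added to a single observation."""
    y_out = y.copy()
    y_out[index] += k * error_sd
    return y_out


def ridge_accuracy(Z, y, g, lam):
    """Correlation between ridge-predicted and true genetic values."""
    n = Z.shape[0]
    y_centered = y - y.mean()
    # Dual form of ridge regression: u_hat = Z' (Z Z' + lam * I)^-1 y
    alpha = np.linalg.solve(Z @ Z.T + lam * np.eye(n), y_centered)
    g_hat = Z @ (Z.T @ alpha)
    return np.corrcoef(g_hat, g)[0, 1]


error_sd, marker_var = 1.0, 0.01
Z, g, y = simulate_dataset(marker_var=marker_var, error_sd=error_sd)
lam = error_sd**2 / marker_var  # shrinkage from the simulated variances

baseline = ridge_accuracy(Z, y, g, lam)
for k in (5, 8, 10):  # outlier sizes named in the abstract
    contaminated = add_outlier(y, index=0, k=k, error_sd=error_sd)
    shift = ridge_accuracy(Z, contaminated, g, lam) - baseline
    print(f"k={k}: change in accuracy = {shift:+.4f}")
```

Repeating this over many seeds, and over smaller values of `n_genotypes` or `marker_var`, would be one way to explore the abstract's statement that outliers matter more when the genetic variance or the number of genotypes is small.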
format Online
Article
Text
id pubmed-4267928
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Genetics Society of America
record_format MEDLINE/PubMed
spelling pubmed-4267928 2014-12-23 Influence of Outliers on Accuracy Estimation in Genomic Prediction in Plant Breeding Estaghvirou, Sidi Boubacar Ould Ogutu, Joseph O. Piepho, Hans-Peter G3 (Bethesda) Genomic Selection
Genetics Society of America 2014-10-01 /pmc/articles/PMC4267928/ /pubmed/25273862 http://dx.doi.org/10.1534/g3.114.011957 Text en Copyright © 2014 Ould Estaghvirou et al. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution Unported License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Influence of Outliers on Accuracy Estimation in Genomic Prediction in Plant Breeding
topic Genomic Selection
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4267928/
https://www.ncbi.nlm.nih.gov/pubmed/25273862
http://dx.doi.org/10.1534/g3.114.011957