Cargando…

TrainSel: An R Package for Selection of Training Populations

A major barrier to the wider use of supervised learning in emerging applications, such as genomic selection, is the lack of sufficient and representative labeled data to train prediction models. The amount and quality of labeled training data in many applications is usually limited and therefore car...

Descripción completa

Detalles Bibliográficos
Autores principales: Akdemir, Deniz, Rio, Simon, Isidro y Sánchez, Julio
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Frontiers Media S.A. 2021
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8138169/
https://www.ncbi.nlm.nih.gov/pubmed/34025720
http://dx.doi.org/10.3389/fgene.2021.655287
Descripción
Sumario:A major barrier to the wider use of supervised learning in emerging applications, such as genomic selection, is the lack of sufficient and representative labeled data to train prediction models. The amount and quality of labeled training data in many applications is usually limited and therefore careful selection of the training examples to be labeled can be useful for improving the accuracies in predictive learning tasks. In this paper, we present an R package, TrainSel, which provides flexible, efficient, and easy-to-use tools that can be used for the selection of training populations (STP). We illustrate its use, performance, and potentials in four different supervised learning applications within and outside of the plant breeding area.