Benchmarking database systems for Genomic Selection implementation

MOTIVATION: With high-throughput genotyping systems now available, it has become feasible to fully integrate genotyping information into breeding programs. Making effective use of this information requires DNA extraction facilities and marker production facilities that can efficiently deploy the...

Bibliographic Details
Main authors:	Nti-Addae, Yaw; Matthews, Dave; Ulat, Victor Jun; Syed, Raza; Sempéré, Guilhem; Pétel, Adrien; Renner, Jon; Larmande, Pierre; Guignon, Valentin; Jones, Elizabeth; Robbins, Kelly
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2019
Subjects:	Review
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6737464/ https://www.ncbi.nlm.nih.gov/pubmed/31508797 http://dx.doi.org/10.1093/database/baz096

author	Nti-Addae, Yaw; Matthews, Dave; Ulat, Victor Jun; Syed, Raza; Sempéré, Guilhem; Pétel, Adrien; Renner, Jon; Larmande, Pierre; Guignon, Valentin; Jones, Elizabeth; Robbins, Kelly
author_sort	Nti-Addae, Yaw
collection	PubMed
description	MOTIVATION: With high-throughput genotyping systems now available, it has become feasible to fully integrate genotyping information into breeding programs. Making effective use of this information requires DNA extraction facilities and marker production facilities that can efficiently deploy the desired set of markers across samples, with a turnaround time rapid enough to allow selection before crosses need to be made. In practice, by the time breeders have collected all their phenotyping data and received the corresponding genotyping data, they often have only a short window in which to make decisions. This makes it challenging to organize the information and use it in downstream analyses that support breeders' decisions. Implementing genomic selection routinely as part of breeding programs therefore requires an efficient genotyping data storage system. We selected and benchmarked six popular open-source data storage systems, including relational database management systems and columnar storage systems. RESULTS: We found that data extract times are greatly influenced by the orientation in which genotype data is stored in a system. HDF5 consistently performed best, in part because it can work more efficiently with both orientations of the allele matrix. AVAILABILITY: http://gobiin1.bti.cornell.edu:6083/projects/GBM/repos/benchmarking/browse
format	Online Article Text
id	pubmed-6737464
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-6737464 2019-09-16 Benchmarking database systems for Genomic Selection implementation Database (Oxford) Review Oxford University Press 2019-09-11 /pmc/articles/PMC6737464/ /pubmed/31508797 http://dx.doi.org/10.1093/database/baz096 Text en © The Author(s) 2019. Published by Oxford University Press.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Benchmarking database systems for Genomic Selection implementation
topic	Review
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6737464/ https://www.ncbi.nlm.nih.gov/pubmed/31508797 http://dx.doi.org/10.1093/database/baz096
