
GVCHAP: A Computing Pipeline for Genomic Prediction and Variance Component Estimation Using Haplotypes and SNP Markers

Haplotype prediction models open many possibilities to improve the accuracy of genomic selection but require more data processing and computing time than single-SNP prediction models. To facilitate haplotype analysis for genomic prediction and estimation using structural and functional genomic infor...


Bibliographic Details
Main Authors:	Prakapenka, Dzianis, Wang, Chunkao, Liang, Zuoxiang, Bian, Cheng, Tan, Cheng, Da, Yang
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2020
Subjects:	Genetics
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7154123/ https://www.ncbi.nlm.nih.gov/pubmed/32318093 http://dx.doi.org/10.3389/fgene.2020.00282

author	Prakapenka, Dzianis Wang, Chunkao Liang, Zuoxiang Bian, Cheng Tan, Cheng Da, Yang
author_sort	Prakapenka, Dzianis
collection	PubMed
description	Haplotype prediction models open many possibilities to improve the accuracy of genomic selection but require more data processing and computing time than single-SNP prediction models. To facilitate haplotype analysis for genomic prediction and estimation using structural and functional genomic information, we developed a computing pipeline to implement haplotype analysis with capabilities for preparation of input data for haplotype analysis, genomic prediction and estimation using GVCHAP, and analysis of GVCHAP results. Data preparation includes utility programs for haplotype imputing; defining haplotype blocks by a fixed number of SNPs, a fixed distance in base pairs per block, or user defined block lengths based on structural or functional genomic information or a mixture of both types of information; and defining haplotype genotypes within each haplotype block. GVCHAP is the main program for genomic prediction and estimation, calculates GREML (genomic restricted maximum likelihood) estimates of variance components and heritabilities, and calculates GBLUP (genomic best linear unbiased prediction) for additive and dominance values of single SNPs as well as additive values of haplotypes with reliability estimates for training and validation populations. A two-step strategy and a method of multi-node processing are implemented to remove the computing bottleneck due to the creation of genomic relationship matrices for large samples. The analysis of GVCHAP results includes calculation of observed prediction accuracies from validation studies and preparation of input files for graphical visualization of heritability estimates of haplotype blocks as well as estimates of SNP effects and heritabilities. The entire pipeline provides an efficient and versatile computing tool for identifying the most accurate haplotype model among many candidate haplotype models utilizing structural and functional genomic information for genomic selection.
format	Online Article Text
id	pubmed-7154123
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-7154123 2020-04-21 GVCHAP: A Computing Pipeline for Genomic Prediction and Variance Component Estimation Using Haplotypes and SNP Markers Prakapenka, Dzianis Wang, Chunkao Liang, Zuoxiang Bian, Cheng Tan, Cheng Da, Yang Front Genet Genetics Frontiers Media S.A. 2020-04-07 /pmc/articles/PMC7154123/ /pubmed/32318093 http://dx.doi.org/10.3389/fgene.2020.00282 Text en Copyright © 2020 Prakapenka, Wang, Liang, Bian, Tan and Da. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title	GVCHAP: A Computing Pipeline for Genomic Prediction and Variance Component Estimation Using Haplotypes and SNP Markers
topic	Genetics
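
The description in this record mentions defining haplotype blocks by a fixed number of SNPs or by a fixed distance in base pairs per block. The following is a minimal Python sketch of those two blocking rules only, not the GVCHAP utility programs themselves. The function names, the 0-based inclusive index convention, and the choice to start distance windows at position 0 of the chromosome are all assumptions made for illustration; the pipeline's own programs may define blocks differently.

```python
from typing import List, Tuple

# Hypothetical convention: a block is (index of first SNP, index of last SNP), inclusive.
Block = Tuple[int, int]


def blocks_by_snp_count(n_snps: int, snps_per_block: int) -> List[Block]:
    """Split SNP indices 0..n_snps-1 into consecutive blocks of a fixed SNP count.

    The last block may be shorter when n_snps is not a multiple of snps_per_block.
    """
    if snps_per_block < 1:
        raise ValueError("snps_per_block must be at least 1")
    return [
        (start, min(start + snps_per_block, n_snps) - 1)
        for start in range(0, n_snps, snps_per_block)
    ]


def blocks_by_distance(positions: List[int], block_bp: int) -> List[Block]:
    """Group SNPs on one chromosome into fixed-width base-pair windows.

    positions must be sorted in ascending order. Windows are assumed to start
    at position 0, so a SNP at position p falls in window p // block_bp.
    Windows that contain no SNP produce no block.
    """
    if block_bp < 1:
        raise ValueError("block_bp must be at least 1")
    blocks: List[Block] = []
    start = 0
    for i in range(1, len(positions) + 1):
        # Close the current block at the end of the list, or when the next
        # SNP falls into a different window than the block's first SNP.
        if i == len(positions) or positions[i] // block_bp != positions[start] // block_bp:
            blocks.append((start, i - 1))
            start = i
    return blocks
```

For example, `blocks_by_snp_count(10, 4)` splits ten SNPs into blocks of four, four, and two SNPs, and `blocks_by_distance` with a 1000 bp window places SNPs at positions 100 and 900 in the same block.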
