
ICGRM: integrative construction of genomic relationship matrix combining multiple genomic regions for big dataset

BACKGROUND: Genomic prediction is an advanced method for estimating genetic values that is widely used for genetic evaluation in animals and disease-risk prediction in humans. It estimates genetic values from genome-wide SNPs instead of pedigree. The key step is to constr...


Bibliographic Details
Main authors: Jiang, Dan; Xin, Cong; Ye, Jinhua; Yuan, Yingbo; Fang, Ming
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6933885/
https://www.ncbi.nlm.nih.gov/pubmed/31878869
http://dx.doi.org/10.1186/s12859-019-3319-y
_version_ 1783483296086228992
author Jiang, Dan
Xin, Cong
Ye, Jinhua
Yuan, Yingbo
Fang, Ming
author_facet Jiang, Dan
Xin, Cong
Ye, Jinhua
Yuan, Yingbo
Fang, Ming
author_sort Jiang, Dan
collection PubMed
description BACKGROUND: Genomic prediction is an advanced method for estimating genetic values that is widely used for genetic evaluation in animals and disease-risk prediction in humans. It estimates genetic values from genome-wide SNPs instead of pedigree. The key step is to construct a genomic relationship matrix (GRM) from genome-wide SNPs; however, computing the GRM usually requires a large amount of memory, especially when the SNP number and sample size are large, and it can become computationally prohibitive even on supercomputer clusters. We herein developed an integrative algorithm to compute the GRM. To avoid calculating the GRM for the whole genome at once, ICGRM freely divides the genome-wide SNPs into several segments and computes GRM-related summary statistics for each segment, which requires little RAM; it then integrates these summary statistics to produce the GRM for the whole genome. RESULTS: In our simulation dataset, the memory used by ICGRM was reduced about 15-fold (from 218Gb to 14Gb) when the genome-wide SNPs were split into 5 to 200 parts according to the number of SNPs, making the computation feasible on almost any computer server. ICGRM is implemented in C/C++ and is freely available at https://github.com/mingfang618/CLGRM. CONCLUSIONS: ICGRM is computationally efficient software for building the GRM and can be used for big datasets.
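The segment-then-integrate idea in the description can be illustrated with a minimal Python sketch. It is not taken from the ICGRM source code: it assumes a VanRaden-style GRM, G = Z Zᵀ / (2 Σ p(1 − p)), and assumes that the per-segment summary statistics are the partial cross-product matrix and the partial scaling term. The function names `segment_stats`, `combine`, and `grm_by_segments` are hypothetical, and the statistics ICGRM actually stores are not stated in this record.

```python
# Sketch only: assumes a VanRaden-style GRM, G = Z Z^T / (2 * sum_j p_j (1 - p_j)).
# The summary statistics ICGRM really uses per segment are not given in this record.


def segment_stats(genotypes, start, stop):
    """Return (partial cross-product matrix, partial scaling term) for SNPs start..stop-1.

    genotypes: one list per individual holding allele counts (0, 1 or 2) per SNP.
    """
    n = len(genotypes)
    cross = [[0.0] * n for _ in range(n)]
    scale = 0.0
    for j in range(start, stop):
        col = [row[j] for row in genotypes]
        p = sum(col) / (2.0 * n)        # allele frequency of SNP j
        z = [g - 2.0 * p for g in col]  # centred genotypes for SNP j
        scale += 2.0 * p * (1.0 - p)
        for a in range(n):
            for b in range(n):
                cross[a][b] += z[a] * z[b]
    return cross, scale


def combine(stats):
    """Sum the per-segment statistics, then divide once to get the whole-genome GRM."""
    n = len(stats[0][0])
    total = [[0.0] * n for _ in range(n)]
    total_scale = 0.0
    for cross, scale in stats:
        total_scale += scale
        for a in range(n):
            for b in range(n):
                total[a][b] += cross[a][b]
    return [[value / total_scale for value in row] for row in total]


def grm_by_segments(genotypes, n_segments):
    """Split the SNP columns into n_segments contiguous blocks and integrate them."""
    m = len(genotypes[0])
    bounds = [k * m // n_segments for k in range(n_segments + 1)]
    stats = [segment_stats(genotypes, bounds[k], bounds[k + 1])
             for k in range(n_segments)]
    return combine(stats)
```

In this toy version the whole genotype table is already in memory; the point is only that each call to `segment_stats` touches one block of SNPs, so under these assumptions a real implementation could load one block at a time and still recover the same matrix as a single whole-genome pass.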
format Online
Article
Text
id pubmed-6933885
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6933885 2019-12-30 ICGRM: integrative construction of genomic relationship matrix combining multiple genomic regions for big dataset Jiang, Dan Xin, Cong Ye, Jinhua Yuan, Yingbo Fang, Ming BMC Bioinformatics Software BACKGROUND: Genomic prediction is an advanced method for estimating genetic values that is widely used for genetic evaluation in animals and disease-risk prediction in humans. It estimates genetic values from genome-wide SNPs instead of pedigree. The key step is to construct a genomic relationship matrix (GRM) from genome-wide SNPs; however, computing the GRM usually requires a large amount of memory, especially when the SNP number and sample size are large, and it can become computationally prohibitive even on supercomputer clusters. We herein developed an integrative algorithm to compute the GRM. To avoid calculating the GRM for the whole genome at once, ICGRM freely divides the genome-wide SNPs into several segments and computes GRM-related summary statistics for each segment, which requires little RAM; it then integrates these summary statistics to produce the GRM for the whole genome. RESULTS: In our simulation dataset, the memory used by ICGRM was reduced about 15-fold (from 218Gb to 14Gb) when the genome-wide SNPs were split into 5 to 200 parts according to the number of SNPs, making the computation feasible on almost any computer server. ICGRM is implemented in C/C++ and is freely available at https://github.com/mingfang618/CLGRM. CONCLUSIONS: ICGRM is computationally efficient software for building the GRM and can be used for big datasets. BioMed Central 2019-12-26 /pmc/articles/PMC6933885/ /pubmed/31878869 http://dx.doi.org/10.1186/s12859-019-3319-y Text en © The Author(s). 2019 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Software
Jiang, Dan
Xin, Cong
Ye, Jinhua
Yuan, Yingbo
Fang, Ming
ICGRM: integrative construction of genomic relationship matrix combining multiple genomic regions for big dataset
title ICGRM: integrative construction of genomic relationship matrix combining multiple genomic regions for big dataset
title_full ICGRM: integrative construction of genomic relationship matrix combining multiple genomic regions for big dataset
title_fullStr ICGRM: integrative construction of genomic relationship matrix combining multiple genomic regions for big dataset
title_full_unstemmed ICGRM: integrative construction of genomic relationship matrix combining multiple genomic regions for big dataset
title_short ICGRM: integrative construction of genomic relationship matrix combining multiple genomic regions for big dataset
title_sort icgrm: integrative construction of genomic relationship matrix combining multiple genomic regions for big dataset
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6933885/
https://www.ncbi.nlm.nih.gov/pubmed/31878869
http://dx.doi.org/10.1186/s12859-019-3319-y
work_keys_str_mv AT jiangdan icgrmintegrativeconstructionofgenomicrelationshipmatrixcombiningmultiplegenomicregionsforbigdataset
AT xincong icgrmintegrativeconstructionofgenomicrelationshipmatrixcombiningmultiplegenomicregionsforbigdataset
AT yejinhua icgrmintegrativeconstructionofgenomicrelationshipmatrixcombiningmultiplegenomicregionsforbigdataset
AT yuanyingbo icgrmintegrativeconstructionofgenomicrelationshipmatrixcombiningmultiplegenomicregionsforbigdataset
AT fangming icgrmintegrativeconstructionofgenomicrelationshipmatrixcombiningmultiplegenomicregionsforbigdataset