
Numericware i: Identical by State Matrix Calculator

We introduce Numericware i, software that computes an identical-by-state (IBS) matrix from genotypic data. Calculating an IBS matrix for a large dataset requires a large amount of computer memory and lengthy processing time. Numericware i addresses these challenges with 2 algorithmic methods: multithreadi...


Bibliographic Details
Main authors: Kim, Bongsong; Beavis, William D
Format: Online Article Text
Language: English
Published: SAGE Publications 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5395260/
https://www.ncbi.nlm.nih.gov/pubmed/28469375
http://dx.doi.org/10.1177/1176934316688663
author Kim, Bongsong
Beavis, William D
author_facet Kim, Bongsong
Beavis, William D
author_sort Kim, Bongsong
collection PubMed
description We introduce Numericware i, software that computes an identical-by-state (IBS) matrix from genotypic data. Calculating an IBS matrix for a large dataset requires a large amount of computer memory and lengthy processing time. Numericware i addresses these challenges with 2 algorithmic methods: multithreading and forward chopping. Multithreading allows computational routines to run concurrently on multiple central processing unit (CPU) processors. Forward chopping addresses memory limitations by dividing a dataset into appropriately sized subsets. Together, these methods allow Numericware i to calculate the IBS matrix for a large genotypic dataset on a laptop or desktop computer. To compare it with other software, we calculated genetic relationship matrices using Numericware i, SPAGeDi, and TASSEL with the same genotypic dataset. Numericware i calculates IBS coefficients between 0 and 2, whereas SPAGeDi and TASSEL produce different ranges of values, including negative values. The Pearson correlation coefficient between the matrices from Numericware i and TASSEL was high at .9972, whereas the SPAGeDi matrix showed low correlation with those from Numericware i (.0505) and TASSEL (.0587). With a high-dimensional dataset of 500 entities by 10 000 000 SNPs, Numericware i took 382 minutes using 19 CPU threads and 64 GB of memory, dividing the dataset into 3 pieces, whereas SPAGeDi and TASSEL failed with the same dataset. Numericware i is freely available for Windows and Linux under the CC-BY 4.0 license at https://figshare.com/s/f100f33a8857131eb2db.
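The two methods named in the description can be illustrated with a minimal Python sketch. It is not the Numericware i implementation. It assumes genotypes are coded as 0, 1, or 2 copies of one allele and that the per-SNP IBS score between two entities is 2 - |a - b|, averaged over SNPs, which yields coefficients between 0 and 2 as stated above; the record does not give the actual formula. The names `ibs_matrix`, `_row_sums`, `chunk_size`, and `workers` are hypothetical. Chunks of SNP columns are processed one after another (the forward-chopping idea), and within each chunk the rows of the matrix are distributed across threads.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def _row_sums(genotypes, i, start, stop):
    """Sum assumed per-SNP IBS scores (2 - |a - b|) between entity i and
    every entity, over SNP columns [start, stop)."""
    a = genotypes[i]
    return [sum(2 - abs(a[k] - b[k]) for k in range(start, stop))
            for b in genotypes]


def ibs_matrix(genotypes, chunk_size=2, workers=2):
    """Return an n x n matrix of mean IBS scores (range 0 to 2).

    genotypes: list of n equal-length lists with values 0, 1, or 2
    (an assumed coding, not taken from the source record).
    """
    n = len(genotypes)
    n_snps = len(genotypes[0])
    totals = [[0] * n for _ in range(n)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Forward chopping (sketch): visit one block of SNP columns at a time
        # and accumulate partial sums, so only one block is needed at once.
        for start in range(0, n_snps, chunk_size):
            stop = min(start + chunk_size, n_snps)
            task = partial(_row_sums, genotypes, start=start, stop=stop)
            # Multithreading (sketch): one task per matrix row.
            for i, row in enumerate(pool.map(task, range(n))):
                for j, value in enumerate(row):
                    totals[i][j] += value
    return [[t / n_snps for t in row] for row in totals]


# Three entities, four SNPs; entities 0 and 1 are identical.
example = [[0, 1, 2, 2], [0, 1, 2, 2], [2, 1, 0, 0]]
matrix = ibs_matrix(example)
print(matrix[0][1], matrix[0][2])  # prints "2.0 0.5"
```

In this sketch the whole dataset is already in memory, so the chunking only shows the accumulation pattern; a memory-limited tool would read each block from disk in turn. Threads in standard CPython also do not run pure-Python arithmetic in parallel, so the thread pool here shows the structure rather than the speedup reported in the description.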
format Online
Article
Text
id pubmed-5395260
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher SAGE Publications
record_format MEDLINE/PubMed
spelling pubmed-5395260 2017-05-03 Numericware i: Identical by State Matrix Calculator Kim, Bongsong Beavis, William D Evol Bioinform Online Software or Database Review
SAGE Publications 2017-03-10 /pmc/articles/PMC5395260/ /pubmed/28469375 http://dx.doi.org/10.1177/1176934316688663 Text en © The Author(s) 2017 http://creativecommons.org/licenses/by-nc/3.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 3.0 License (http://www.creativecommons.org/licenses/by-nc/3.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (https://us.sagepub.com/en-us/nam/open-access-at-sage).
spellingShingle Software or Database Review
Kim, Bongsong
Beavis, William D
Numericware i: Identical by State Matrix Calculator
title Numericware i: Identical by State Matrix Calculator
topic Software or Database Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5395260/
https://www.ncbi.nlm.nih.gov/pubmed/28469375
http://dx.doi.org/10.1177/1176934316688663