Numericware i: Identical by State Matrix Calculator
We introduce software, Numericware i, to compute identical by state (IBS) matrix based on genotypic data. Calculating an IBS matrix with a large dataset requires large computer memory and takes lengthy processing time. Numericware i addresses these challenges with 2 algorithmic methods: multithreadi...
Main authors: | Kim, Bongsong; Beavis, William D |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | SAGE Publications 2017 |
Subjects: | Software or Database Review |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5395260/ https://www.ncbi.nlm.nih.gov/pubmed/28469375 http://dx.doi.org/10.1177/1176934316688663 |
_version_ | 1783229847123787776 |
---|---|
author | Kim, Bongsong Beavis, William D |
author_facet | Kim, Bongsong Beavis, William D |
author_sort | Kim, Bongsong |
collection | PubMed |
description | We introduce software, Numericware i, to compute identical by state (IBS) matrix based on genotypic data. Calculating an IBS matrix with a large dataset requires large computer memory and takes lengthy processing time. Numericware i addresses these challenges with 2 algorithmic methods: multithreading and forward chopping. The multithreading allows computational routines to concurrently run on multiple central processing unit (CPU) processors. The forward chopping addresses memory limitation by dividing a dataset into appropriately sized subsets. Numericware i allows calculation of the IBS matrix for a large genotypic dataset using a laptop or a desktop computer. For comparison with different software, we calculated genetic relationship matrices using Numericware i, SPAGeDi, and TASSEL with the same genotypic dataset. Numericware i calculates IBS coefficients between 0 and 2, whereas SPAGeDi and TASSEL produce different ranges of values including negative values. The Pearson correlation coefficient between the matrices from Numericware i and TASSEL was high at .9972, whereas SPAGeDi showed low correlation with Numericware i (.0505) and TASSEL (.0587). With a high-dimensional dataset of 500 entities by 10 000 000 SNPs, Numericware i spent 382 minutes using 19 CPU threads and 64 GB memory by dividing the dataset into 3 pieces, whereas SPAGeDi and TASSEL failed with the same dataset. Numericware i is freely available for Windows and Linux under CC-BY 4.0 license at https://figshare.com/s/f100f33a8857131eb2db. |
format | Online Article Text |
id | pubmed-5395260 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | SAGE Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-53952602017-05-03 Numericware i: Identical by State Matrix Calculator Kim, Bongsong Beavis, William D Evol Bioinform Online Software or Database Review We introduce software, Numericware i, to compute identical by state (IBS) matrix based on genotypic data. Calculating an IBS matrix with a large dataset requires large computer memory and takes lengthy processing time. Numericware i addresses these challenges with 2 algorithmic methods: multithreading and forward chopping. The multithreading allows computational routines to concurrently run on multiple central processing unit (CPU) processors. The forward chopping addresses memory limitation by dividing a dataset into appropriately sized subsets. Numericware i allows calculation of the IBS matrix for a large genotypic dataset using a laptop or a desktop computer. For comparison with different software, we calculated genetic relationship matrices using Numericware i, SPAGeDi, and TASSEL with the same genotypic dataset. Numericware i calculates IBS coefficients between 0 and 2, whereas SPAGeDi and TASSEL produce different ranges of values including negative values. The Pearson correlation coefficient between the matrices from Numericware i and TASSEL was high at .9972, whereas SPAGeDi showed low correlation with Numericware i (.0505) and TASSEL (.0587). With a high-dimensional dataset of 500 entities by 10 000 000 SNPs, Numericware i spent 382 minutes using 19 CPU threads and 64 GB memory by dividing the dataset into 3 pieces, whereas SPAGeDi and TASSEL failed with the same dataset. Numericware i is freely available for Windows and Linux under CC-BY 4.0 license at https://figshare.com/s/f100f33a8857131eb2db. 
SAGE Publications 2017-03-10 /pmc/articles/PMC5395260/ /pubmed/28469375 http://dx.doi.org/10.1177/1176934316688663 Text en © The Author(s) 2017 http://creativecommons.org/licenses/by-nc/3.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 3.0 License (http://www.creativecommons.org/licenses/by-nc/3.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page(https://us.sagepub.com/en-us/nam/open-access-at-sage). |
spellingShingle | Software or Database Review Kim, Bongsong Beavis, William D Numericware i: Identical by State Matrix Calculator |
title | Numericware i: Identical by State Matrix Calculator |
title_full | Numericware i: Identical by State Matrix Calculator |
title_fullStr | Numericware i: Identical by State Matrix Calculator |
title_full_unstemmed | Numericware i: Identical by State Matrix Calculator |
title_short | Numericware i: Identical by State Matrix Calculator |
title_sort | numericware i: identical by state matrix calculator |
topic | Software or Database Review |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5395260/ https://www.ncbi.nlm.nih.gov/pubmed/28469375 http://dx.doi.org/10.1177/1176934316688663 |
work_keys_str_mv | AT kimbongsong numericwareiidenticalbystatematrixcalculator AT beaviswilliamd numericwareiidenticalbystatematrixcalculator |
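
The description in this record names two techniques, multithreading and forward chopping, without giving the underlying formulas. The following Python sketch illustrates the general idea only and is not the Numericware i implementation. It assumes genotypes are coded as 0, 1, or 2 copies of an allele and that the IBS score for one SNP is `2 - |a - b|`, which gives per-pair averages between 0 and 2 as the description states. It also assumes that "chopping" means slicing the SNP columns into subsets and accumulating partial sums. The names `ibs_matrix`, `_row_sums`, `chunk_size`, and `workers` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def _row_sums(chunk, i):
    """Sum assumed per-SNP IBS scores (2 - |a - b|) between entity i and
    every entity j >= i, using only the SNPs in this chunk."""
    gi = chunk[i]
    return [
        sum(2 - abs(a - b) for a, b in zip(gi, chunk[j]))
        for j in range(i, len(chunk))
    ]


def ibs_matrix(genotypes, chunk_size, workers=2):
    """Return an n x n matrix of mean IBS scores.

    genotypes: list of n rows, each a list of SNP codes in {0, 1, 2}.
    chunk_size: number of SNP columns held in one subset (assumed meaning
                of dividing the dataset into pieces).
    workers: number of threads that share the rows of each subset.
    """
    n = len(genotypes)
    n_snps = len(genotypes[0])
    total = [[0] * n for _ in range(n)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Subsets are processed one after another, so only one slice of
        # the SNP columns is materialized at a time.
        for start in range(0, n_snps, chunk_size):
            chunk = [row[start:start + chunk_size] for row in genotypes]
            rows = pool.map(lambda i: _row_sums(chunk, i), range(n))
            for i, row in enumerate(rows):
                for offset, s in enumerate(row):
                    j = i + offset
                    total[i][j] += s
                    if j != i:
                        total[j][i] += s  # the matrix is symmetric

    # Divide the accumulated sums by the full SNP count to get the mean.
    return [[value / n_snps for value in row] for row in total]


if __name__ == "__main__":
    example = [[0, 1, 2, 2], [0, 1, 2, 0], [2, 1, 0, 0]]
    for line in ibs_matrix(example, chunk_size=3):
        print(line)
```

Because the partial sums are additive, the result does not depend on `chunk_size`; a smaller value only lowers the amount of genotype data held at once. In CPython, threads do not speed up pure-Python arithmetic, so the thread pool here shows the structure of the approach rather than a performance gain.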