
LDkit: a parallel computing toolkit for linkage disequilibrium analysis

BACKGROUND: Linkage disequilibrium (LD) analysis is widely used in genetics to understand evolutionary and demographic history, and it helps geneticists identify genes associated with inherited traits of interest, such as diseases. There are some tools for linkage disequilibrium analysis either...


Bibliographic Details
Main Authors: Tang, You; Li, Zhuo; Wang, Chao; Liu, Yuxin; Yu, Helong; Wang, Aoxue; Zhou, Yao
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects: Software
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7565767/
https://www.ncbi.nlm.nih.gov/pubmed/33066733
http://dx.doi.org/10.1186/s12859-020-03754-5
author Tang, You
Li, Zhuo
Wang, Chao
Liu, Yuxin
Yu, Helong
Wang, Aoxue
Zhou, Yao
collection PubMed
description BACKGROUND: Linkage disequilibrium (LD) analysis is widely used in genetics to understand evolutionary and demographic history, and it helps geneticists identify genes associated with inherited traits of interest, such as diseases. Several tools for LD analysis exist, either local or online; however, none supports both a graphical user interface (GUI) and parallel computing. RESULTS: We developed LDkit, a GUI software package for LD analysis that supports parallel computing. LDkit supports both variant call format (VCF) and PLINK ‘ped + map’ format. Users can also analyze just a subset of individuals from the whole population. LDkit reads the data by block and then parallelizes the computation by monitoring the usage of processes. Assessment on the human 1000 Genomes data showed that, with 32 threads, the running time fell from ~77 minutes to less than 6 minutes on the chromosome 22 dataset with 1,103,547 SNPs and 2504 individuals. CONCLUSIONS: LDkit can be effectively used to calculate and plot LD decay and LD blocks, and to analyze linkage disequilibrium between a site and a given region. Most importantly, both graphical user interface (GUI) and stand-alone packages are available for users’ convenience. LDkit is written in Java and is cross-platform.
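The block-wise, parallel strategy mentioned in the abstract can be illustrated with a minimal sketch. This is not LDkit's code (LDkit is written in Java, and its estimator and scheduling are not described in this record). The sketch assumes LD is measured as the squared Pearson correlation of genotype dosages (0/1/2), a common estimator for unphased data. The names `r_squared`, `ld_for_block`, and `blockwise_ld` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors (0/1/2).

    Assumption: this is one common LD estimator for unphased genotypes,
    not necessarily the one LDkit implements.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # monomorphic site: r^2 is undefined
    return cov * cov / (var_x * var_y)


def ld_for_block(block):
    """Compute r^2 for every SNP pair inside one block of (snp_id, dosages)."""
    return {
        (id_a, id_b): r_squared(geno_a, geno_b)
        for (id_a, geno_a), (id_b, geno_b) in combinations(block, 2)
    }


def blockwise_ld(snps, block_size, workers):
    """Split SNPs into consecutive blocks and process the blocks concurrently.

    Pairs that span two blocks are ignored in this simplified sketch.
    Note: in CPython, threads do not speed up pure-Python arithmetic;
    a process pool would be needed for real CPU parallelism.
    """
    blocks = [snps[i:i + block_size] for i in range(0, len(snps), block_size)]
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(ld_for_block, blocks):
            results.update(partial)
    return results


# Toy data: four SNPs genotyped in four individuals (hypothetical values).
toy_snps = [
    ("s1", [0, 1, 2, 0]),
    ("s2", [0, 1, 2, 0]),
    ("s3", [2, 1, 0, 2]),
    ("s4", [0, 0, 1, 1]),
]
toy_ld = blockwise_ld(toy_snps, block_size=2, workers=2)
```

Here `toy_ld` holds one r² value per within-block pair, keyed by the two SNP identifiers. A real tool would also need to handle pairs that straddle block boundaries and missing genotypes.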
format Online
Article
Text
id pubmed-7565767
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7565767 2020-10-20 LDkit: a parallel computing toolkit for linkage disequilibrium analysis Tang, You; Li, Zhuo; Wang, Chao; Liu, Yuxin; Yu, Helong; Wang, Aoxue; Zhou, Yao BMC Bioinformatics Software
BioMed Central 2020-10-16 /pmc/articles/PMC7565767/ /pubmed/33066733 http://dx.doi.org/10.1186/s12859-020-03754-5 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title LDkit: a parallel computing toolkit for linkage disequilibrium analysis
topic Software