
mixIndependR: a R package for statistical independence testing of loci in database of multi-locus genotypes

BACKGROUND: Multi-locus genotype data are widely used in population genetics and disease studies. In evaluating the utility of multi-locus data, the independence of markers is commonly considered in many genomic assessments. Generally, pairwise non-random associations are tested by linkage disequili...


Bibliographic Details
Main authors: Song, Bing, Woerner, August E., Planz, John
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7788837/
https://www.ncbi.nlm.nih.gov/pubmed/33407074
http://dx.doi.org/10.1186/s12859-020-03945-0
_version_ 1783633110480453632
author Song, Bing
Woerner, August E.
Planz, John
author_facet Song, Bing
Woerner, August E.
Planz, John
author_sort Song, Bing
collection PubMed
description BACKGROUND: Multi-locus genotype data are widely used in population genetics and disease studies. In evaluating the utility of multi-locus data, the independence of markers is commonly considered in genomic assessments. Generally, pairwise non-random associations are tested by linkage disequilibrium; however, the dependence within a panel might involve triplets, quartets, or larger sets of loci. Therefore, compatible and user-friendly software is necessary for testing and assessing the global linkage disequilibrium among mixed genetic data. RESULTS: This study describes a software package for testing the mutual independence of mixed genetic datasets. Mutual independence is defined as no non-random associations among all subsets of the tested panel. The new R package “mixIndependR” calculates basic genetic parameters like allele frequency, genotype frequency, heterozygosity, Hardy–Weinberg equilibrium, and linkage disequilibrium (LD) by mutual independence from population data, regardless of the type of markers, such as simple nucleotide polymorphisms, short tandem repeats, insertions and deletions, and any other genetic markers. A novel method of assessing the dependence of mixed genetic panels is developed in this study and functionally analyzed in the software package. By comparing the observed distribution of two common summary statistics (the number of heterozygous loci [K] and the number of shared alleles [X]) with their expected distributions under the assumption of mutual independence, the overall independence is tested. CONCLUSION: The package “mixIndependR” is compatible with all categories of genetic markers and detects overall non-random associations. Compared to pairwise disequilibrium, the approach described herein tends to have higher power, especially when the number of markers is large. With this package, more multi-functional or stronger genetic panels can be developed, like mixed panels with different kinds of markers.
In population genetics, the package “mixIndependR” makes it possible to discover more about admixture of populations, natural selection, genetic drift, and population demographics, as a more powerful method of detecting LD. Moreover, this new approach can optimize variant selection in disease studies and contribute to panel combination for treatments in multimorbidity. Application of this approach to real data is expected in the future, and this might bring a leap in the field of genetic technology. AVAILABILITY: The R package mixIndependR is available on the Comprehensive R Archive Network (CRAN) at: https://cran.r-project.org/web/packages/mixIndependR/index.html.
format Online
Article
Text
id pubmed-7788837
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7788837 2021-01-07 mixIndependR: a R package for statistical independence testing of loci in database of multi-locus genotypes Song, Bing Woerner, August E. Planz, John BMC Bioinformatics Methodology Article BioMed Central 2021-01-06 /pmc/articles/PMC7788837/ /pubmed/33407074 http://dx.doi.org/10.1186/s12859-020-03945-0 Text en © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Methodology Article
Song, Bing
Woerner, August E.
Planz, John
mixIndependR: a R package for statistical independence testing of loci in database of multi-locus genotypes
title mixIndependR: a R package for statistical independence testing of loci in database of multi-locus genotypes
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7788837/
https://www.ncbi.nlm.nih.gov/pubmed/33407074
http://dx.doi.org/10.1186/s12859-020-03945-0
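
The description in this record says overall independence is tested by comparing the observed distribution of the number of heterozygous loci [K] with its expected distribution under mutual independence. The following is a minimal Python sketch of that idea only. It is not the mixIndependR API (the package itself is written in R), and the function names, the example heterozygosities, and the choice of a Pearson chi-square statistic are assumptions made for illustration. If loci are mutually independent and locus i is heterozygous with probability h_i, then K follows a Poisson-binomial distribution, which can be built locus by locus.

```python
from typing import List, Sequence


def expected_k_distribution(het_probs: Sequence[float]) -> List[float]:
    """Return P(K = k) for k = 0..n under mutual independence of loci.

    het_probs[i] is the (assumed known) heterozygosity of locus i.
    The Poisson-binomial distribution is built by dynamic programming.
    """
    dist = [1.0]  # with zero loci, K = 0 with certainty
    for h in het_probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, p in enumerate(dist):
            nxt[k] += p * (1.0 - h)  # this locus is homozygous
            nxt[k + 1] += p * h      # this locus is heterozygous
        dist = nxt
    return dist


def chi_square_statistic(observed_k: Sequence[int],
                         het_probs: Sequence[float]) -> float:
    """Pearson chi-square statistic: observed counts of K vs. expectation.

    observed_k holds one K value per individual. The statistic is an
    illustrative choice, not necessarily the authors' exact procedure.
    """
    n_individuals = len(observed_k)
    expected = [p * n_individuals for p in expected_k_distribution(het_probs)]
    counts = [0] * len(expected)
    for k in observed_k:
        counts[k] += 1
    return sum((o - e) ** 2 / e for o, e in zip(counts, expected) if e > 0)


# Hypothetical panel of three loci with made-up heterozygosities.
het = [0.5, 0.5, 0.5]
dist = expected_k_distribution(het)
stat = chi_square_statistic([0, 1, 1, 1, 2, 2, 2, 3], het)
print(dist, stat)
```

A large statistic relative to the chi-square reference distribution would suggest that the loci are not mutually independent. The same comparison could be applied to the number of shared alleles [X], whose expected distribution is not derived in this sketch.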