
How to study runs of homozygosity using PLINK? A guide for analyzing medium density SNP data in livestock and pet species

BACKGROUND: PLINK is probably the most used program for analyzing SNP genotypes and runs of homozygosity (ROH), both in human and in animal populations. The last decade, ROH analyses have become the state-of-the-art method for inbreeding assessment. In PLINK, the --homozyg function is used to perfor...


Bibliographic Details
Main authors: Meyermans, R., Gorssen, W., Buys, N., Janssens, S.
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6990544/
https://www.ncbi.nlm.nih.gov/pubmed/31996125
http://dx.doi.org/10.1186/s12864-020-6463-x
author Meyermans, R.
Gorssen, W.
Buys, N.
Janssens, S.
collection PubMed
description BACKGROUND: PLINK is probably the most used program for analyzing SNP genotypes and runs of homozygosity (ROH), both in human and in animal populations. The last decade, ROH analyses have become the state-of-the-art method for inbreeding assessment. In PLINK, the --homozyg function is used to perform ROH analyses and relies on several input settings. These settings can have a large impact on the outcome and default values are not always appropriate for medium density SNP array data. Guidelines for a robust and uniform ROH analysis in PLINK using medium density data are lacking, albeit these guidelines are vital for comparing different ROH studies. In this study, 8 populations of different livestock and pet species are used to demonstrate the importance of PLINK input settings. Moreover, the effects of pruning SNPs for low minor allele frequencies and linkage disequilibrium on ROH detection are shown. RESULTS: We introduce the genome coverage parameter to appropriately estimate F(ROH) and to check the validity of ROH analyses. The effect of pruning for linkage disequilibrium and low minor allele frequencies on ROH analyses is highly population dependent and such pruning may result in missed ROH. PLINK’s minimal density requirement is crucial for medium density genotypes and if set too low, genome coverage of the ROH analysis is limited. Finally, we provide recommendations for the maximal gap, scanning window length and threshold settings. CONCLUSIONS: In this study, we present guidelines for an adequate and robust ROH analysis in PLINK on medium density SNP data. Furthermore, we advise to report parameter settings in publications, and to validate them prior to analysis. Moreover, we encourage authors to report genome coverage to reflect the ROH analysis’ validity. Implementing these guidelines will substantially improve the overall quality and uniformity of ROH analyses.
format Online
Article
Text
id pubmed-6990544
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6990544 2020-02-03 How to study runs of homozygosity using PLINK? A guide for analyzing medium density SNP data in livestock and pet species Meyermans, R. Gorssen, W. Buys, N. Janssens, S. BMC Genomics Methodology Article BACKGROUND: PLINK is probably the most used program for analyzing SNP genotypes and runs of homozygosity (ROH), both in human and in animal populations. The last decade, ROH analyses have become the state-of-the-art method for inbreeding assessment. In PLINK, the --homozyg function is used to perform ROH analyses and relies on several input settings. These settings can have a large impact on the outcome and default values are not always appropriate for medium density SNP array data. Guidelines for a robust and uniform ROH analysis in PLINK using medium density data are lacking, albeit these guidelines are vital for comparing different ROH studies. In this study, 8 populations of different livestock and pet species are used to demonstrate the importance of PLINK input settings. Moreover, the effects of pruning SNPs for low minor allele frequencies and linkage disequilibrium on ROH detection are shown. RESULTS: We introduce the genome coverage parameter to appropriately estimate F(ROH) and to check the validity of ROH analyses. The effect of pruning for linkage disequilibrium and low minor allele frequencies on ROH analyses is highly population dependent and such pruning may result in missed ROH. PLINK’s minimal density requirement is crucial for medium density genotypes and if set too low, genome coverage of the ROH analysis is limited. Finally, we provide recommendations for the maximal gap, scanning window length and threshold settings. CONCLUSIONS: In this study, we present guidelines for an adequate and robust ROH analysis in PLINK on medium density SNP data. Furthermore, we advise to report parameter settings in publications, and to validate them prior to analysis. Moreover, we encourage authors to report genome coverage to reflect the ROH analysis’ validity. Implementing these guidelines will substantially improve the overall quality and uniformity of ROH analyses. BioMed Central 2020-01-29 /pmc/articles/PMC6990544/ /pubmed/31996125 http://dx.doi.org/10.1186/s12864-020-6463-x Text en © The Author(s). 2020 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title How to study runs of homozygosity using PLINK? A guide for analyzing medium density SNP data in livestock and pet species
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6990544/
https://www.ncbi.nlm.nih.gov/pubmed/31996125
http://dx.doi.org/10.1186/s12864-020-6463-x