Assessing runs of Homozygosity: a comparison of SNP Array and whole genome sequence low coverage data
BACKGROUND: Runs of Homozygosity (ROH) are genomic regions where identical haplotypes are inherited from each parent. Since their first detection due to technological advances in the late 1990s, ROHs have been shedding light on human population history and deciphering the genetic basis of monogenic...
Main authors: | Ceballos, Francisco C., Hazelhurst, Scott, Ramsay, Michèle |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2018 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5789638/ https://www.ncbi.nlm.nih.gov/pubmed/29378520 http://dx.doi.org/10.1186/s12864-018-4489-0 |
_version_ | 1783296321893957632 |
---|---|
author | Ceballos, Francisco C. Hazelhurst, Scott Ramsay, Michèle |
collection | PubMed |
description | BACKGROUND: Runs of Homozygosity (ROH) are genomic regions where identical haplotypes are inherited from each parent. Since their first detection due to technological advances in the late 1990s, ROHs have been shedding light on human population history and deciphering the genetic basis of monogenic and complex traits and diseases. ROH studies have predominantly exploited SNP array data, but are gradually moving to whole genome sequence (WGS) data as it becomes available. WGS data, covering more genetic variability, can add value to ROH studies, but require additional considerations during analysis. RESULTS: Using SNP array and low coverage WGS data from 1885 individuals from 20 world populations, our aims were to compare ROH from the two datasets and to establish software conditions to get comparable results, thus providing guidelines for combining disparate datasets in joint ROH analyses. By allowing heterozygous SNPs per window, using the PLINK homozygosity function and non-parametric analysis, we were able to obtain non-significant differences in number ROH, mean ROH size and total sum of ROH between data sets using the different technologies for almost all populations. CONCLUSIONS: By allowing 3 heterozygous SNPs per ROH when dealing with WGS low coverage data, it is possible to establish meaningful comparisons between data using SNP array and WGS low coverage technologies. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12864-018-4489-0) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5789638 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5789638 2018-02-08 Assessing runs of Homozygosity: a comparison of SNP Array and whole genome sequence low coverage data Ceballos, Francisco C. Hazelhurst, Scott Ramsay, Michèle BMC Genomics Methodology Article BACKGROUND: Runs of Homozygosity (ROH) are genomic regions where identical haplotypes are inherited from each parent. Since their first detection due to technological advances in the late 1990s, ROHs have been shedding light on human population history and deciphering the genetic basis of monogenic and complex traits and diseases. ROH studies have predominantly exploited SNP array data, but are gradually moving to whole genome sequence (WGS) data as it becomes available. WGS data, covering more genetic variability, can add value to ROH studies, but require additional considerations during analysis. RESULTS: Using SNP array and low coverage WGS data from 1885 individuals from 20 world populations, our aims were to compare ROH from the two datasets and to establish software conditions to get comparable results, thus providing guidelines for combining disparate datasets in joint ROH analyses. By allowing heterozygous SNPs per window, using the PLINK homozygosity function and non-parametric analysis, we were able to obtain non-significant differences in number ROH, mean ROH size and total sum of ROH between data sets using the different technologies for almost all populations. CONCLUSIONS: By allowing 3 heterozygous SNPs per ROH when dealing with WGS low coverage data, it is possible to establish meaningful comparisons between data using SNP array and WGS low coverage technologies. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12864-018-4489-0) contains supplementary material, which is available to authorized users. BioMed Central 2018-01-30 /pmc/articles/PMC5789638/ /pubmed/29378520 http://dx.doi.org/10.1186/s12864-018-4489-0 Text en © The Author(s). 2018 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Assessing runs of Homozygosity: a comparison of SNP Array and whole genome sequence low coverage data |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5789638/ https://www.ncbi.nlm.nih.gov/pubmed/29378520 http://dx.doi.org/10.1186/s12864-018-4489-0 |