
Fully exploiting SNP arrays: a systematic review on the tools to extract underlying genomic structure

Single nucleotide polymorphisms (SNPs) are the most abundant type of genomic variation and the most accessible to genotype in large cohorts. However, they individually explain a small proportion of phenotypic differences between individuals. Ancestry, collective SNP effects, structural variants, som...


Bibliographic Details
Main authors: Balagué-Dobón, Laura; Cáceres, Alejandro; González, Juan R
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8921734/
https://www.ncbi.nlm.nih.gov/pubmed/35211719
http://dx.doi.org/10.1093/bib/bbac043
author Balagué-Dobón, Laura
Cáceres, Alejandro
González, Juan R
author_sort Balagué-Dobón, Laura
collection PubMed
description Single nucleotide polymorphisms (SNPs) are the most abundant type of genomic variation and the most accessible to genotype in large cohorts. However, they individually explain a small proportion of phenotypic differences between individuals. Ancestry, collective SNP effects, structural variants, somatic mutations or even differences in historic recombination can potentially explain a high percentage of genomic divergence. These genetic differences can be infrequent or laborious to characterize; however, many of them leave distinctive marks on the SNPs across the genome allowing their study in large population samples. Consequently, several methods have been developed over the last decade to detect and analyze different genomic structures using SNP arrays, to complement genome-wide association studies and determine the contribution of these structures to explain the phenotypic differences between individuals. We present an up-to-date collection of available bioinformatics tools that can be used to extract relevant genomic information from SNP array data including population structure and ancestry; polygenic risk scores; identity-by-descent fragments; linkage disequilibrium; heritability and structural variants such as inversions, copy number variants, genetic mosaicisms and recombination histories. From a systematic review of recently published applications of the methods, we describe the main characteristics of R packages, command-line tools and desktop applications, both free and commercial, to help make the most of a large amount of publicly available SNP data.
format Online
Article
Text
id pubmed-8921734
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8921734
2022-03-15
Fully exploiting SNP arrays: a systematic review on the tools to extract underlying genomic structure
Balagué-Dobón, Laura
Cáceres, Alejandro
González, Juan R
Brief Bioinform
Review
Single nucleotide polymorphisms (SNPs) are the most abundant type of genomic variation and the most accessible to genotype in large cohorts. However, they individually explain a small proportion of phenotypic differences between individuals. Ancestry, collective SNP effects, structural variants, somatic mutations or even differences in historic recombination can potentially explain a high percentage of genomic divergence. These genetic differences can be infrequent or laborious to characterize; however, many of them leave distinctive marks on the SNPs across the genome allowing their study in large population samples. Consequently, several methods have been developed over the last decade to detect and analyze different genomic structures using SNP arrays, to complement genome-wide association studies and determine the contribution of these structures to explain the phenotypic differences between individuals. We present an up-to-date collection of available bioinformatics tools that can be used to extract relevant genomic information from SNP array data including population structure and ancestry; polygenic risk scores; identity-by-descent fragments; linkage disequilibrium; heritability and structural variants such as inversions, copy number variants, genetic mosaicisms and recombination histories. From a systematic review of recently published applications of the methods, we describe the main characteristics of R packages, command-line tools and desktop applications, both free and commercial, to help make the most of a large amount of publicly available SNP data.
Oxford University Press
2022-02-24
/pmc/articles/PMC8921734/
/pubmed/35211719
http://dx.doi.org/10.1093/bib/bbac043
Text
en
© The Author(s) 2022. Published by Oxford University Press.
https://creativecommons.org/licenses/by-nc/4.0/
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Fully exploiting SNP arrays: a systematic review on the tools to extract underlying genomic structure
topic Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8921734/
https://www.ncbi.nlm.nih.gov/pubmed/35211719
http://dx.doi.org/10.1093/bib/bbac043