
A Log-Ratio Biplot Approach for Exploring Genetic Relatedness Based on Identity by State

The detection of cryptic relatedness in large population-based cohorts is of great importance in genome research. The usual approach for detecting closely related individuals is to plot allele sharing statistics, based on identity-by-state or identity-by-descent, in a two-dimensional scatterplot. Th...


Bibliographic Details
Main authors: Graffelman, Jan, Galván Femenía, Iván, de Cid, Rafael, Barceló Vidal, Carles
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2019
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6491861/
https://www.ncbi.nlm.nih.gov/pubmed/31068965
http://dx.doi.org/10.3389/fgene.2019.00341
_version_ 1783415032944525312
author Graffelman, Jan
Galván Femenía, Iván
de Cid, Rafael
Barceló Vidal, Carles
author_sort Graffelman, Jan
collection PubMed
description The detection of cryptic relatedness in large population-based cohorts is of great importance in genome research. The usual approach for detecting closely related individuals is to plot allele sharing statistics, based on identity-by-state or identity-by-descent, in a two-dimensional scatterplot. This approach ignores that allele sharing data across individuals has in reality a higher dimensionality, and neither regards the compositional nature of the underlying counts of shared genotypes. In this paper we develop biplot methodology based on log-ratio principal component analysis that overcomes these restrictions. This leads to entirely new graphics that are essentially useful for exploring relatedness in genetic databases from homogeneous populations. The proposed method can be applied in an iterative manner, acting as a looking glass for more remote relationships that are harder to classify. Datasets from the 1,000 Genomes Project and the Genomes For Life-GCAT Project are used to illustrate the proposed method. The discriminatory power of the log-ratio biplot approach is compared with the classical plots in a simulation study. In a non-inbred homogeneous population the classification rate of the log-ratio principal component approach outperforms the classical graphics across the whole allele frequency spectrum, using only identity by state. In these circumstances, simulations show that with 35,000 independent bi-allelic variants, log-ratio principal component analysis, combined with discriminant analysis, can correctly classify relationships up to and including the fourth degree.
format Online
Article
Text
id pubmed-6491861
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-64918612019-05-08 A Log-Ratio Biplot Approach for Exploring Genetic Relatedness Based on Identity by State Graffelman, Jan Galván Femenía, Iván de Cid, Rafael Barceló Vidal, Carles Front Genet Genetics The detection of cryptic relatedness in large population-based cohorts is of great importance in genome research. The usual approach for detecting closely related individuals is to plot allele sharing statistics, based on identity-by-state or identity-by-descent, in a two-dimensional scatterplot. This approach ignores that allele sharing data across individuals has in reality a higher dimensionality, and neither regards the compositional nature of the underlying counts of shared genotypes. In this paper we develop biplot methodology based on log-ratio principal component analysis that overcomes these restrictions. This leads to entirely new graphics that are essentially useful for exploring relatedness in genetic databases from homogeneous populations. The proposed method can be applied in an iterative manner, acting as a looking glass for more remote relationships that are harder to classify. Datasets from the 1,000 Genomes Project and the Genomes For Life-GCAT Project are used to illustrate the proposed method. The discriminatory power of the log-ratio biplot approach is compared with the classical plots in a simulation study. In a non-inbred homogeneous population the classification rate of the log-ratio principal component approach outperforms the classical graphics across the whole allele frequency spectrum, using only identity by state. In these circumstances, simulations show that with 35,000 independent bi-allelic variants, log-ratio principal component analysis, combined with discriminant analysis, can correctly classify relationships up to and including the fourth degree. Frontiers Media S.A. 
2019-04-24 /pmc/articles/PMC6491861/ /pubmed/31068965 http://dx.doi.org/10.3389/fgene.2019.00341 Text en Copyright © 2019 Graffelman, Galván Femenía, de Cid and Barceló Vidal. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Graffelman, Jan
Galván Femenía, Iván
de Cid, Rafael
Barceló Vidal, Carles
A Log-Ratio Biplot Approach for Exploring Genetic Relatedness Based on Identity by State
title A Log-Ratio Biplot Approach for Exploring Genetic Relatedness Based on Identity by State
title_sort log-ratio biplot approach for exploring genetic relatedness based on identity by state
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6491861/
https://www.ncbi.nlm.nih.gov/pubmed/31068965
http://dx.doi.org/10.3389/fgene.2019.00341