A Log-Ratio Biplot Approach for Exploring Genetic Relatedness Based on Identity by State
The detection of cryptic relatedness in large population-based cohorts is of great importance in genome research. The usual approach for detecting closely related individuals is to plot allele sharing statistics, based on identity-by-state or identity-by-descent, in a two-dimensional scatterplot. Th...
Main Authors: | Graffelman, Jan; Galván Femenía, Iván; de Cid, Rafael; Barceló Vidal, Carles |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2019 |
Subjects: | Genetics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6491861/ https://www.ncbi.nlm.nih.gov/pubmed/31068965 http://dx.doi.org/10.3389/fgene.2019.00341 |
_version_ | 1783415032944525312 |
---|---|
author | Graffelman, Jan Galván Femenía, Iván de Cid, Rafael Barceló Vidal, Carles |
author_facet | Graffelman, Jan Galván Femenía, Iván de Cid, Rafael Barceló Vidal, Carles |
author_sort | Graffelman, Jan |
collection | PubMed |
description | The detection of cryptic relatedness in large population-based cohorts is of great importance in genome research. The usual approach for detecting closely related individuals is to plot allele sharing statistics, based on identity-by-state or identity-by-descent, in a two-dimensional scatterplot. This approach ignores the fact that allele sharing data across individuals in reality have a higher dimensionality, and it also disregards the compositional nature of the underlying counts of shared genotypes. In this paper we develop biplot methodology based on log-ratio principal component analysis that overcomes these restrictions. This leads to entirely new graphics that are essentially useful for exploring relatedness in genetic databases from homogeneous populations. The proposed method can be applied in an iterative manner, acting as a looking glass for more remote relationships that are harder to classify. Datasets from the 1,000 Genomes Project and the Genomes For Life-GCAT Project are used to illustrate the proposed method. The discriminatory power of the log-ratio biplot approach is compared with the classical plots in a simulation study. In a non-inbred homogeneous population the classification rate of the log-ratio principal component approach outperforms the classical graphics across the whole allele frequency spectrum, using only identity by state. In these circumstances, simulations show that with 35,000 independent bi-allelic variants, log-ratio principal component analysis, combined with discriminant analysis, can correctly classify relationships up to and including the fourth degree. |
format | Online Article Text |
id | pubmed-6491861 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-64918612019-05-08 A Log-Ratio Biplot Approach for Exploring Genetic Relatedness Based on Identity by State Graffelman, Jan Galván Femenía, Iván de Cid, Rafael Barceló Vidal, Carles Front Genet Genetics The detection of cryptic relatedness in large population-based cohorts is of great importance in genome research. The usual approach for detecting closely related individuals is to plot allele sharing statistics, based on identity-by-state or identity-by-descent, in a two-dimensional scatterplot. This approach ignores that allele sharing data across individuals has in reality a higher dimensionality, and neither regards the compositional nature of the underlying counts of shared genotypes. In this paper we develop biplot methodology based on log-ratio principal component analysis that overcomes these restrictions. This leads to entirely new graphics that are essentially useful for exploring relatedness in genetic databases from homogeneous populations. The proposed method can be applied in an iterative manner, acting as a looking glass for more remote relationships that are harder to classify. Datasets from the 1,000 Genomes Project and the Genomes For Life-GCAT Project are used to illustrate the proposed method. The discriminatory power of the log-ratio biplot approach is compared with the classical plots in a simulation study. In a non-inbred homogeneous population the classification rate of the log-ratio principal component approach outperforms the classical graphics across the whole allele frequency spectrum, using only identity by state. In these circumstances, simulations show that with 35,000 independent bi-allelic variants, log-ratio principal component analysis, combined with discriminant analysis, can correctly classify relationships up to and including the fourth degree. Frontiers Media S.A. 
2019-04-24 /pmc/articles/PMC6491861/ /pubmed/31068965 http://dx.doi.org/10.3389/fgene.2019.00341 Text en Copyright © 2019 Graffelman, Galván Femenía, de Cid and Barceló Vidal. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Graffelman, Jan Galván Femenía, Iván de Cid, Rafael Barceló Vidal, Carles A Log-Ratio Biplot Approach for Exploring Genetic Relatedness Based on Identity by State |
title | A Log-Ratio Biplot Approach for Exploring Genetic Relatedness Based on Identity by State |
title_full | A Log-Ratio Biplot Approach for Exploring Genetic Relatedness Based on Identity by State |
title_fullStr | A Log-Ratio Biplot Approach for Exploring Genetic Relatedness Based on Identity by State |
title_full_unstemmed | A Log-Ratio Biplot Approach for Exploring Genetic Relatedness Based on Identity by State |
title_short | A Log-Ratio Biplot Approach for Exploring Genetic Relatedness Based on Identity by State |
title_sort | log-ratio biplot approach for exploring genetic relatedness based on identity by state |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6491861/ https://www.ncbi.nlm.nih.gov/pubmed/31068965 http://dx.doi.org/10.3389/fgene.2019.00341 |
work_keys_str_mv | AT graffelmanjan alogratiobiplotapproachforexploringgeneticrelatednessbasedonidentitybystate AT galvanfemeniaivan alogratiobiplotapproachforexploringgeneticrelatednessbasedonidentitybystate AT decidrafael alogratiobiplotapproachforexploringgeneticrelatednessbasedonidentitybystate AT barcelovidalcarles alogratiobiplotapproachforexploringgeneticrelatednessbasedonidentitybystate AT graffelmanjan logratiobiplotapproachforexploringgeneticrelatednessbasedonidentitybystate AT galvanfemeniaivan logratiobiplotapproachforexploringgeneticrelatednessbasedonidentitybystate AT decidrafael logratiobiplotapproachforexploringgeneticrelatednessbasedonidentitybystate AT barcelovidalcarles logratiobiplotapproachforexploringgeneticrelatednessbasedonidentitybystate |
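
The description above mentions log-ratio principal component analysis of compositional counts of shared genotypes. As a rough illustration only, the sketch below applies a centred log-ratio (clr) transform to a small table of counts and then runs a PCA through the singular value decomposition. It is not the authors' implementation: the function names `clr` and `logratio_pca`, the three-column layout (variants sharing 0, 1 or 2 alleles identical by state), the toy numbers, and the pseudocount used to handle zero counts are all assumptions made for this example.

```python
import numpy as np


def clr(counts, pseudocount=0.5):
    """Centred log-ratio transform of each row of a count table.

    The pseudocount is an assumption of this sketch: it only serves to
    avoid log(0) when a pair shares no variants in some category.
    """
    x = np.asarray(counts, dtype=float) + pseudocount
    logx = np.log(x)
    # Subtract the row mean of the logs, i.e. divide by the geometric mean.
    return logx - logx.mean(axis=1, keepdims=True)


def logratio_pca(counts, pseudocount=0.5):
    """PCA of clr-transformed compositions via the SVD."""
    z = clr(counts, pseudocount)
    zc = z - z.mean(axis=0, keepdims=True)      # column-centre
    u, s, vt = np.linalg.svd(zc, full_matrices=False)
    scores = u * s                              # coordinates of the pairs
    loadings = vt.T                             # directions of the parts
    explained = s**2 / np.sum(s**2)             # share of variance per axis
    return scores, loadings, explained


# Hypothetical toy data: one row per pair of individuals, columns are the
# numbers of variants sharing 0, 1 and 2 alleles identical by state.
# Each row sums to 35,000 variants; the values are invented.
ibs_counts = np.array([
    [1200, 14800, 19000],
    [0,    12000, 23000],
    [2500, 15500, 17000],
    [2600, 15300, 17100],
    [50,    9000, 25950],
])

scores, loadings, explained = logratio_pca(ibs_counts)
print(np.round(scores[:, :2], 3))   # first two axes, as plotted in a biplot
print(np.round(explained, 3))
```

Because clr rows sum to zero, a composition with D parts has at most D - 1 informative principal axes, so the last proportion in `explained` is numerically zero here. Plotting the first two columns of `scores` together with the rows of `loadings` would give a biplot in the sense described above; the discriminant analysis step is not covered by this sketch.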