
On the Aggregation of Multimarker Information for Marker-Set and Sequencing Data Analysis: Genotype Collapsing vs. Similarity Collapsing

Methods that collapse information across genetic markers when searching for association signals are gaining momentum in the literature. Although originally developed to achieve a better balance between retaining information and controlling degrees of freedom when performing multimarker association a...


Bibliographic Details
Main Authors:	Pongpanich, Monnat; Neely, Megan L.; Tzeng, Jung-Ying
Format:	Online Article Text
Language:	English
Published:	Frontiers Research Foundation 2012
Subjects:	Genetics
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3266618/ https://www.ncbi.nlm.nih.gov/pubmed/22303404 http://dx.doi.org/10.3389/fgene.2011.00110

_version_	1782222197967814656
author	Pongpanich, Monnat; Neely, Megan L.; Tzeng, Jung-Ying
author_sort	Pongpanich, Monnat
collection	PubMed
description	Methods that collapse information across genetic markers when searching for association signals are gaining momentum in the literature. Although originally developed to achieve a better balance between retaining information and controlling degrees of freedom when performing multimarker association analysis, these methods have recently been proven to be a powerful tool for identifying rare variants that contribute to complex phenotypes. The information among markers can be collapsed at the genotype level, which focuses on the mean of genetic information, or the similarity level, which focuses on the variance of genetic information. The aim of this work is to understand the strengths and weaknesses of these two collapsing strategies. Our results show that neither collapsing strategy outperforms the other across all simulated scenarios. Two factors that dominate the performance of these strategies are the signal-to-noise ratio and the underlying genetic architecture of the causal variants. Genotype collapsing is more sensitive to the marker set being contaminated by noise loci than similarity collapsing. In addition, genotype collapsing performs best when the genetic architecture of the causal variants is not complex (e.g., causal loci with similar effects and similar frequencies). Similarity collapsing is more robust as the complexity of the genetic architecture increases and outperforms genotype collapsing when the genetic architecture of the marker set becomes more sophisticated (e.g., causal loci with various effect sizes or frequencies and potential non-linear or interactive effects). Because the underlying genetic architecture is not known a priori, we also considered a two-stage analysis that combines the two top-performing methods from different collapsing strategies. We find that it is reasonably robust across all simulated scenarios.
format	Online Article Text
id	pubmed-3266618
institution	National Center for Biotechnology Information
language	English
publishDate	2012
publisher	Frontiers Research Foundation
record_format	MEDLINE/PubMed
spelling	pubmed-3266618 2012-02-02 On the Aggregation of Multimarker Information for Marker-Set and Sequencing Data Analysis: Genotype Collapsing vs. Similarity Collapsing Pongpanich, Monnat; Neely, Megan L.; Tzeng, Jung-Ying Front Genet Genetics Frontiers Research Foundation 2012-01-09 /pmc/articles/PMC3266618/ /pubmed/22303404 http://dx.doi.org/10.3389/fgene.2011.00110 Text en Copyright © 2012 Pongpanich, Neely and Tzeng. http://www.frontiersin.org/licenseagreement This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.
spellingShingle	Genetics Pongpanich, Monnat Neely, Megan L. Tzeng, Jung-Ying On the Aggregation of Multimarker Information for Marker-Set and Sequencing Data Analysis: Genotype Collapsing vs. Similarity Collapsing
title	On the Aggregation of Multimarker Information for Marker-Set and Sequencing Data Analysis: Genotype Collapsing vs. Similarity Collapsing
title_sort	on the aggregation of multimarker information for marker-set and sequencing data analysis: genotype collapsing vs. similarity collapsing
topic	Genetics
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3266618/ https://www.ncbi.nlm.nih.gov/pubmed/22303404 http://dx.doi.org/10.3389/fgene.2011.00110
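
The description above contrasts collapsing at the genotype level with collapsing at the similarity level. The following is a minimal sketch of that distinction, not the authors' implementation: the function names, the unweighted allele-count sum, and the IBS-style similarity measure are all illustrative assumptions, and the genotype matrix is made-up toy data (rows are subjects, columns are markers, entries are minor-allele counts 0, 1, or 2).

```python
import numpy as np

def genotype_collapse(G):
    """Genotype-level collapsing (hypothetical helper): reduce each subject's
    markers to a single score, here the unweighted sum of minor-allele counts."""
    G = np.asarray(G, dtype=float)
    return G.sum(axis=1)

def similarity_collapse(G):
    """Similarity-level collapsing (hypothetical helper): reduce the marker set
    to an n x n matrix of pairwise subject similarity. The measure used here is
    an assumed IBS-style proportion of alleles shared across markers."""
    G = np.asarray(G, dtype=float)
    n_markers = G.shape[1]
    # Per marker, |g_i - g_j| is 0, 1, or 2, so (2 - difference) counts shared alleles.
    diff = np.abs(G[:, None, :] - G[None, :, :])
    return (2.0 - diff).sum(axis=2) / (2.0 * n_markers)

# Toy data: 3 subjects, 4 markers (illustrative values only).
G = np.array([
    [0, 1, 0, 2],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
])

scores = genotype_collapse(G)   # one number per subject
S = similarity_collapse(G)      # one number per pair of subjects
print(scores)
print(S)
```

In this sketch the genotype-level summary would enter a regression as a single covariate per subject, whereas the similarity matrix would be compared with similarity in phenotype, which is one way to read the "mean" versus "variance" framing in the description.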

