
Above and beyond state-of-the-art approaches to investigate sequence data: summary of methods and results from the population-based association group at the Genetic Analysis Workshop 19

This paper summarizes the contributions from the Population-Based Association group at the Genetic Analysis Workshop 19. It provides an overview of the new statistical approaches tried out by group members in order to take best advantage of population-based sequence data. Although contributions were...


Bibliographic Details
Main author: Lorenzo Bermejo, Justo
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4895250/
https://www.ncbi.nlm.nih.gov/pubmed/26866664
http://dx.doi.org/10.1186/s12863-015-0310-0
author Lorenzo Bermejo, Justo
collection PubMed
description This paper summarizes the contributions from the Population-Based Association group at the Genetic Analysis Workshop 19. It provides an overview of the new statistical approaches tried out by group members in order to take best advantage of population-based sequence data. Although contributions were highly heterogeneous regarding the applied quality control criteria and the number of investigated variants, several technical issues were identified, leading to practical recommendations. Preliminary analyses revealed that Hurdle-negative binomial regression is a promising approach to investigate the distribution of allele counts instead of called genotypes from sequence data. Convergence problems, however, limited the use of this approach, creating a technical challenge shared by environment-stratified models used to investigate rare variant-environment interactions, as well as by rare variant haplotype analyses using well-established public software. Estimates of relatedness and population structure strongly depended on the allele frequency of selected variants for inference. Another practical recommendation was that dissenting probability values from standard and small-sample tests of a particular hypothesis may reflect a lack of validity of large-sample approximations. Novel statistical approaches that integrate evolutionary information showed some advantage to detect weak genetic signals, and Bayesian adjustment for confounding was able to efficiently estimate causal genetic effects. Haplotype association methods may constitute a valuable complement of collapsing approaches for sequence data. This paper reports on the experience of members of the Population-Based Association group with several novel, promising approaches to preprocessing and analyzing sequence data, and to following up identified association signals.
format Online
Article
Text
id pubmed-4895250
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4895250 2016-06-10 BMC Genet Proceedings BioMed Central 2016-02-03 /pmc/articles/PMC4895250/ /pubmed/26866664 http://dx.doi.org/10.1186/s12863-015-0310-0 Text en © Lorenzo Bermejo. 2015 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Above and beyond state-of-the-art approaches to investigate sequence data: summary of methods and results from the population-based association group at the Genetic Analysis Workshop 19
topic Proceedings