Model based heritability scores for high-throughput sequencing data
BACKGROUND: Heritability of a phenotypic or molecular trait measures the proportion of variance that is attributable to genotypic variance. It is an important concept in breeding and genetics. Few methods are available for calculating heritability for traits derived from high-throughput sequencing....
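The definition in the summary above, heritability as the share of total variance attributable to genotypic variance, can be illustrated with a small numerical sketch. The code below is not the method of the article and does not use the HeritSeq package; it assumes a balanced design (the same number of replicates per strain), Gaussian data rather than sequencing counts, and a classical one-way ANOVA method-of-moments estimator. The function names `anova_heritability` and `simulate` are hypothetical and introduced only for illustration.

```python
import random
from statistics import mean


def anova_heritability(groups):
    """Estimate var_g / (var_g + var_e) from a balanced one-way layout.

    `groups` holds one inner list of replicate measurements per strain.
    Assumption: Gaussian-like data, not raw sequencing counts.
    """
    k = len(groups)       # number of strains
    n = len(groups[0])    # replicates per strain
    if k < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need a balanced design with >= 2 strains and >= 2 replicates")

    grand = mean(v for g in groups for v in g)
    strain_means = [mean(g) for g in groups]

    # Mean squares between strains and within strains.
    msb = n * sum((m - grand) ** 2 for m in strain_means) / (k - 1)
    msw = sum(
        (v - m) ** 2 for g, m in zip(groups, strain_means) for v in g
    ) / (k * (n - 1))

    # Method-of-moments variance components; the genetic one is truncated at zero.
    var_g = max((msb - msw) / n, 0.0)
    var_e = msw
    total = var_g + var_e
    return var_g / total if total > 0 else 0.0


def simulate(k, n, var_g, var_e, seed=0):
    """Draw k strains with n replicates each from a Gaussian random-effects model."""
    rng = random.Random(seed)
    data = []
    for _ in range(k):
        strain_effect = rng.gauss(0.0, var_g ** 0.5)
        data.append(
            [10.0 + strain_effect + rng.gauss(0.0, var_e ** 0.5) for _ in range(n)]
        )
    return data


if __name__ == "__main__":
    # True heritability in this simulation is 3 / (3 + 1) = 0.75.
    data = simulate(k=200, n=4, var_g=3.0, var_e=1.0)
    print(round(anova_heritability(data), 2))
```

With many strains the estimate should land near the simulated value of 0.75. Count data from sequencing would call for count-based or transformed models such as those named in the abstract (NB-fit, CP-fit, VST, voom) rather than this Gaussian shortcut.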
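Main authors: | Rudra, Pratyaydipta; Shi, W. Jenny; Vestal, Brian; Russell, Pamela H.; Odell, Aaron; Dowell, Robin D.; Radcliffe, Richard A.; Saba, Laura M.; Kechris, Katerina |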
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central, 2017 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5333443/ https://www.ncbi.nlm.nih.gov/pubmed/28253840 http://dx.doi.org/10.1186/s12859-017-1539-6 |
_version_ | 1782511711552536576 |
---|---|
author | Rudra, Pratyaydipta; Shi, W. Jenny; Vestal, Brian; Russell, Pamela H.; Odell, Aaron; Dowell, Robin D.; Radcliffe, Richard A.; Saba, Laura M.; Kechris, Katerina |
author_sort | Rudra, Pratyaydipta |
collection | PubMed |
description | BACKGROUND: Heritability of a phenotypic or molecular trait measures the proportion of variance that is attributable to genotypic variance. It is an important concept in breeding and genetics. Few methods are available for calculating heritability for traits derived from high-throughput sequencing. RESULTS: We propose several statistical models and different methods to compute and test a heritability measure for such data based on linear and generalized linear mixed effects models. We also provide methodology for hypothesis testing and interval estimation. Our analyses show that, among the methods, the negative binomial mixed model (NB-fit), compound Poisson mixed model (CP-fit), and the variance stabilizing transformed linear mixed model (VST) outperform the voom-transformed linear mixed model (voom). NB-fit and VST appear to be more robust than CP-fit for estimating and testing the heritability scores, while NB-fit is the most computationally expensive. CP-fit performed best in terms of the coverage of the confidence intervals. In addition, we applied the methods to both microRNA (miRNA) and messenger RNA (mRNA) sequencing datasets from a recombinant inbred mouse panel. We show that miRNA and mRNA expression can be highly heritable molecular traits in mouse, and that some top heritable features coincide with expression quantitative trait loci. CONCLUSIONS: The models and methods we investigated in this manuscript are applicable and extendable to sequencing experiments where some biological replicates are available and the environmental variation is properly controlled. To our knowledge, the CP-fit approach for assessing heritability was implemented here for the first time. All the methods presented, as well as the generation of simulated sequencing data under either negative binomial or compound Poisson mixed models, are provided in the R package HeritSeq. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1539-6) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5333443 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5333443 2017-03-06 Model based heritability scores for high-throughput sequencing data. BMC Bioinformatics. Methodology Article. BioMed Central 2017-03-02 /pmc/articles/PMC5333443/ /pubmed/28253840 http://dx.doi.org/10.1186/s12859-017-1539-6 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Model based heritability scores for high-throughput sequencing data |
topic | Methodology Article |