
Model based heritability scores for high-throughput sequencing data

BACKGROUND: Heritability of a phenotypic or molecular trait measures the proportion of variance that is attributable to genotypic variance. It is an important concept in breeding and genetics. Few methods are available for calculating heritability for traits derived from high-throughput sequencing....
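The definition above can be written as a ratio, heritability = genotypic variance / (genotypic variance + residual variance). As a minimal sketch only, the Python function below estimates that ratio with a classical one-way ANOVA (method of moments) on values that are assumed to be already transformed to an approximately Gaussian scale, with the same number of biological replicates per strain. It is not the mixed-model fitting described in the article and it does not use the HeritSeq package; the function name `anova_heritability` and its arguments are hypothetical.

```python
from collections import defaultdict


def anova_heritability(values, strains):
    """Estimate var_g / (var_g + var_e) from a balanced one-way layout.

    values  : expression values for one feature (assumed already transformed)
    strains : strain label for each value, in the same order
    """
    # Group the replicate values by strain.
    groups = defaultdict(list)
    for value, strain in zip(values, strains):
        groups[strain].append(float(value))

    sizes = {len(g) for g in groups.values()}
    if len(groups) < 2 or len(sizes) != 1 or min(sizes) < 2:
        raise ValueError("need at least 2 strains with equal replicate counts >= 2")

    n = sizes.pop()   # replicates per strain
    k = len(groups)   # number of strains

    means = [sum(g) / n for g in groups.values()]
    grand_mean = sum(means) / k

    # Between-strain and within-strain mean squares.
    msb = n * sum((m - grand_mean) ** 2 for m in means) / (k - 1)
    msw = sum(
        (x - m) ** 2 for g, m in zip(groups.values(), means) for x in g
    ) / (k * (n - 1))

    # Moment estimators; a negative genotypic variance is truncated at zero.
    var_g = max((msb - msw) / n, 0.0)
    var_e = msw

    total = var_g + var_e
    return var_g / total if total > 0 else 0.0
```

With two strains and two replicates each, `anova_heritability([0, 2, 4, 6], ["a", "a", "b", "b"])` gives a between-strain mean square of 16 and a within-strain mean square of 2, so the estimate is 7/9. Count data from sequencing would normally need a transformation or a count model first, which is the motivation for the approaches the abstract compares.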


Bibliographic Details
Main authors: Rudra, Pratyaydipta, Shi, W. Jenny, Vestal, Brian, Russell, Pamela H., Odell, Aaron, Dowell, Robin D., Radcliffe, Richard A., Saba, Laura M., Kechris, Katerina
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5333443/
https://www.ncbi.nlm.nih.gov/pubmed/28253840
http://dx.doi.org/10.1186/s12859-017-1539-6
_version_ 1782511711552536576
author Rudra, Pratyaydipta
Shi, W. Jenny
Vestal, Brian
Russell, Pamela H.
Odell, Aaron
Dowell, Robin D.
Radcliffe, Richard A.
Saba, Laura M.
Kechris, Katerina
author_sort Rudra, Pratyaydipta
collection PubMed
description BACKGROUND: Heritability of a phenotypic or molecular trait measures the proportion of variance that is attributable to genotypic variance. It is an important concept in breeding and genetics. Few methods are available for calculating heritability for traits derived from high-throughput sequencing. RESULTS: We propose several statistical models and different methods to compute and test a heritability measure for such data based on linear and generalized linear mixed effects models. We also provide methodology for hypothesis testing and interval estimation. Our analyses show that, among the methods, the negative binomial mixed model (NB-fit), compound Poisson mixed model (CP-fit), and the variance stabilizing transformed linear mixed model (VST) outperform the voom-transformed linear mixed model (voom). NB-fit and VST appear to be more robust than CP-fit for estimating and testing the heritability scores, while NB-fit is the most computationally expensive. CP-fit performed best in terms of the coverage of the confidence intervals. In addition, we applied the methods to both microRNA (miRNA) and messenger RNA (mRNA) sequencing datasets from a recombinant inbred mouse panel. We show that miRNA and mRNA expression can be a highly heritable molecular trait in mouse, and that some top heritable features coincide with expression quantitative trait loci. CONCLUSIONS: The models and methods we investigated in this manuscript are applicable and extendable to sequencing experiments where some biological replicates are available and the environmental variation is properly controlled. To our knowledge, the CP-fit approach for assessing heritability was implemented here for the first time. All the methods presented, as well as the generation of simulated sequencing data under either negative binomial or compound Poisson mixed models, are provided in the R package HeritSeq.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1539-6) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5333443
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-53334432017-03-06 Model based heritability scores for high-throughput sequencing data Rudra, Pratyaydipta Shi, W. Jenny Vestal, Brian Russell, Pamela H. Odell, Aaron Dowell, Robin D. Radcliffe, Richard A. Saba, Laura M. Kechris, Katerina BMC Bioinformatics Methodology Article BACKGROUND: Heritability of a phenotypic or molecular trait measures the proportion of variance that is attributable to genotypic variance. It is an important concept in breeding and genetics. Few methods are available for calculating heritability for traits derived from high-throughput sequencing. RESULTS: We propose several statistical models and different methods to compute and test a heritability measure for such data based on linear and generalized linear mixed effects models. We also provide methodology for hypothesis testing and interval estimation. Our analyses show that, among the methods, the negative binomial mixed model (NB-fit), compound Poisson mixed model (CP-fit), and the variance stabilizing transformed linear mixed model (VST) outperform the voom-transformed linear mixed model (voom). NB-fit and VST appear to be more robust than CP-fit for estimating and testing the heritability scores, while NB-fit is the most computationally expensive. CP-fit performed best in terms of the coverage of the confidence intervals. In addition, we applied the methods to both microRNA (miRNA) and messenger RNA (mRNA) sequencing datasets from a recombinant inbred mouse panel. We show that miRNA and mRNA expression can be a highly heritable molecular trait in mouse, and that some top heritable features coincide with expression quantitative trait loci. CONCLUSIONS: The models and methods we investigated in this manuscript are applicable and extendable to sequencing experiments where some biological replicates are available and the environmental variation is properly controlled. To our knowledge, the CP-fit approach for assessing heritability was implemented here for the first time.
All the methods presented, as well as the generation of simulated sequencing data under either negative binomial or compound Poisson mixed models, are provided in the R package HeritSeq. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1539-6) contains supplementary material, which is available to authorized users. BioMed Central 2017-03-02 /pmc/articles/PMC5333443/ /pubmed/28253840 http://dx.doi.org/10.1186/s12859-017-1539-6 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Methodology Article
Rudra, Pratyaydipta
Shi, W. Jenny
Vestal, Brian
Russell, Pamela H.
Odell, Aaron
Dowell, Robin D.
Radcliffe, Richard A.
Saba, Laura M.
Kechris, Katerina
Model based heritability scores for high-throughput sequencing data
title Model based heritability scores for high-throughput sequencing data
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5333443/
https://www.ncbi.nlm.nih.gov/pubmed/28253840
http://dx.doi.org/10.1186/s12859-017-1539-6