
Joint genotyping on the fly: Identifying variation among a sequenced panel of inbred lines

High-throughput sequencing is enabling remarkably deep surveys of genomic variation. It is now possible to completely sequence multiple individuals from a single species, yet the identification of variation among them remains an evolving computational challenge. This challenge is compounded for expe...


Bibliographic Details
Main author: Stone, Eric A.
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3337441/
https://www.ncbi.nlm.nih.gov/pubmed/22367192
http://dx.doi.org/10.1101/gr.129122.111
_version_ 1782231079535509504
author Stone, Eric A.
collection PubMed
description High-throughput sequencing is enabling remarkably deep surveys of genomic variation. It is now possible to completely sequence multiple individuals from a single species, yet the identification of variation among them remains an evolving computational challenge. This challenge is compounded for experimental organisms when strains are studied instead of individuals. In response, we present the Joint Genotyper for Inbred Lines (JGIL) as a method for obtaining genotypes and identifying variation among a large panel of inbred strains or lines. JGIL inputs the sequence reads from each line after their alignment to a common reference. Its probabilistic model includes site-specific parameters common to all lines that describe the frequency of nucleotides segregating in the population from which the inbred panel was derived. The distribution of line genotypes is conditional on these parameters and reflects the experimental design. Site-specific error probabilities, also common to all lines, parameterize the distribution of reads conditional on line genotype and realized coverage. Both sets of parameters are estimated per site from the aggregate read data, and posterior probabilities are calculated to decode the genotype of each line. We present an application of JGIL to 162 inbred Drosophila melanogaster lines from the Drosophila Genetic Reference Panel. We explore by simulation the effect of varying coverage, sequencing error, mapping error, and the number of lines. In doing so, we illustrate how JGIL is robust to moderate levels of error. Supported by these analyses, we advocate the importance of modeling the data and the experimental design when possible.
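The abstract describes a per-site model in which population nucleotide frequencies and error probabilities are shared by all lines and posterior probabilities decode each line's genotype. The following is a minimal Python sketch of that general idea, not the published JGIL implementation. It assumes a biallelic site, fully homozygous lines (no residual heterozygosity), a single symmetric error rate, and a plain EM fit. The function names `genotype_posteriors` and `fit_site` and the toy read counts are hypothetical.

```python
import math


def _clamp(x, lo=1e-6, hi=1 - 1e-6):
    """Keep a probability strictly inside (0, 1) so logs stay finite."""
    return min(max(x, lo), hi)


def genotype_posteriors(counts, p_alt, err):
    """Posterior probability that each line is homozygous for the alternate allele.

    counts: list of (ref_reads, alt_reads) per line at one site.
    p_alt:  assumed population frequency of the alternate allele (shared by all lines).
    err:    assumed per-read error probability (shared by all lines).
    """
    posteriors = []
    for ref, alt in counts:
        # Prior times read likelihood under each homozygous genotype (log scale).
        # The binomial coefficient is identical for both genotypes and cancels.
        log_ref = math.log(1 - p_alt) + ref * math.log(1 - err) + alt * math.log(err)
        log_alt = math.log(p_alt) + alt * math.log(1 - err) + ref * math.log(err)
        top = max(log_ref, log_alt)
        w_ref = math.exp(log_ref - top)
        w_alt = math.exp(log_alt - top)
        posteriors.append(w_alt / (w_ref + w_alt))
    return posteriors


def fit_site(counts, p_alt=0.5, err=0.01, n_iter=50):
    """Estimate p_alt and err for one site by EM, then decode every line."""
    total_reads = sum(ref + alt for ref, alt in counts)
    for _ in range(n_iter):
        # E-step: genotype posteriors under the current parameters.
        post = genotype_posteriors(counts, p_alt, err)
        # M-step: allele frequency is the mean posterior across lines.
        p_alt = _clamp(sum(post) / len(post))
        if total_reads > 0:
            # Expected number of reads that disagree with the line's genotype.
            mismatches = sum(q * ref + (1 - q) * alt
                             for q, (ref, alt) in zip(post, counts))
            err = _clamp(mismatches / total_reads)
    return p_alt, err, genotype_posteriors(counts, p_alt, err)


if __name__ == "__main__":
    # Toy data: five lines at one site; the last line has no coverage.
    toy_counts = [(20, 0), (18, 1), (0, 22), (1, 19), (0, 0)]
    p_hat, e_hat, post = fit_site(toy_counts)
    print(p_hat, e_hat, post)
```

In this sketch a line with no reads falls back to the estimated population frequency, which is one way shared site-level parameters let data from the whole panel inform each line's call.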
format Online
Article
Text
id pubmed-3337441
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-3337441 2012-11-01 Joint genotyping on the fly: Identifying variation among a sequenced panel of inbred lines Stone, Eric A. Genome Res Method
Cold Spring Harbor Laboratory Press 2012-05 /pmc/articles/PMC3337441/ /pubmed/22367192 http://dx.doi.org/10.1101/gr.129122.111 Text en © 2012, Published by Cold Spring Harbor Laboratory Press This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported License), as described at http://creativecommons.org/licenses/by-nc/3.0/.
spellingShingle Method
Stone, Eric A.
Joint genotyping on the fly: Identifying variation among a sequenced panel of inbred lines
title Joint genotyping on the fly: Identifying variation among a sequenced panel of inbred lines
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3337441/
https://www.ncbi.nlm.nih.gov/pubmed/22367192
http://dx.doi.org/10.1101/gr.129122.111
work_keys_str_mv AT stoneerica jointgenotypingontheflyidentifyingvariationamongasequencedpanelofinbredlines