Using population data for assessing next-generation sequencing performance

Motivation: During the past 4 years, whole-exome sequencing has become a standard tool for finding rare variants causing Mendelian disorders. In that time, there has also been a proliferation of both sequencing platforms and approaches to analyse their output. This requires approaches to assess the...

Bibliographic Details
Main authors: Houniet, Darren T., Rahman, Thahira J., Al Turki, Saeed, Hurles, Matthew E., Xu, Yaobo, Goodship, Judith, Keavney, Bernard, Santibanez Koref, Mauro
Format: Online Article Text
Language: English
Published: Oxford University Press 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4271148/
https://www.ncbi.nlm.nih.gov/pubmed/25236458
http://dx.doi.org/10.1093/bioinformatics/btu606
author Houniet, Darren T.
Rahman, Thahira J.
Al Turki, Saeed
Hurles, Matthew E.
Xu, Yaobo
Goodship, Judith
Keavney, Bernard
Santibanez Koref, Mauro
author_sort Houniet, Darren T.
collection PubMed
description Motivation: During the past 4 years, whole-exome sequencing has become a standard tool for finding rare variants causing Mendelian disorders. In that time, there has also been a proliferation of both sequencing platforms and approaches to analyse their output. This requires approaches to assess the performance of different methods. Traditionally, criteria such as comparison with microarray data or a number of known polymorphic sites have been used. Here we expand such approaches, developing a maximum likelihood framework and using it to estimate the sensitivity and specificity of whole-exome sequencing data. Results: Using whole-exome sequencing data for a panel of 19 individuals, we show that estimated sensitivity and specificity are similar to those calculated using microarray data as a reference. We explore the effect of frequency misspecification arising from using an inappropriately selected population and find that, although the estimates are affected, the rankings across procedures remain the same. Availability and implementation: An implementation using Perl and R can be found at busso.ncl.ac.uk (Username: igm101; Password: Z1z1nts). Contact: Darren.Houniet@ogt.com; mauro.santibanez-koref@newcastle.ac.uk
format Online
Article
Text
id pubmed-4271148
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4271148 2015-01-08 Bioinformatics Original Papers Oxford University Press 2015-01-01 2014-09-17 /pmc/articles/PMC4271148/ /pubmed/25236458 http://dx.doi.org/10.1093/bioinformatics/btu606 Text en © The Author 2014. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Using population data for assessing next-generation sequencing performance
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4271148/
https://www.ncbi.nlm.nih.gov/pubmed/25236458
http://dx.doi.org/10.1093/bioinformatics/btu606