Using population data for assessing next-generation sequencing performance
Motivation: During the past 4 years, whole-exome sequencing has become a standard tool for finding rare variants causing Mendelian disorders. In that time, there has also been a proliferation of both sequencing platforms and approaches to analyse their output. This requires approaches to assess the...
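Main authors: | Houniet, Darren T.; Rahman, Thahira J.; Al Turki, Saeed; Hurles, Matthew E.; Xu, Yaobo; Goodship, Judith; Keavney, Bernard; Santibanez Koref, Mauro |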
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2015 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4271148/ https://www.ncbi.nlm.nih.gov/pubmed/25236458 http://dx.doi.org/10.1093/bioinformatics/btu606 |
author | Houniet, Darren T.; Rahman, Thahira J.; Al Turki, Saeed; Hurles, Matthew E.; Xu, Yaobo; Goodship, Judith; Keavney, Bernard; Santibanez Koref, Mauro |
author_sort | Houniet, Darren T. |
collection | PubMed |
description | Motivation: During the past 4 years, whole-exome sequencing has become a standard tool for finding rare variants causing Mendelian disorders. In that time, there has also been a proliferation of both sequencing platforms and approaches to analyse their output. This requires approaches to assess the performance of different methods. Traditionally, criteria such as comparison with microarray data or a number of known polymorphic sites have been used. Here we expand such approaches, developing a maximum likelihood framework and using it to estimate the sensitivity and specificity of whole-exome sequencing data. Results: Using whole-exome sequencing data for a panel of 19 individuals, we show that estimated sensitivity and specificity are similar to those calculated using microarray data as a reference. We explore the effect of frequency misspecification arising from using an inappropriately selected population and find that, although the estimates are affected, the rankings across procedures remain the same. Availability and implementation: An implementation using Perl and R can be found at busso.ncl.ac.uk (Username: igm101; Password: Z1z1nts). Contact: Darren.Houniet@ogt.com; mauro.santibanez-koref@newcastle.ac.uk |
format | Online Article Text |
id | pubmed-4271148 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-4271148; 2015-01-08; Using population data for assessing next-generation sequencing performance; Houniet, Darren T.; Rahman, Thahira J.; Al Turki, Saeed; Hurles, Matthew E.; Xu, Yaobo; Goodship, Judith; Keavney, Bernard; Santibanez Koref, Mauro; Bioinformatics; Original Papers; Motivation: During the past 4 years, whole-exome sequencing has become a standard tool for finding rare variants causing Mendelian disorders. In that time, there has also been a proliferation of both sequencing platforms and approaches to analyse their output. This requires approaches to assess the performance of different methods. Traditionally, criteria such as comparison with microarray data or a number of known polymorphic sites have been used. Here we expand such approaches, developing a maximum likelihood framework and using it to estimate the sensitivity and specificity of whole-exome sequencing data. Results: Using whole-exome sequencing data for a panel of 19 individuals, we show that estimated sensitivity and specificity are similar to those calculated using microarray data as a reference. We explore the effect of frequency misspecification arising from using an inappropriately selected population and find that, although the estimates are affected, the rankings across procedures remain the same. Availability and implementation: An implementation using Perl and R can be found at busso.ncl.ac.uk (Username: igm101; Password: Z1z1nts). Contact: Darren.Houniet@ogt.com; mauro.santibanez-koref@newcastle.ac.uk; Oxford University Press; 2015-01-01; 2014-09-17; /pmc/articles/PMC4271148/; /pubmed/25236458; http://dx.doi.org/10.1093/bioinformatics/btu606; Text; en; © The Author 2014. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Using population data for assessing next-generation sequencing performance |
topic | Original Papers |