Global inference of disease-causing single nucleotide variants from exome sequencing data

BACKGROUND: Whole exome sequencing (WES) has recently emerged as an effective approach for identifying genetic variants underlying human diseases. However, considerable time and labour are needed for careful investigation of candidate variants. Although filtration based on population frequencies and...

Bibliographic Details
Main Authors: Wu, Mengmeng; Chen, Ting; Jiang, Rui
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5260102/
https://www.ncbi.nlm.nih.gov/pubmed/28155632
http://dx.doi.org/10.1186/s12859-016-1325-x
author Wu, Mengmeng
Chen, Ting
Jiang, Rui
collection PubMed
description BACKGROUND: Whole exome sequencing (WES) has recently emerged as an effective approach for identifying genetic variants underlying human diseases. However, considerable time and labour are needed for careful investigation of candidate variants. Although filtration based on population frequencies and functional prediction scores can effectively remove common and neutral variants, hundreds or even thousands of rare deleterious variants still remain. In addition, current WES platforms also provide variant information in flanking noncoding regions, such as promoters, introns and splice sites. Despite being recognized to harbour causal variants, these regions are usually ignored by current analysis pipelines. RESULTS: We present a novel computational method, called Glints, to overcome the above limitations. Glints is capable of identifying disease-causing SNVs in both coding and flanking noncoding regions from exome sequencing data. The principle behind Glints is that disease-causing variants should manifest their effect at both the variant and gene levels. Specifically, Glints integrates 14 types of functional scores, including predictions for both coding and noncoding variants, and 9 types of association scores, which help identify disease-relevant genes. We conducted large-scale simulation studies based on 1000 Genomes Project data and demonstrated the effectiveness of our method in both coding and flanking noncoding regions. We also applied Glints to two real exome sequencing studies and demonstrated its effectiveness for uncovering disease-causing SNVs. Both standalone software and a web server are available at our website http://bioinfo.au.tsinghua.edu.cn/jianglab/glints. CONCLUSIONS: Glints is effective for uncovering disease-causing SNVs in coding and flanking noncoding regions, which is supported by both simulation and real case studies. Glints is expected to be a useful tool for human genetics research based on exome sequencing data.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1325-x) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5260102
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5260102 2017-01-26 Global inference of disease-causing single nucleotide variants from exome sequencing data Wu, Mengmeng Chen, Ting Jiang, Rui BMC Bioinformatics Research BioMed Central 2016-12-23 /pmc/articles/PMC5260102/ /pubmed/28155632 http://dx.doi.org/10.1186/s12859-016-1325-x Text en
© The Author(s). 2016 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Global inference of disease-causing single nucleotide variants from exome sequencing data
topic Research