
Exploring functional variant discovery in non-coding regions with SInBaD

The 1000 Genomes Project and many similar ongoing large-scale sequencing efforts require new methods to predict functional variants in both coding and non-coding regions in order to understand the relationship between genotype and phenotype. We report the design of a new model, SInBaD (Sequence-Information-...


Bibliographic Details
Main authors: Lehmann, Kjong-Van, Chen, Ting
Format: Online Article Text
Language: English
Published: Oxford University Press 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3592431/
https://www.ncbi.nlm.nih.gov/pubmed/22941663
http://dx.doi.org/10.1093/nar/gks800
_version_ 1782262114393522176
author Lehmann, Kjong-Van
Chen, Ting
author_facet Lehmann, Kjong-Van
Chen, Ting
author_sort Lehmann, Kjong-Van
collection PubMed
description The 1000 Genomes Project and many similar ongoing large-scale sequencing efforts require new methods to predict functional variants in both coding and non-coding regions in order to understand the relationship between genotype and phenotype. We report the design of a new model, SInBaD (Sequence-Information-Based-Decision-model), which relies on nucleotide conservation information to evaluate any annotated human variant in all known exons, introns, splice junctions and promoter regions. SInBaD builds separate mathematical models for promoters, exons and introns, using the human disease mutations annotated in the Human Gene Mutation Database as the training dataset for functional variants. Ten-fold cross-validation shows high prediction accuracy. Validations on test datasets demonstrate that variants predicted as functional have a significantly higher occurrence in cancer patients. We also applied our model to variants found in four different individual human genomes to identify a set of functional variants that might be of interest for further studies. Scores for all possible variants in all annotated genes are available at http://tingchenlab.cmb.usc.edu/sinbad/. SInBaD supports the current standard format for genotyping data, the Variant Call Format (VCF 4.0), making it easy to integrate into any existing next-generation sequencing pipeline. The accuracy of SNP detection poses the only limitation to the use of SInBaD.
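The abstract notes that SInBaD accepts VCF 4.0 input. As a rough illustration of what feeding such a file into a scoring step might involve, the sketch below pulls single-nucleotide variants out of VCF data lines using only the standard tab-separated column layout (CHROM, POS, ID, REF, ALT, ...). The names `Variant` and `iter_snvs` are hypothetical and are not part of SInBaD; the record does not describe the tool's actual interface or how it computes scores.

```python
from typing import Iterable, Iterator, NamedTuple


class Variant(NamedTuple):
    """Hypothetical container for one single-nucleotide variant."""
    chrom: str
    pos: int
    ref: str
    alt: str


def iter_snvs(vcf_lines: Iterable[str]) -> Iterator[Variant]:
    """Yield single-nucleotide variants from VCF text lines.

    Assumes the standard VCF column order; header lines ('#'),
    indels and malformed rows are skipped.
    """
    bases = set("ACGT")
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # meta-information or column header line
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # not a complete data row
        chrom, pos, _id, ref, alt_field = fields[:5]
        # ALT may list several alleles separated by commas.
        for alt in alt_field.split(","):
            if ref in bases and alt in bases:
                yield Variant(chrom, int(pos), ref, alt)


example = [
    "##fileformat=VCFv4.0",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t12345\t.\tA\tG,T\t50\tPASS\t.",
    "2\t500\trs1\tAT\tA\t30\tPASS\t.",  # deletion, skipped
]
snvs = list(iter_snvs(example))
```

Each yielded `Variant` could then be looked up in a table of precomputed scores; how such a table is organised is not stated in this record.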
format Online
Article
Text
id pubmed-3592431
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3592431 2013-03-08 Exploring functional variant discovery in non-coding regions with SInBaD Lehmann, Kjong-Van Chen, Ting Nucleic Acids Res Methods Online The 1000 Genomes Project and many similar ongoing large-scale sequencing efforts require new methods to predict functional variants in both coding and non-coding regions in order to understand the relationship between genotype and phenotype. We report the design of a new model, SInBaD (Sequence-Information-Based-Decision-model), which relies on nucleotide conservation information to evaluate any annotated human variant in all known exons, introns, splice junctions and promoter regions. SInBaD builds separate mathematical models for promoters, exons and introns, using the human disease mutations annotated in the Human Gene Mutation Database as the training dataset for functional variants. Ten-fold cross-validation shows high prediction accuracy. Validations on test datasets demonstrate that variants predicted as functional have a significantly higher occurrence in cancer patients. We also applied our model to variants found in four different individual human genomes to identify a set of functional variants that might be of interest for further studies. Scores for all possible variants in all annotated genes are available at http://tingchenlab.cmb.usc.edu/sinbad/. SInBaD supports the current standard format for genotyping data, the Variant Call Format (VCF 4.0), making it easy to integrate into any existing next-generation sequencing pipeline. The accuracy of SNP detection poses the only limitation to the use of SInBaD. Oxford University Press 2013-01 2012-08-30 /pmc/articles/PMC3592431/ /pubmed/22941663 http://dx.doi.org/10.1093/nar/gks800 Text en © The Author(s) 2012. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methods Online
Lehmann, Kjong-Van
Chen, Ting
Exploring functional variant discovery in non-coding regions with SInBaD
title Exploring functional variant discovery in non-coding regions with SInBaD
title_full Exploring functional variant discovery in non-coding regions with SInBaD
title_fullStr Exploring functional variant discovery in non-coding regions with SInBaD
title_full_unstemmed Exploring functional variant discovery in non-coding regions with SInBaD
title_short Exploring functional variant discovery in non-coding regions with SInBaD
title_sort exploring functional variant discovery in non-coding regions with sinbad
topic Methods Online
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3592431/
https://www.ncbi.nlm.nih.gov/pubmed/22941663
http://dx.doi.org/10.1093/nar/gks800
work_keys_str_mv AT lehmannkjongvan exploringfunctionalvariantdiscoveryinnoncodingregionswithsinbad
AT chenting exploringfunctionalvariantdiscoveryinnoncodingregionswithsinbad