
TMC-SNPdb: an Indian germline variant database derived from whole exome sequences

Cancer is predominantly a somatic disease. A mutant allele present in a cancer cell genome is considered somatic when it is absent from both the paired normal genome and public SNP databases. However, the current build of dbSNP, the most comprehensive public SNP database, inadequately represents sever...


Bibliographic Details
Main Authors: Upadhyay, Pawan; Gardi, Nilesh; Desai, Sanket; Sahoo, Bikram; Singh, Ankita; Togar, Trupti; Iyer, Prajish; Prasad, Ratnam; Chandrani, Pratik; Gupta, Sudeep; Dutt, Amit
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4940432/
https://www.ncbi.nlm.nih.gov/pubmed/27402678
http://dx.doi.org/10.1093/database/baw104
author Upadhyay, Pawan
Gardi, Nilesh
Desai, Sanket
Sahoo, Bikram
Singh, Ankita
Togar, Trupti
Iyer, Prajish
Prasad, Ratnam
Chandrani, Pratik
Gupta, Sudeep
Dutt, Amit
collection PubMed
description Cancer is predominantly a somatic disease. A mutant allele present in a cancer cell genome is considered somatic when it is absent from both the paired normal genome and public SNP databases. However, the current build of dbSNP, the most comprehensive public SNP database, inadequately represents several non-European Caucasian populations, which limits cancer genomic analyses of data from these populations. We present the Tata Memorial Centre-SNP database (TMC-SNPdb) as the first open source, flexible, upgradable, and freely available SNP database (accessible through dbSNP build 149 and ANNOVAR). It represents 114 309 unique germline variants generated from whole exome data of 62 normal samples derived from cancer patients of Indian origin. The TMC-SNPdb is presented with a companion subtraction tool, which can be run from the command line or through an easy-to-use graphical user interface, and which can deplete additional Indian population-specific SNPs over and above the dbSNP and 1000 Genomes databases. Using an institutionally generated whole exome data set of 132 samples of Indian origin, we demonstrate that TMC-SNPdb could deplete 42, 33 and 28% of the false positive somatic events remaining after dbSNP depletion in Indian origin tongue, gallbladder, and cervical cancer samples, respectively. Beyond cancer somatic analyses, we anticipate utility of the TMC-SNPdb in several Mendelian germline diseases. In addition to dbSNP build 149 and ANNOVAR, the TMC-SNPdb along with the subtraction tool is available for download in the public domain at the following: Database URL: http://www.actrec.gov.in/pi-webpages/AmitDutt/TMCSNP/TMCSNPdp.html
format Online
Article
Text
id pubmed-4940432
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4940432 2016-07-13 Database (Oxford) Original Article Oxford University Press 2016-07-09 /pmc/articles/PMC4940432/ /pubmed/27402678 http://dx.doi.org/10.1093/database/baw104 Text en © The Author(s) 2016. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title TMC-SNPdb: an Indian germline variant database derived from whole exome sequences
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4940432/
https://www.ncbi.nlm.nih.gov/pubmed/27402678
http://dx.doi.org/10.1093/database/baw104