TMC-SNPdb: an Indian germline variant database derived from whole exome sequences
Cancer is predominantly a somatic disease. A mutant allele present in a cancer cell genome is considered somatic when it is absent from the paired normal genome as well as from public SNP databases. The current build of dbSNP, the most comprehensive public SNP database, however, inadequately represents sever...
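Main authors: | Upadhyay, Pawan; Gardi, Nilesh; Desai, Sanket; Sahoo, Bikram; Singh, Ankita; Togar, Trupti; Iyer, Prajish; Prasad, Ratnam; Chandrani, Pratik; Gupta, Sudeep; Dutt, Amit |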
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4940432/ https://www.ncbi.nlm.nih.gov/pubmed/27402678 http://dx.doi.org/10.1093/database/baw104 |
_version_ | 1782442143346851840 |
---|---|
author | Upadhyay, Pawan; Gardi, Nilesh; Desai, Sanket; Sahoo, Bikram; Singh, Ankita; Togar, Trupti; Iyer, Prajish; Prasad, Ratnam; Chandrani, Pratik; Gupta, Sudeep; Dutt, Amit |
author_facet | Upadhyay, Pawan; Gardi, Nilesh; Desai, Sanket; Sahoo, Bikram; Singh, Ankita; Togar, Trupti; Iyer, Prajish; Prasad, Ratnam; Chandrani, Pratik; Gupta, Sudeep; Dutt, Amit |
author_sort | Upadhyay, Pawan |
collection | PubMed |
description | Cancer is predominantly a somatic disease. A mutant allele present in a cancer cell genome is considered somatic when it is absent from the paired normal genome as well as from public SNP databases. The current build of dbSNP, the most comprehensive public SNP database, however, inadequately represents several non-European Caucasian populations, which limits cancer genomic analyses of data from these populations. We present the Tata Memorial Centre-SNP database (TMC-SNPdb), the first open-source, flexible, upgradable, and freely available SNP database (accessible through dbSNP build 149 and ANNOVAR). It represents 114 309 unique germline variants generated from whole-exome data of 62 normal samples derived from cancer patients of Indian origin. The TMC-SNPdb is presented with a companion subtraction tool that can be run from the command line or through an easy-to-use graphical user interface, and that can deplete additional Indian population-specific SNPs over and above those in the dbSNP and 1000 Genomes databases. Using an institutionally generated whole-exome data set of 132 samples of Indian origin, we demonstrate that TMC-SNPdb could deplete 42%, 33% and 28% of the false-positive somatic events remaining after dbSNP depletion in Indian-origin tongue, gallbladder, and cervical cancer samples, respectively. Beyond cancer somatic analyses, we anticipate utility of the TMC-SNPdb in several Mendelian germline diseases. In addition to dbSNP build 149 and ANNOVAR, the TMC-SNPdb along with the subtraction tool is available for download in the public domain at the following: Database URL: http://www.actrec.gov.in/pi-webpages/AmitDutt/TMCSNP/TMCSNPdp.html |
format | Online Article Text |
id | pubmed-4940432 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-4940432 2016-07-13 TMC-SNPdb: an Indian germline variant database derived from whole exome sequences. Upadhyay, Pawan; Gardi, Nilesh; Desai, Sanket; Sahoo, Bikram; Singh, Ankita; Togar, Trupti; Iyer, Prajish; Prasad, Ratnam; Chandrani, Pratik; Gupta, Sudeep; Dutt, Amit. Database (Oxford). Original Article. Oxford University Press, 2016-07-09. /pmc/articles/PMC4940432/ /pubmed/27402678 http://dx.doi.org/10.1093/database/baw104 Text en © The Author(s) 2016. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | TMC-SNPdb: an Indian germline variant database derived from whole exome sequences |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4940432/ https://www.ncbi.nlm.nih.gov/pubmed/27402678 http://dx.doi.org/10.1093/database/baw104 |
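The description above mentions a companion subtraction tool that depletes known germline variants from candidate somatic calls. The tool itself is not shown in this record, so the following is only a minimal sketch of that general idea, not the authors' implementation. The `Variant` tuple, the tab-separated `chrom pos ref alt` layout, the function names, and the example variants are all hypothetical and chosen for illustration.

```python
from typing import Iterable, List, NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant key: chromosome, position, ref and alt alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant(line: str) -> Variant:
    """Parse one tab-separated 'chrom pos ref alt' record (assumed layout)."""
    chrom, pos, ref, alt = line.rstrip("\n").split("\t")[:4]
    # Normalise allele case so 'c' and 'C' compare equal.
    return Variant(chrom, int(pos), ref.upper(), alt.upper())


def subtract_germline(candidates: Iterable[Variant],
                      germline: Iterable[Variant]) -> List[Variant]:
    """Keep only candidates that do not exactly match a known germline variant."""
    known = set(germline)
    return [v for v in candidates if v not in known]


def depletion_percent(before: int, after: int) -> float:
    """Percentage of candidate calls removed by the subtraction step."""
    return 0.0 if before == 0 else 100.0 * (before - after) / before


if __name__ == "__main__":
    # Made-up candidate somatic calls, for illustration only.
    candidates = [
        parse_variant("chr1\t100\tA\tG"),
        parse_variant("chr2\t200\tc\tt"),
        parse_variant("chr3\t300\tG\tA"),
    ]
    # Made-up stand-in for a population germline database.
    germline_db = [Variant("chr2", 200, "C", "T")]

    kept = subtract_germline(candidates, germline_db)
    print(kept)
    print(round(depletion_percent(len(candidates), len(kept)), 1))
```

Exact matching on chromosome, position, and both alleles is the simplest possible criterion; a real pipeline would also need consistent chromosome naming and indel normalisation before comparing records.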