dbVar structural variant cluster set for data analysis and variant comparison
dbVar houses over 3 million submitted structural variants (SSVs) from 120 human studies, including copy number variations (CNVs), insertions, deletions, inversions, translocations, and complex chromosomal rearrangements. Users can submit multiple SSVs to dbVar that are presumably identical, but were a...
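Main Authors: | Phan, Lon; Hsu, Jeffrey; Tri, Le Quang Minh; Willi, Michaela; Mansour, Tamer; Kai, Yan; Garner, John; Lopez, John; Busby, Ben |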
---|---|
Format: | Online Article Text |
Language: | English |
Published: | F1000Research 2017 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5345777/ https://www.ncbi.nlm.nih.gov/pubmed/28357035 http://dx.doi.org/10.12688/f1000research.8290.2 |
_version_ | 1782513778861015040 |
---|---|
author | Phan, Lon; Hsu, Jeffrey; Tri, Le Quang Minh; Willi, Michaela; Mansour, Tamer; Kai, Yan; Garner, John; Lopez, John; Busby, Ben |
author_facet | Phan, Lon; Hsu, Jeffrey; Tri, Le Quang Minh; Willi, Michaela; Mansour, Tamer; Kai, Yan; Garner, John; Lopez, John; Busby, Ben |
author_sort | Phan, Lon |
collection | PubMed |
description | dbVar houses over 3 million submitted structural variants (SSVs) from 120 human studies, including copy number variations (CNVs), insertions, deletions, inversions, translocations, and complex chromosomal rearrangements. Users can submit multiple SSVs to dbVar that are presumably identical but were ascertained by different platforms and samples, to calculate whether the variant is rare or common in the population and to allow for cross-validation. However, because SSV genomic location reporting can vary – including fuzzy locations where the start and/or end points are not precisely known – analysis, comparison, annotation, and reporting of SSVs across studies can be difficult. This project was initiated by the Structural Variant Comparison Group to generate a non-redundant set of genomic regions defined by counts of concordance for all human SSVs placed on RefSeq assembly GRCh38 (RefSeq accession GCF_000001405.26). We intend that the availability of these regions, called structural variant clusters (SVCs), will facilitate the analysis, annotation, and exchange of SV data and allow for simplified display in genomic sequence viewers for improved variant interpretation. Sets of SVCs were generated by variant type for each of the 120 studies as well as for a combined set across all studies. Starting from 3.64 million SSVs, 2.5 million and 3.4 million non-redundant SVCs with count >=1 were generated by variant type for each study and across all studies, respectively. In addition, we have developed utilities for annotating, searching, and filtering SVC data in GVF format, for computing summary statistics, for exporting data to genomic viewers, and for annotating SVCs using external data sources. |
format | Online Article Text |
id | pubmed-5345777 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | F1000Research |
record_format | MEDLINE/PubMed |
spelling | pubmed-5345777 2017-03-28 F1000Res Software Tool Article F1000Research 2017-02-28 /pmc/articles/PMC5345777/ /pubmed/28357035 http://dx.doi.org/10.12688/f1000research.8290.2 Text en Copyright: © 2017 Phan L et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The author(s) is/are employees of the US Government and therefore domestic copyright protection in USA does not apply to this work. The work may be protected under the copyright laws of other jurisdictions when used in those jurisdictions. |
title | dbVar structural variant cluster set for data analysis and variant comparison |
topic | Software Tool Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5345777/ https://www.ncbi.nlm.nih.gov/pubmed/28357035 http://dx.doi.org/10.12688/f1000research.8290.2 |