
dbVar structural variant cluster set for data analysis and variant comparison

dbVar houses over 3 million submitted structural variants (SSV) from 120 human studies including copy number variations (CNV), insertions, deletions, inversions, translocations, and complex chromosomal rearrangements. Users can submit multiple SSVs to dbVar that are presumably identical, but were a...


Bibliographic Details
Main Authors: Phan, Lon, Hsu, Jeffrey, Tri, Le Quang Minh, Willi, Michaela, Mansour, Tamer, Kai, Yan, Garner, John, Lopez, John, Busby, Ben
Format: Online Article Text
Language: English
Published: F1000Research 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5345777/
https://www.ncbi.nlm.nih.gov/pubmed/28357035
http://dx.doi.org/10.12688/f1000research.8290.2
_version_ 1782513778861015040
author Phan, Lon
Hsu, Jeffrey
Tri, Le Quang Minh
Willi, Michaela
Mansour, Tamer
Kai, Yan
Garner, John
Lopez, John
Busby, Ben
author_sort Phan, Lon
collection PubMed
description dbVar houses over 3 million submitted structural variants (SSV) from 120 human studies including copy number variations (CNV), insertions, deletions, inversions, translocations, and complex chromosomal rearrangements. Users can submit multiple SSVs to dbVar that are presumably identical, but were ascertained by different platforms and samples, to calculate whether the variant is rare or common in the population and allow for cross validation. However, because SSV genomic location reporting can vary – including fuzzy locations where the start and/or end points are not precisely known – analysis, comparison, annotation, and reporting of SSVs across studies can be difficult. This project was initiated by the Structural Variant Comparison Group for the purpose of generating a non-redundant set of genomic regions defined by counts of concordance for all human SSVs placed on RefSeq assembly GRCh38 (RefSeq accession GCF_000001405.26). We intend that the availability of these regions, called structural variant clusters (SVCs), will facilitate the analysis, annotation, and exchange of SV data and allow for simplified display in genomic sequence viewers for improved variant interpretation. Sets of SVCs were generated by variant type for each of the 120 studies as well as for a combined set across all studies. Starting from 3.64 million SSVs, 2.5 million and 3.4 million non-redundant SVCs with count >=1 were generated by variant type for each study and across all studies, respectively. In addition, we have developed utilities for annotating, searching, and filtering SVC data in GVF format for computing summary statistics, exporting data for genomic viewers, and annotating the SVC using external data sources.
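The idea of a non-redundant region "defined by counts of concordance" can be illustrated with a minimal sketch. The abstract does not state the matching rule, so this example assumes the simplest one: SSVs are concordant when they share the same chromosome, start, end, and variant type. The `SSV` record, the `cluster_counts` helper, and the sample coordinates are all hypothetical, and fuzzy start/end locations are not handled.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class SSV(NamedTuple):
    """Hypothetical, simplified submitted structural variant placement."""
    study: str
    chrom: str
    start: int
    end: int
    var_type: str


def cluster_counts(ssvs: Iterable[SSV]) -> Counter:
    """Count SSVs that share chromosome, start, end, and variant type.

    Assumption: exact-coordinate matching stands in for whatever
    concordance rule the actual SVC pipeline uses.
    """
    return Counter((v.chrom, v.start, v.end, v.var_type) for v in ssvs)


# Made-up example placements, not real dbVar data.
ssvs = [
    SSV("study_A", "chr1", 1000, 5000, "deletion"),
    SSV("study_B", "chr1", 1000, 5000, "deletion"),
    SSV("study_B", "chr1", 1000, 5000, "duplication"),
    SSV("study_C", "chr2", 200, 900, "inversion"),
]

clusters = cluster_counts(ssvs)
for (chrom, start, end, var_type), count in sorted(clusters.items()):
    print(chrom, start, end, var_type, count)
```

Each key plays the role of one cluster and its count records how many submissions agree on that placement. Restricting the input to a single study before counting would give the per-study variant of the same grouping.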
format Online
Article
Text
id pubmed-5345777
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher F1000Research
record_format MEDLINE/PubMed
spelling pubmed-5345777 2017-03-28 dbVar structural variant cluster set for data analysis and variant comparison Phan, Lon Hsu, Jeffrey Tri, Le Quang Minh Willi, Michaela Mansour, Tamer Kai, Yan Garner, John Lopez, John Busby, Ben F1000Res Software Tool Article
F1000Research 2017-02-28 /pmc/articles/PMC5345777/ /pubmed/28357035 http://dx.doi.org/10.12688/f1000research.8290.2 Text en Copyright: © 2017 Phan L et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The author(s) is/are employees of the US Government and therefore domestic copyright protection in USA does not apply to this work. The work may be protected under the copyright laws of other jurisdictions when used in those jurisdictions.
title dbVar structural variant cluster set for data analysis and variant comparison
topic Software Tool Article