
VarSCAT: A computational tool for sequence context annotations of genomic variants

The sequence contexts of genomic variants play important roles in understanding the biological significance of variants and potential sequencing-related variant-calling issues. However, methods for assessing the diverse sequence contexts of genomic variants, such as tandem repeats, and for producing unambiguous annota...


Bibliographic Details
Main Authors: Wang, Ning; Khan, Sofia; Elo, Laura L.
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10446208/
https://www.ncbi.nlm.nih.gov/pubmed/37566612
http://dx.doi.org/10.1371/journal.pcbi.1010727
_version_ 1785094354256265216
author Wang, Ning
Khan, Sofia
Elo, Laura L.
collection PubMed
description The sequence contexts of genomic variants play important roles in understanding the biological significance of variants and potential sequencing-related variant-calling issues. However, methods for assessing the diverse sequence contexts of genomic variants, such as tandem repeats, and for producing unambiguous annotations have been limited. Herein, we describe the Variant Sequence Context Annotation Tool (VarSCAT) for annotating the sequence contexts of genomic variants, including breakpoint ambiguities, flanking bases of variants, wildtype/mutated DNA sequences, variant nomenclatures, distances between adjacent variants, tandem repeat regions, and custom annotation with user-customizable options. Our analyses demonstrate that VarSCAT is more versatile and customizable than the currently available methods or strategies for annotating variants in short tandem repeat (STR) regions or insertions and deletions (indels) with breakpoint ambiguity. Variant sequence context annotations of high-confidence human variant sets with VarSCAT revealed that more than 75% of all human individual germline and clinically relevant indels have breakpoint ambiguities. Moreover, we illustrate that more than 80% of human individual germline small variants in STR regions are indels and that the sizes of these indels correlate with STR motif sizes. VarSCAT is available from https://github.com/elolab/VarSCAT.
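The "breakpoint ambiguity" mentioned in the description can be illustrated with a minimal sketch. This is not VarSCAT's implementation or interface; the function names `deletion_shift_range` and `apply_deletion` and the toy reference sequence are assumptions made only for illustration. The idea shown is that a deletion inside a repetitive stretch can be placed at several start positions that all yield the same mutated sequence.

```python
def apply_deletion(ref: str, start: int, length: int) -> str:
    """Return ref with ref[start:start+length] removed (0-based coordinates)."""
    return ref[:start] + ref[start + length:]


def deletion_shift_range(ref: str, start: int, length: int) -> tuple[int, int]:
    """Return the leftmost and rightmost 0-based start positions at which a
    deletion of `length` bases produces the same mutated sequence.

    Hypothetical helper for illustration only; not part of VarSCAT.
    """
    # Shift left while the base entering the deleted window on the left
    # equals the base leaving it on the right.
    left = start
    while left > 0 and ref[left - 1] == ref[left + length - 1]:
        left -= 1
    # Shift right while the base leaving the window on the left
    # equals the base entering it on the right.
    right = start
    while right + length < len(ref) and ref[right] == ref[right + length]:
        right += 1
    return left, right


# Toy reference containing a (CA)x3 repeat; delete one "CA" unit at index 4.
reference = "GGCACACATT"
leftmost, rightmost = deletion_shift_range(reference, 4, 2)
outcomes = {apply_deletion(reference, s, 2) for s in range(leftmost, rightmost + 1)}
print(leftmost, rightmost, outcomes)
```

If the leftmost and rightmost positions differ, the deletion has an ambiguous breakpoint: a variant caller may report any position in that range, which is why left-aligned and right-aligned representations of the same indel can disagree.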
format Online
Article
Text
id pubmed-10446208
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10446208 2023-08-24 PLoS Comput Biol Research Article
Public Library of Science 2023-08-11 /pmc/articles/PMC10446208/ /pubmed/37566612 http://dx.doi.org/10.1371/journal.pcbi.1010727 Text en © 2023 Wang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title VarSCAT: A computational tool for sequence context annotations of genomic variants
topic Research Article