SeqTailor: a user-friendly webserver for the extraction of DNA or protein sequences from next-generation sequencing data
Human whole-genome-sequencing reveals about 4 000 000 genomic variants per individual. These data are mostly stored as VCF-format files. Although many variant analysis methods accept VCF as input, many other tools require DNA or protein sequences, particularly for splicing prediction, sequence align...
Main authors: | Zhang, Peng; Boisson, Bertrand; Stenson, Peter D; Cooper, David N; Casanova, Jean-Laurent; Abel, Laurent; Itan, Yuval |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2019 |
Subjects: | Web Server Issue |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6602489/ https://www.ncbi.nlm.nih.gov/pubmed/31045209 http://dx.doi.org/10.1093/nar/gkz326 |
_version_ | 1783431387930427392 |
---|---|
author | Zhang, Peng Boisson, Bertrand Stenson, Peter D Cooper, David N Casanova, Jean-Laurent Abel, Laurent Itan, Yuval |
author_sort | Zhang, Peng |
collection | PubMed |
description | Human whole-genome-sequencing reveals about 4 000 000 genomic variants per individual. These data are mostly stored as VCF-format files. Although many variant analysis methods accept VCF as input, many other tools require DNA or protein sequences, particularly for splicing prediction, sequence alignment, phylogenetic analysis, and structure prediction. However, there is no existing webserver capable of extracting DNA/protein sequences for genomic variants from VCF files in a user-friendly and efficient manner. We developed the SeqTailor webserver to bridge this gap, by enabling rapid extraction of (i) DNA sequences around genomic variants, with customizable window sizes and options to annotate the splice sites closest to the variants and to consider the neighboring variants within the window; and (ii) protein sequences encoded by the DNA sequences around genomic variants, with built-in SnpEff annotator and customizable window sizes. SeqTailor supports 11 species, including: human (GRCh37/GRCh38), chimpanzee, mouse, rat, cow, chicken, lizard, zebrafish, fruitfly, Arabidopsis and rice. Standalone programs are provided for command-line-based needs. SeqTailor streamlines the sequence extraction process, and accelerates the analysis of genomic variants with software requiring DNA/protein sequences. It will facilitate the study of genomic variation, by increasing the feasibility of sequence-based analysis and prediction. The SeqTailor webserver is freely available at http://shiva.rockefeller.edu/SeqTailor/. |
format | Online Article Text |
id | pubmed-6602489 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-6602489 2019-07-05 SeqTailor: a user-friendly webserver for the extraction of DNA or protein sequences from next-generation sequencing data Zhang, Peng Boisson, Bertrand Stenson, Peter D Cooper, David N Casanova, Jean-Laurent Abel, Laurent Itan, Yuval Nucleic Acids Res Web Server Issue Oxford University Press 2019-07-02 2019-05-02 /pmc/articles/PMC6602489/ /pubmed/31045209 http://dx.doi.org/10.1093/nar/gkz326 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
title | SeqTailor: a user-friendly webserver for the extraction of DNA or protein sequences from next-generation sequencing data |
topic | Web Server Issue |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6602489/ https://www.ncbi.nlm.nih.gov/pubmed/31045209 http://dx.doi.org/10.1093/nar/gkz326 |
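
The description above mentions extracting DNA sequences around genomic variants with a customizable window size. As a rough illustration of that idea only, the following Python sketch takes a reference sequence held in memory and a VCF-style variant (1-based position, REF and ALT alleles) and returns the reference and variant sequences with a flanking window on each side. It is not SeqTailor's implementation: the function name `extract_window` and its parameters are hypothetical, and the sketch does not cover splice-site annotation, neighboring variants, or protein translation.

```python
def extract_window(reference: str, pos: int, ref: str, alt: str, window: int):
    """Return (reference_window, variant_window) around one variant.

    Hypothetical helper for illustration, not part of SeqTailor.
    `pos` is 1-based, as in the VCF POS column.
    """
    start = pos - 1            # convert to 0-based index
    end = start + len(ref)     # exclusive end of the REF allele

    # Check that the REF allele matches the reference at this position.
    if reference[start:end].upper() != ref.upper():
        raise ValueError("REF allele does not match the reference sequence")

    # Flanks are clipped at the sequence boundaries.
    left = reference[max(0, start - window):start]
    right = reference[end:end + window]

    return left + ref + right, left + alt + right


if __name__ == "__main__":
    genome = "ACGTACGTAC"
    # Single-nucleotide variant A>G at position 5, window of 3 bases.
    print(extract_window(genome, 5, "A", "G", 3))
    # Deletion AC>A at position 5, window of 2 bases.
    print(extract_window(genome, 5, "AC", "A", 2))
```

Returning both sequences side by side keeps the sketch usable for tools that compare reference and variant inputs; a real pipeline would read the reference from an indexed FASTA file rather than from a string.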