HTSeq—a Python framework to work with high-throughput sequencing data
Motivation: A large choice of tools exists for many standard tasks in the analysis of high-throughput sequencing (HTS) data. However, once a project deviates from standard workflows, custom scripts are needed. Results: We present HTSeq, a Python library to facilitate the rapid development of such sc...
Main authors: | Anders, Simon; Pyl, Paul Theodor; Huber, Wolfgang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2015 |
Subjects: | Original Papers |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4287950/ https://www.ncbi.nlm.nih.gov/pubmed/25260700 http://dx.doi.org/10.1093/bioinformatics/btu638 |
_version_ | 1782351888196304896 |
---|---|
author | Anders, Simon; Pyl, Paul Theodor; Huber, Wolfgang |
author_facet | Anders, Simon; Pyl, Paul Theodor; Huber, Wolfgang |
author_sort | Anders, Simon |
collection | PubMed |
description | Motivation: A large choice of tools exists for many standard tasks in the analysis of high-throughput sequencing (HTS) data. However, once a project deviates from standard workflows, custom scripts are needed. Results: We present HTSeq, a Python library to facilitate the rapid development of such scripts. HTSeq offers parsers for many common data formats in HTS projects, as well as classes to represent data, such as genomic coordinates, sequences, sequencing reads, alignments, gene model information and variant calls, and provides data structures that allow for querying via genomic coordinates. We also present htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes. Availability and implementation: HTSeq is released as an open-source software under the GNU General Public Licence and available from http://www-huber.embl.de/HTSeq or from the Python Package Index at https://pypi.python.org/pypi/HTSeq. Contact: sanders@fs.tum.de |
format | Online Article Text |
id | pubmed-4287950 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-4287950; 2015-01-30; HTSeq—a Python framework to work with high-throughput sequencing data; Anders, Simon; Pyl, Paul Theodor; Huber, Wolfgang; Bioinformatics; Original Papers; Oxford University Press; 2015-01-15; 2014-09-25; /pmc/articles/PMC4287950/; /pubmed/25260700; http://dx.doi.org/10.1093/bioinformatics/btu638; Text; en; © The Author 2014. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | HTSeq—a Python framework to work with high-throughput sequencing data |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4287950/ https://www.ncbi.nlm.nih.gov/pubmed/25260700 http://dx.doi.org/10.1093/bioinformatics/btu638 |