Pygenomics: manipulating genomic intervals and data files in Python

Bibliographic Details
Main authors: Tamazian, Gaik; Cherkasov, Nikolay; Kanapin, Alexander; Samsonova, Anastasia
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10246576/
https://www.ncbi.nlm.nih.gov/pubmed/37228014
http://dx.doi.org/10.1093/bioinformatics/btad346
Description
Summary: We present pygenomics, a Python package for working with genomic intervals and bioinformatic data files. The package implements interval operations, provides both an API and a CLI, and supports reading and writing data in widely used bioinformatic formats, including BAM, BED, GFF3, and VCF. The source code of pygenomics is provided with in-source documentation and type annotations and adheres to the functional programming paradigm. These features facilitate seamless integration of pygenomics routines into scripts and pipelines. The package is implemented in pure Python using only the standard library and includes a property-based testing framework. A comparison of pygenomics with other Python bioinformatic packages with respect to features and performance is presented. The performance comparison covers operations on genomic intervals, read alignments, and genomic variants and demonstrates that pygenomics is suitable for computationally efficient analysis.
Availability and implementation: The source code is available at https://gitlab.com/gtamazian/pygenomics.
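To give a sense of what "interval operations" in a typed, functional, standard-library-only style can look like, here is a minimal sketch. It is not the pygenomics API: the names `Interval`, `intersect`, and `merge` are hypothetical and chosen only for illustration, and the half-open, BED-style coordinate convention is an assumption.

```python
from typing import Iterable, List, NamedTuple, Optional


class Interval(NamedTuple):
    """Hypothetical half-open genomic interval [start, end) on a sequence."""

    seq: str
    start: int
    end: int


def intersect(a: Interval, b: Interval) -> Optional[Interval]:
    """Return the overlap of two intervals, or None if they do not overlap."""
    if a.seq != b.seq:
        return None
    start, end = max(a.start, b.start), min(a.end, b.end)
    # With half-open coordinates, touching intervals share no bases.
    return Interval(a.seq, start, end) if start < end else None


def merge(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals; the input is not modified."""
    merged: List[Interval] = []
    # Tuples sort by (seq, start, end), so neighbours end up adjacent.
    for cur in sorted(intervals):
        if merged and merged[-1].seq == cur.seq and cur.start <= merged[-1].end:
            last = merged[-1]
            merged[-1] = Interval(last.seq, last.start, max(last.end, cur.end))
        else:
            merged.append(cur)
    return merged
```

Both functions are pure: they return new values rather than mutating their arguments, which is one way to read the "functional programming paradigm" mentioned in the summary. For example, `merge([Interval("chr1", 150, 300), Interval("chr1", 100, 200)])` yields a single interval covering positions 100 to 300 on `chr1`.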