
PileLine: a toolbox to handle genome position information in next-generation sequencing studies

BACKGROUND: Genomic position (GP) files currently used in next-generation sequencing (NGS) studies are always difficult to manipulate due to their huge size and the lack of appropriate tools to properly manage them. The structure of these flat files is based on representing one line per position tha...


Bibliographic Details
Main authors: Glez-Peña, Daniel; Gómez-López, Gonzalo; Reboiro-Jato, Miguel; Fdez-Riverola, Florentino; Pisano, David G
Format: Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3037855/
https://www.ncbi.nlm.nih.gov/pubmed/21261974
http://dx.doi.org/10.1186/1471-2105-12-31
author Glez-Peña, Daniel
Gómez-López, Gonzalo
Reboiro-Jato, Miguel
Fdez-Riverola, Florentino
Pisano, David G
author_sort Glez-Peña, Daniel
collection PubMed
description BACKGROUND: Genomic position (GP) files currently used in next-generation sequencing (NGS) studies are always difficult to manipulate due to their huge size and the lack of appropriate tools to properly manage them. The structure of these flat files is based on representing one line per position that has been covered by at least one aligned read, imposing significant restrictions from a computational performance perspective. RESULTS: PileLine implements a flexible command-line toolkit providing specific support to the management, filtering, comparison and annotation of GP files produced by NGS experiments. PileLine tools are coded in Java and run on both UNIX (Linux, Mac OS) and Windows platforms. The set of tools comprising PileLine are designed to be memory efficient by performing fast seek on-disk operations over sorted GP files. CONCLUSIONS: Our novel toolbox has been extensively tested taking into consideration performance issues. It is publicly available at http://sourceforge.net/projects/pilelinetools under the GNU LGPL license. Full documentation including common use cases and guided analysis workflows is available at http://sing.ei.uvigo.es/pileline.
format Text
id pubmed-3037855
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3037855 2011-02-12
PileLine: a toolbox to handle genome position information in next-generation sequencing studies
Glez-Peña, Daniel; Gómez-López, Gonzalo; Reboiro-Jato, Miguel; Fdez-Riverola, Florentino; Pisano, David G
BMC Bioinformatics
Software
BioMed Central 2011-01-24
/pmc/articles/PMC3037855/ /pubmed/21261974 http://dx.doi.org/10.1186/1471-2105-12-31
Text en
Copyright © 2011 Glez-Peña et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title PileLine: a toolbox to handle genome position information in next-generation sequencing studies
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3037855/
https://www.ncbi.nlm.nih.gov/pubmed/21261974
http://dx.doi.org/10.1186/1471-2105-12-31