
PathoQC: Computationally Efficient Read Preprocessing and Quality Control for High-Throughput Sequencing Data Sets

Quality control and read preprocessing are critical steps in the analysis of data sets generated from high-throughput genomic screens. In the most extreme cases, improper preprocessing can negatively affect downstream analyses and may lead to incorrect biological conclusions. Here, we present PathoQ...


Bibliographic Details
Main Authors:	Hong, Changjin; Manimaran, Solaiappan; Johnson, William Evan
Format:	Online Article Text
Language:	English
Published:	Libertas Academica 2015
Subjects:	Software or Database Review
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4429651/ https://www.ncbi.nlm.nih.gov/pubmed/25983538 http://dx.doi.org/10.4137/CIN.S13890

_version_	1782371073699872768
author	Hong, Changjin Manimaran, Solaiappan Johnson, William Evan
author_facet	Hong, Changjin Manimaran, Solaiappan Johnson, William Evan
author_sort	Hong, Changjin
collection	PubMed
description	Quality control and read preprocessing are critical steps in the analysis of data sets generated from high-throughput genomic screens. In the most extreme cases, improper preprocessing can negatively affect downstream analyses and may lead to incorrect biological conclusions. Here, we present PathoQC, a streamlined toolkit that seamlessly combines the benefits of several popular quality control software approaches for preprocessing next-generation sequencing data. PathoQC provides a variety of quality control options appropriate for most high-throughput sequencing applications. PathoQC is primarily developed as a module in the PathoScope software suite for metagenomic analysis. However, PathoQC is also available as an open-source Python module that can run as a stand-alone application or can be easily integrated into any bioinformatics workflow. PathoQC achieves high performance by supporting parallel computation and is an effective tool that removes technical sequencing artifacts and facilitates robust downstream analysis. The PathoQC software package is available at http://sourceforge.net/projects/PathoScope/.
format	Online Article Text
id	pubmed-4429651
institution	National Center for Biotechnology Information
language	English
publishDate	2015
publisher	Libertas Academica
record_format	MEDLINE/PubMed
spelling	pubmed-4429651 2015-05-15 PathoQC: Computationally Efficient Read Preprocessing and Quality Control for High-Throughput Sequencing Data Sets Hong, Changjin Manimaran, Solaiappan Johnson, William Evan Cancer Inform Software or Database Review Quality control and read preprocessing are critical steps in the analysis of data sets generated from high-throughput genomic screens. In the most extreme cases, improper preprocessing can negatively affect downstream analyses and may lead to incorrect biological conclusions. Here, we present PathoQC, a streamlined toolkit that seamlessly combines the benefits of several popular quality control software approaches for preprocessing next-generation sequencing data. PathoQC provides a variety of quality control options appropriate for most high-throughput sequencing applications. PathoQC is primarily developed as a module in the PathoScope software suite for metagenomic analysis. However, PathoQC is also available as an open-source Python module that can run as a stand-alone application or can be easily integrated into any bioinformatics workflow. PathoQC achieves high performance by supporting parallel computation and is an effective tool that removes technical sequencing artifacts and facilitates robust downstream analysis. The PathoQC software package is available at http://sourceforge.net/projects/PathoScope/. Libertas Academica 2015-05-12 /pmc/articles/PMC4429651/ /pubmed/25983538 http://dx.doi.org/10.4137/CIN.S13890 Text en © 2014 the author(s), publisher and licensee Libertas Academica Limited This is an open-access article distributed under the terms of the Creative Commons CC-BY-NC 3.0 License.
spellingShingle	Software or Database Review Hong, Changjin Manimaran, Solaiappan Johnson, William Evan PathoQC: Computationally Efficient Read Preprocessing and Quality Control for High-Throughput Sequencing Data Sets
title	PathoQC: Computationally Efficient Read Preprocessing and Quality Control for High-Throughput Sequencing Data Sets
title_full	PathoQC: Computationally Efficient Read Preprocessing and Quality Control for High-Throughput Sequencing Data Sets
title_fullStr	PathoQC: Computationally Efficient Read Preprocessing and Quality Control for High-Throughput Sequencing Data Sets
title_full_unstemmed	PathoQC: Computationally Efficient Read Preprocessing and Quality Control for High-Throughput Sequencing Data Sets
title_short	PathoQC: Computationally Efficient Read Preprocessing and Quality Control for High-Throughput Sequencing Data Sets
title_sort	pathoqc: computationally efficient read preprocessing and quality control for high-throughput sequencing data sets
topic	Software or Database Review
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4429651/ https://www.ncbi.nlm.nih.gov/pubmed/25983538 http://dx.doi.org/10.4137/CIN.S13890
work_keys_str_mv	AT hongchangjin pathoqccomputationallyefficientreadpreprocessingandqualitycontrolforhighthroughputsequencingdatasets AT manimaransolaiappan pathoqccomputationallyefficientreadpreprocessingandqualitycontrolforhighthroughputsequencingdatasets AT johnsonwilliamevan pathoqccomputationallyefficientreadpreprocessingandqualitycontrolforhighthroughputsequencingdatasets
