
Meta-QC-Chain: Comprehensive and Fast Quality Control Method for Metagenomic Data

Next-generation sequencing (NGS) technology has revolutionized and significantly impacted metagenomic research. However, NGS data usually contain sequencing artifacts such as low-quality reads and contaminating reads, which will significantly compromise downstream analysis. Many quality control...


Bibliographic Details
Main authors: Zhou, Qian; Su, Xiaoquan; Jing, Gongchao; Ning, Kang
Format: Online Article Text
Language: English
Published: Elsevier 2014
Subjects: Application Note
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4411374/
https://www.ncbi.nlm.nih.gov/pubmed/24508279
http://dx.doi.org/10.1016/j.gpb.2014.01.002
_version_ 1782368462228684800
author Zhou, Qian
Su, Xiaoquan
Jing, Gongchao
Ning, Kang
collection PubMed
description Next-generation sequencing (NGS) technology has revolutionized and significantly impacted metagenomic research. However, NGS data usually contain sequencing artifacts such as low-quality reads and contaminating reads, which will significantly compromise downstream analysis. Many quality control (QC) tools have been proposed; however, few of them have been verified to be suitable or efficient for metagenomic data, which are composed of multiple genomes and are more complex than other kinds of NGS data. Here we present a metagenomic data QC method named Meta-QC-Chain. Meta-QC-Chain combines multiple QC functions: technical tests describe input data status and identify potential errors, quality trimming filters poor sequencing-quality bases and reads, and contamination screening identifies higher eukaryotic species, which are considered contamination for metagenomic data. Most computing processes are optimized using parallel programming. Testing on an 8-GB real dataset showed that Meta-QC-Chain trimmed low sequencing-quality reads and contaminating reads, and the whole quality control procedure was completed within 20 min. Therefore, Meta-QC-Chain provides a comprehensive, useful and high-performance QC tool for metagenomic data. Meta-QC-Chain is publicly available for free at: http://computationalbioenergy.org/meta-qc-chain.html.
format Online
Article
Text
id pubmed-4411374
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-4411374 2015-05-06 Meta-QC-Chain: Comprehensive and Fast Quality Control Method for Metagenomic Data Zhou, Qian Su, Xiaoquan Jing, Gongchao Ning, Kang Genomics Proteomics Bioinformatics Application Note Next-generation sequencing (NGS) technology has revolutionized and significantly impacted metagenomic research. However, NGS data usually contain sequencing artifacts such as low-quality reads and contaminating reads, which will significantly compromise downstream analysis. Many quality control (QC) tools have been proposed; however, few of them have been verified to be suitable or efficient for metagenomic data, which are composed of multiple genomes and are more complex than other kinds of NGS data. Here we present a metagenomic data QC method named Meta-QC-Chain. Meta-QC-Chain combines multiple QC functions: technical tests describe input data status and identify potential errors, quality trimming filters poor sequencing-quality bases and reads, and contamination screening identifies higher eukaryotic species, which are considered contamination for metagenomic data. Most computing processes are optimized using parallel programming. Testing on an 8-GB real dataset showed that Meta-QC-Chain trimmed low sequencing-quality reads and contaminating reads, and the whole quality control procedure was completed within 20 min. Therefore, Meta-QC-Chain provides a comprehensive, useful and high-performance QC tool for metagenomic data. Meta-QC-Chain is publicly available for free at: http://computationalbioenergy.org/meta-qc-chain.html. Elsevier 2014-02 2014-02-04 /pmc/articles/PMC4411374/ /pubmed/24508279 http://dx.doi.org/10.1016/j.gpb.2014.01.002 Text en © 2014 Beijing Institute of Genomics, Chinese Academy of Sciences and Genetics Society of China. Production and hosting by Elsevier B.V. All rights reserved.
http://creativecommons.org/licenses/by-nc-sa/3.0/ This is an open access article under the CC BY-NC-SA license (http://creativecommons.org/licenses/by-nc-sa/3.0/).
title Meta-QC-Chain: Comprehensive and Fast Quality Control Method for Metagenomic Data
topic Application Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4411374/
https://www.ncbi.nlm.nih.gov/pubmed/24508279
http://dx.doi.org/10.1016/j.gpb.2014.01.002