SeqSQC: A Bioconductor Package for Evaluating the Sample Quality of Next-generation Sequencing Data

Bibliographic Details
Main authors: Liu, Qian, Hu, Qiang, Yao, Song, Kwan, Marilyn L., Roh, Janise M., Zhao, Hua, Ambrosone, Christine B., Kushi, Lawrence H., Liu, Song, Zhu, Qianqian
Format: Online Article Text
Language: English
Published: Elsevier 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6620264/
https://www.ncbi.nlm.nih.gov/pubmed/30959223
http://dx.doi.org/10.1016/j.gpb.2018.07.006
_version_ 1783434011067023360
author Liu, Qian
Hu, Qiang
Yao, Song
Kwan, Marilyn L.
Roh, Janise M.
Zhao, Hua
Ambrosone, Christine B.
Kushi, Lawrence H.
Liu, Song
Zhu, Qianqian
author_facet Liu, Qian
Hu, Qiang
Yao, Song
Kwan, Marilyn L.
Roh, Janise M.
Zhao, Hua
Ambrosone, Christine B.
Kushi, Lawrence H.
Liu, Song
Zhu, Qianqian
author_sort Liu, Qian
collection PubMed
description As next-generation sequencing (NGS) technology has become widely used to identify genetic causal variants for various diseases and traits, a number of packages for checking NGS data quality have sprung up in public domains. In addition to the quality of sequencing data, sample quality issues, such as gender mismatch, abnormal inbreeding coefficient, cryptic relatedness, and population outliers, can also have fundamental impact on downstream analysis. However, there is a lack of tools specialized in identifying problematic samples from NGS data, often due to the limitation of sample size and variant counts. We developed SeqSQC, a Bioconductor package, to automate and accelerate sample cleaning in NGS data of any scale. SeqSQC is designed for efficient data storage and access, and equipped with interactive plots for intuitive data visualization to expedite the identification of problematic samples. SeqSQC is available at http://bioconductor.org/packages/SeqSQC.
format Online
Article
Text
id pubmed-6620264
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-66202642019-07-22 SeqSQC: A Bioconductor Package for Evaluating the Sample Quality of Next-generation Sequencing Data Liu, Qian Hu, Qiang Yao, Song Kwan, Marilyn L. Roh, Janise M. Zhao, Hua Ambrosone, Christine B. Kushi, Lawrence H. Liu, Song Zhu, Qianqian Genomics Proteomics Bioinformatics Method As next-generation sequencing (NGS) technology has become widely used to identify genetic causal variants for various diseases and traits, a number of packages for checking NGS data quality have sprung up in public domains. In addition to the quality of sequencing data, sample quality issues, such as gender mismatch, abnormal inbreeding coefficient, cryptic relatedness, and population outliers, can also have fundamental impact on downstream analysis. However, there is a lack of tools specialized in identifying problematic samples from NGS data, often due to the limitation of sample size and variant counts. We developed SeqSQC, a Bioconductor package, to automate and accelerate sample cleaning in NGS data of any scale. SeqSQC is designed for efficient data storage and access, and equipped with interactive plots for intuitive data visualization to expedite the identification of problematic samples. SeqSQC is available at http://bioconductor.org/packages/SeqSQC. Elsevier 2019-04 2019-04-05 /pmc/articles/PMC6620264/ /pubmed/30959223 http://dx.doi.org/10.1016/j.gpb.2018.07.006 Text en © 2019 The Authors http://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Method
Liu, Qian
Hu, Qiang
Yao, Song
Kwan, Marilyn L.
Roh, Janise M.
Zhao, Hua
Ambrosone, Christine B.
Kushi, Lawrence H.
Liu, Song
Zhu, Qianqian
SeqSQC: A Bioconductor Package for Evaluating the Sample Quality of Next-generation Sequencing Data
title SeqSQC: A Bioconductor Package for Evaluating the Sample Quality of Next-generation Sequencing Data
title_full SeqSQC: A Bioconductor Package for Evaluating the Sample Quality of Next-generation Sequencing Data
title_fullStr SeqSQC: A Bioconductor Package for Evaluating the Sample Quality of Next-generation Sequencing Data
title_full_unstemmed SeqSQC: A Bioconductor Package for Evaluating the Sample Quality of Next-generation Sequencing Data
title_short SeqSQC: A Bioconductor Package for Evaluating the Sample Quality of Next-generation Sequencing Data
title_sort seqsqc: a bioconductor package for evaluating the sample quality of next-generation sequencing data
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6620264/
https://www.ncbi.nlm.nih.gov/pubmed/30959223
http://dx.doi.org/10.1016/j.gpb.2018.07.006
work_keys_str_mv AT liuqian seqsqcabioconductorpackageforevaluatingthesamplequalityofnextgenerationsequencingdata
AT huqiang seqsqcabioconductorpackageforevaluatingthesamplequalityofnextgenerationsequencingdata
AT yaosong seqsqcabioconductorpackageforevaluatingthesamplequalityofnextgenerationsequencingdata
AT kwanmarilynl seqsqcabioconductorpackageforevaluatingthesamplequalityofnextgenerationsequencingdata
AT rohjanisem seqsqcabioconductorpackageforevaluatingthesamplequalityofnextgenerationsequencingdata
AT zhaohua seqsqcabioconductorpackageforevaluatingthesamplequalityofnextgenerationsequencingdata
AT ambrosonechristineb seqsqcabioconductorpackageforevaluatingthesamplequalityofnextgenerationsequencingdata
AT kushilawrenceh seqsqcabioconductorpackageforevaluatingthesamplequalityofnextgenerationsequencingdata
AT liusong seqsqcabioconductorpackageforevaluatingthesamplequalityofnextgenerationsequencingdata
AT zhuqianqian seqsqcabioconductorpackageforevaluatingthesamplequalityofnextgenerationsequencingdata