SeqSQC: A Bioconductor Package for Evaluating the Sample Quality of Next-generation Sequencing Data
As next-generation sequencing (NGS) technology has become widely used to identify genetic causal variants for various diseases and traits, a number of packages for checking NGS data quality have sprung up in public domains. In addition to the quality of sequencing data, sample quality issues, such a...
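Main Authors: | Liu, Qian; Hu, Qiang; Yao, Song; Kwan, Marilyn L.; Roh, Janise M.; Zhao, Hua; Ambrosone, Christine B.; Kushi, Lawrence H.; Liu, Song; Zhu, Qianqian |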
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2019 |
Subjects: | Method |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6620264/ https://www.ncbi.nlm.nih.gov/pubmed/30959223 http://dx.doi.org/10.1016/j.gpb.2018.07.006 |
_version_ | 1783434011067023360 |
---|---|
author | Liu, Qian Hu, Qiang Yao, Song Kwan, Marilyn L. Roh, Janise M. Zhao, Hua Ambrosone, Christine B. Kushi, Lawrence H. Liu, Song Zhu, Qianqian |
author_sort | Liu, Qian |
collection | PubMed |
description | As next-generation sequencing (NGS) technology has become widely used to identify genetic causal variants for various diseases and traits, a number of packages for checking NGS data quality have sprung up in public domains. In addition to the quality of sequencing data, sample quality issues, such as gender mismatch, abnormal inbreeding coefficient, cryptic relatedness, and population outliers, can also have fundamental impact on downstream analysis. However, there is a lack of tools specialized in identifying problematic samples from NGS data, often due to the limitation of sample size and variant counts. We developed SeqSQC, a Bioconductor package, to automate and accelerate sample cleaning in NGS data of any scale. SeqSQC is designed for efficient data storage and access, and equipped with interactive plots for intuitive data visualization to expedite the identification of problematic samples. SeqSQC is available at http://bioconductor.org/packages/SeqSQC. |
format | Online Article Text |
id | pubmed-6620264 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-66202642019-07-22 SeqSQC: A Bioconductor Package for Evaluating the Sample Quality of Next-generation Sequencing Data Liu, Qian Hu, Qiang Yao, Song Kwan, Marilyn L. Roh, Janise M. Zhao, Hua Ambrosone, Christine B. Kushi, Lawrence H. Liu, Song Zhu, Qianqian Genomics Proteomics Bioinformatics Method As next-generation sequencing (NGS) technology has become widely used to identify genetic causal variants for various diseases and traits, a number of packages for checking NGS data quality have sprung up in public domains. In addition to the quality of sequencing data, sample quality issues, such as gender mismatch, abnormal inbreeding coefficient, cryptic relatedness, and population outliers, can also have fundamental impact on downstream analysis. However, there is a lack of tools specialized in identifying problematic samples from NGS data, often due to the limitation of sample size and variant counts. We developed SeqSQC, a Bioconductor package, to automate and accelerate sample cleaning in NGS data of any scale. SeqSQC is designed for efficient data storage and access, and equipped with interactive plots for intuitive data visualization to expedite the identification of problematic samples. SeqSQC is available at http://bioconductor.org/packages/SeqSQC. Elsevier 2019-04 2019-04-05 /pmc/articles/PMC6620264/ /pubmed/30959223 http://dx.doi.org/10.1016/j.gpb.2018.07.006 Text en © 2019 The Authors http://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). |
title | SeqSQC: A Bioconductor Package for Evaluating the Sample Quality of Next-generation Sequencing Data |
topic | Method |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6620264/ https://www.ncbi.nlm.nih.gov/pubmed/30959223 http://dx.doi.org/10.1016/j.gpb.2018.07.006 |