
FASTQuick: rapid and comprehensive quality assessment of raw sequence reads

BACKGROUND: Rapid and thorough quality assessment of sequenced genomes on an ultra-high-throughput scale is crucial for successful large-scale genomic studies. Comprehensive quality assessment typically requires full genome alignment, which costs a substantial amount of computational resources and t...

Bibliographic Details
Main Authors: Zhang, Fan; Kang, Hyun Min
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7844880/
https://www.ncbi.nlm.nih.gov/pubmed/33511994
http://dx.doi.org/10.1093/gigascience/giab004
author Zhang, Fan
Kang, Hyun Min
collection PubMed
description BACKGROUND: Rapid and thorough quality assessment of sequenced genomes on an ultra-high-throughput scale is crucial for successful large-scale genomic studies. Comprehensive quality assessment typically requires full genome alignment, which costs a substantial amount of computational resources and turnaround time. Existing tools are either computationally expensive owing to full alignment or lacking essential quality metrics by skipping read alignment. FINDINGS: We developed a set of rapid and accurate methods to produce comprehensive quality metrics directly from a subset of raw sequence reads (from whole-genome or whole-exome sequencing) without full alignment. Our methods offer orders of magnitude faster turnaround time than existing full alignment–based methods while providing comprehensive and sophisticated quality metrics, including estimates of genetic ancestry and cross-sample contamination. CONCLUSIONS: By rapidly and comprehensively performing the quality assessment, our tool will help investigators detect potential issues in ultra-high-throughput sequence reads in real time and at a low computational cost at the early stages of the analyses, ensuring high-quality downstream results and preventing unexpected loss in time, money, and invaluable specimens.
format Online
Article
Text
id pubmed-7844880
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7844880 2021-02-03 FASTQuick: rapid and comprehensive quality assessment of raw sequence reads Zhang, Fan Kang, Hyun Min Gigascience Technical Note Oxford University Press 2021-01-29 /pmc/articles/PMC7844880/ /pubmed/33511994 http://dx.doi.org/10.1093/gigascience/giab004 Text en © The Author(s) 2021. Published by Oxford University Press GigaScience. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title FASTQuick: rapid and comprehensive quality assessment of raw sequence reads
topic Technical Note