SeqMule: automated pipeline for analysis of human exome/genome sequencing data

Next-generation sequencing (NGS) technology has greatly helped us identify disease-contributory variants for Mendelian diseases. However, users are often faced with issues such as software compatibility, complicated configuration, and no access to high-performance computing facility. Discrepancies e...

Bibliographic Details
Main authors: Guo, Yunfei; Ding, Xiaolei; Shen, Yufeng; Lyon, Gholson J.; Wang, Kai
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4585643/
https://www.ncbi.nlm.nih.gov/pubmed/26381817
http://dx.doi.org/10.1038/srep14283
author Guo, Yunfei
Ding, Xiaolei
Shen, Yufeng
Lyon, Gholson J.
Wang, Kai
collection PubMed
description Next-generation sequencing (NGS) technology has greatly helped us identify disease-contributory variants for Mendelian diseases. However, users often face issues such as software incompatibility, complicated configuration, and lack of access to high-performance computing facilities. Discrepancies also exist among aligners and variant callers. We developed a computational pipeline, SeqMule, to perform automated variant calling from NGS data on human genomes and exomes. SeqMule integrates a parallelization capability that does not require a computational cluster, built on top of the variant callers, and facilitates normalization and intersection of variant calls to generate a high-confidence consensus set. SeqMule integrates 5 alignment tools and 5 variant calling algorithms and accepts various combinations of them in a one-line command, allowing highly flexible yet fully automated variant calling. On a modern machine (2 Intel Xeon X5650 CPUs, 48 GB memory), when fast turnaround is needed, SeqMule generates annotated VCF files in a day from a 30X whole-genome sequencing data set; when more accurate calling is needed, SeqMule generates a consensus call set that improves over single callers, as measured by both Mendelian error rate and consistency. SeqMule supports Sun Grid Engine for parallel processing, offers a turn-key solution for deployment on Amazon Web Services, and provides quality checks, Mendelian error checks, consistency evaluation, and HTML-based reports. SeqMule is available at http://seqmule.openbioinformatics.org.
format Online
Article
Text
id pubmed-4585643
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-4585643 2015-09-29 Sci Rep Article
Nature Publishing Group 2015-09-18 Text en Copyright © 2015, Macmillan Publishers Limited. This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
title SeqMule: automated pipeline for analysis of human exome/genome sequencing data
topic Article
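The description above mentions normalization and intersection of variant calls from several callers to obtain a consensus set. The following is a minimal sketch of that general idea, not SeqMule's own implementation: the names `normalize` and `consensus`, the tuple representation of a variant, and the `min_callers` threshold are all assumptions made for illustration. The normalization here only trims shared bases and harmonizes chromosome names; full normalization would also left-align indels against the reference sequence.

```python
from collections import Counter
from typing import Iterable

# Hypothetical representation: (chromosome, 1-based position, ref allele, alt allele).
Variant = tuple[str, int, str, str]


def normalize(chrom: str, pos: int, ref: str, alt: str) -> Variant:
    """Bring a call into a comparable form (simplified; no left-alignment)."""
    chrom = chrom.removeprefix("chr")  # treat "chr1" and "1" as the same contig
    ref, alt = ref.upper(), alt.upper()
    # Trim shared trailing bases, keeping at least one base per allele.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Trim shared leading bases, advancing the position accordingly.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return (chrom, pos, ref, alt)


def consensus(call_sets: Iterable[Iterable[Variant]], min_callers: int) -> set[Variant]:
    """Return variants reported by at least `min_callers` of the call sets."""
    counts: Counter = Counter()
    for calls in call_sets:
        # Deduplicate within one caller so each caller votes once per variant.
        for variant in {normalize(*call) for call in calls}:
            counts[variant] += 1
    return {variant for variant, n in counts.items() if n >= min_callers}


if __name__ == "__main__":
    # Made-up calls from two hypothetical callers.
    caller_a = [("chr1", 100, "A", "G"), ("1", 200, "CTT", "CT")]
    caller_b = [("1", 100, "a", "g"), ("1", 200, "CT", "C"), ("2", 50, "G", "T")]
    for variant in sorted(consensus([caller_a, caller_b], min_callers=2)):
        print(variant)
```

Setting `min_callers` to the number of call sets gives a strict intersection; a lower value gives a majority-style vote, trading some confidence for sensitivity.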