Evaluating alignment and variant-calling software for mutation identification in C. elegans by whole-genome sequencing

Whole-genome sequencing is a powerful tool for analyzing genetic variation on a global scale. One particularly useful application is the identification of mutations obtained by classical phenotypic screens in model species. Sequence data from the mutant strain is aligned to the reference genome, and...

Bibliographic Details
Main authors: Smith, Harold E., Yun, Sijung
Format: Online Article Text
Language: English
Published: Public Library of Science 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5363872/
https://www.ncbi.nlm.nih.gov/pubmed/28333980
http://dx.doi.org/10.1371/journal.pone.0174446
_version_ 1782517225948708864
author Smith, Harold E.
Yun, Sijung
author_facet Smith, Harold E.
Yun, Sijung
author_sort Smith, Harold E.
collection PubMed
description Whole-genome sequencing is a powerful tool for analyzing genetic variation on a global scale. One particularly useful application is the identification of mutations obtained by classical phenotypic screens in model species. Sequence data from the mutant strain is aligned to the reference genome, and then variants are called to generate a list of candidate alleles. A number of software pipelines for mutation identification have been targeted to C. elegans, with particular emphasis on ease of use, incorporation of mapping strain data, subtraction of background variants, and similar criteria. Although success is predicated upon the sensitive and accurate detection of candidate alleles, relatively little effort has been invested in evaluating the underlying software components that are required for mutation identification. Therefore, we have benchmarked a number of commonly used tools for sequence alignment and variant calling, in all pair-wise combinations, against both simulated and actual datasets. We compared the accuracy of those pipelines for mutation identification in C. elegans, and found that the combination of BBMap for alignment plus FreeBayes for variant calling offers the most robust performance.
format Online
Article
Text
id pubmed-5363872
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-53638722017-04-06 Evaluating alignment and variant-calling software for mutation identification in C. elegans by whole-genome sequencing Smith, Harold E. Yun, Sijung PLoS One Research Article Whole-genome sequencing is a powerful tool for analyzing genetic variation on a global scale. One particularly useful application is the identification of mutations obtained by classical phenotypic screens in model species. Sequence data from the mutant strain is aligned to the reference genome, and then variants are called to generate a list of candidate alleles. A number of software pipelines for mutation identification have been targeted to C. elegans, with particular emphasis on ease of use, incorporation of mapping strain data, subtraction of background variants, and similar criteria. Although success is predicated upon the sensitive and accurate detection of candidate alleles, relatively little effort has been invested in evaluating the underlying software components that are required for mutation identification. Therefore, we have benchmarked a number of commonly used tools for sequence alignment and variant calling, in all pair-wise combinations, against both simulated and actual datasets. We compared the accuracy of those pipelines for mutation identification in C. elegans, and found that the combination of BBMap for alignment plus FreeBayes for variant calling offers the most robust performance. Public Library of Science 2017-03-23 /pmc/articles/PMC5363872/ /pubmed/28333980 http://dx.doi.org/10.1371/journal.pone.0174446 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 (https://creativecommons.org/publicdomain/zero/1.0/) public domain dedication.
spellingShingle Research Article
Smith, Harold E.
Yun, Sijung
Evaluating alignment and variant-calling software for mutation identification in C. elegans by whole-genome sequencing
title Evaluating alignment and variant-calling software for mutation identification in C. elegans by whole-genome sequencing
title_full Evaluating alignment and variant-calling software for mutation identification in C. elegans by whole-genome sequencing
title_fullStr Evaluating alignment and variant-calling software for mutation identification in C. elegans by whole-genome sequencing
title_full_unstemmed Evaluating alignment and variant-calling software for mutation identification in C. elegans by whole-genome sequencing
title_short Evaluating alignment and variant-calling software for mutation identification in C. elegans by whole-genome sequencing
title_sort evaluating alignment and variant-calling software for mutation identification in c. elegans by whole-genome sequencing
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5363872/
https://www.ncbi.nlm.nih.gov/pubmed/28333980
http://dx.doi.org/10.1371/journal.pone.0174446
work_keys_str_mv AT smithharolde evaluatingalignmentandvariantcallingsoftwareformutationidentificationincelegansbywholegenomesequencing
AT yunsijung evaluatingalignmentandvariantcallingsoftwareformutationidentificationincelegansbywholegenomesequencing