
Comparison of Read Mapping and Variant Calling Tools for the Analysis of Plant NGS Data

High-throughput sequencing technologies have rapidly developed during the past years and have become an essential tool in plant sciences. However, the analysis of genomic data remains challenging and relies mostly on the performance of automatic pipelines. Frequently applied pipelines involve the al...


Bibliographic Details
Main Authors: Schilbert, Hanna Marie; Rempel, Andreas; Pucker, Boas
Format: Online Article Text
Language: English
Published: MDPI 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7238416/
https://www.ncbi.nlm.nih.gov/pubmed/32252268
http://dx.doi.org/10.3390/plants9040439
_version_ 1783536533619343360
author Schilbert, Hanna Marie
Rempel, Andreas
Pucker, Boas
author_facet Schilbert, Hanna Marie
Rempel, Andreas
Pucker, Boas
author_sort Schilbert, Hanna Marie
collection PubMed
description High-throughput sequencing technologies have rapidly developed during the past years and have become an essential tool in plant sciences. However, the analysis of genomic data remains challenging and relies mostly on the performance of automatic pipelines. Frequently applied pipelines involve the alignment of sequence reads against a reference sequence and the identification of sequence variants. Since most benchmarking studies of bioinformatics tools for this purpose have been conducted on human datasets, there is a lack of benchmarking studies in plant sciences. In this study, we evaluated the performance of 50 different variant calling pipelines, including five read mappers and ten variant callers, on six real plant datasets of the model organism Arabidopsis thaliana. Sets of variants were evaluated based on various parameters including sensitivity and specificity. We found that all investigated tools are suitable for analysis of NGS data in plant research. When looking at different performance metrics, BWA-MEM and Novoalign were the best mappers and GATK returned the best results in the variant calling step.
format Online
Article
Text
id pubmed-7238416
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-72384162020-06-02 Comparison of Read Mapping and Variant Calling Tools for the Analysis of Plant NGS Data Schilbert, Hanna Marie Rempel, Andreas Pucker, Boas Plants (Basel) Article High-throughput sequencing technologies have rapidly developed during the past years and have become an essential tool in plant sciences. However, the analysis of genomic data remains challenging and relies mostly on the performance of automatic pipelines. Frequently applied pipelines involve the alignment of sequence reads against a reference sequence and the identification of sequence variants. Since most benchmarking studies of bioinformatics tools for this purpose have been conducted on human datasets, there is a lack of benchmarking studies in plant sciences. In this study, we evaluated the performance of 50 different variant calling pipelines, including five read mappers and ten variant callers, on six real plant datasets of the model organism Arabidopsis thaliana. Sets of variants were evaluated based on various parameters including sensitivity and specificity. We found that all investigated tools are suitable for analysis of NGS data in plant research. When looking at different performance metrics, BWA-MEM and Novoalign were the best mappers and GATK returned the best results in the variant calling step. MDPI 2020-04-02 /pmc/articles/PMC7238416/ /pubmed/32252268 http://dx.doi.org/10.3390/plants9040439 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Comparison of Read Mapping and Variant Calling Tools for the Analysis of Plant NGS Data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7238416/
https://www.ncbi.nlm.nih.gov/pubmed/32252268
http://dx.doi.org/10.3390/plants9040439