Evaluating Structural Variation Detection Tools for Long-Read Sequencing Datasets in Saccharomyces cerevisiae

Bibliographic Details
Main Authors: Luan, Mei-Wei, Zhang, Xiao-Ming, Zhu, Zi-Bin, Chen, Ying, Xie, Shang-Qian
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7075250/
https://www.ncbi.nlm.nih.gov/pubmed/32211024
http://dx.doi.org/10.3389/fgene.2020.00159
collection PubMed
description Structural variation (SV) is a major form of genetic variation that contributes to polymorphic variation, human diseases, and phenotypes in many organisms. Long-read sequencing has been successfully used to identify novel and complex SVs. However, a comparison of SV detection tools for long-read sequencing datasets has not been reported. Therefore, we developed an analysis workflow that combined two alignment tools (NGMLR and minimap2) and five callers (Sniffles, Picky, smartie-sv, PBHoney, and NanoSV) to evaluate SV detection in six datasets of Saccharomyces cerevisiae. The accuracy of SV regions was validated by re-aligning raw reads across diverse alignment tools, SV callers, experimental conditions, and sequencing platforms. The results showed that SV detection did not differ significantly between NGMLR and minimap2 when the same caller was used. PBHoney had the highest average accuracy (89.04%) and Picky had the lowest (35.85%). The accuracies of NanoSV, Sniffles, and smartie-sv were 68.67%, 60.47%, and 57.67%, respectively. In addition, smartie-sv detected the most SVs and NanoSV the fewest, and significantly more SVs were detected from the PacBio sequencing platform than from ONT (p = 0.000173).
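As a rough illustration of the comparison summarized in the description, the sketch below enumerates aligner and caller pairings and ranks the callers by the average accuracy reported there. The tool names and percentages come from the description; the helper names `pipeline_grid` and `rank_callers` are hypothetical, and treating every aligner and caller pairing as a valid run is an assumption, since the description does not state which pairings were actually executed.

```python
from itertools import product

# Tool names as listed in the description.
ALIGNERS = ["NGMLR", "minimap2"]
CALLERS = ["Sniffles", "Picky", "smartie-sv", "PBHoney", "NanoSV"]

# Average accuracy (%) per caller, as reported in the description.
REPORTED_ACCURACY = {
    "PBHoney": 89.04,
    "NanoSV": 68.67,
    "Sniffles": 60.47,
    "smartie-sv": 57.67,
    "Picky": 35.85,
}


def pipeline_grid(aligners, callers):
    """Return every (aligner, caller) pairing.

    Assumption: the full cross product is shown for illustration only;
    the source does not say that every pairing was run.
    """
    return list(product(aligners, callers))


def rank_callers(accuracy):
    """Return caller names ordered from highest to lowest accuracy."""
    return sorted(accuracy, key=accuracy.get, reverse=True)


grid = pipeline_grid(ALIGNERS, CALLERS)
ranking = rank_callers(REPORTED_ACCURACY)

for caller in ranking:
    print(f"{caller}: {REPORTED_ACCURACY[caller]:.2f}%")
```

The ranking uses only the per-caller averages quoted in the description; per-dataset or per-aligner accuracies are not available in this record and are not modeled here.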
id pubmed-7075250
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-7075250 2020-03-24 Front Genet Genetics Frontiers Media S.A. 2020-03-09 Text en Copyright © 2020 Luan, Zhang, Zhu, Chen and Xie http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY).
The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.