
Selection of Optimal Bioinformatic Tools and Proper Reference for Reducing the Alignment Error in Targeted Sequencing Data

BACKGROUND: Careful design in the primary steps of a next-generation sequencing study is critical for obtaining successful results in downstream analysis. METHODS: In this study, a framework is proposed to evaluate and improve the sequence mapping in targeted regions of the reference genome. In this...


Bibliographic Details
Main Authors: Nodehi, Hannane Mohammadi; Tabatabaiefar, Mohammad Amin; Sehhati, Mohammadreza
Format: Online Article Text
Language: English
Published: Wolters Kluwer - Medknow 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8043119/
https://www.ncbi.nlm.nih.gov/pubmed/34026589
http://dx.doi.org/10.4103/jmss.JMSS_7_20
author Nodehi, Hannane Mohammadi
Tabatabaiefar, Mohammad Amin
Sehhati, Mohammadreza
author_facet Nodehi, Hannane Mohammadi
Tabatabaiefar, Mohammad Amin
Sehhati, Mohammadreza
author_sort Nodehi, Hannane Mohammadi
collection PubMed
description BACKGROUND: Careful design in the primary steps of a next-generation sequencing study is critical for obtaining successful results in downstream analysis. METHODS: In this study, a framework is proposed to evaluate and improve sequence mapping in targeted regions of the reference genome. Simulated short reads were produced from the coding regions of the human genome and mapped to a Customized Target-Based Reference (CTBR) using recently introduced alignment tools. The short reads produced by different sequencing technologies were aligned to the standard genome and to the CTBR, with and without well-defined mutation types, and the numbers of unmapped and misaligned reads and the runtime were measured for comparison. RESULTS: Mapping accuracy was significantly better for reads generated from the Illumina HiSeq 2500 and aligned with Stampy against the CTBR than for the other evaluated pipelines. Using the CTBR for alignment significantly decreased the mapping error compared with more expanded or more limited references. When intentional mutations were introduced into the reads, Stampy showed the minimum error of 1.67% using the CTBR, whereas the lowest errors obtained, also by Stampy, using the whole genome and a single chromosome as references were 3.78% and 20%, respectively. The maximum and minimum misalignment errors were observed on chromosomes Y and 20, respectively. CONCLUSION: Using the proposed framework in a clinical targeted sequencing study may therefore help predict the error and improve the performance of variant calling for the genomic regions targeted in that study.
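The comparison described in the abstract rests on counting unmapped and misaligned reads among simulated reads whose true origin is known. The following is a minimal sketch of that bookkeeping, not the authors' implementation. The `SimulatedRead` record, the `mapping_error` function, the 5-base `tolerance`, and the definition of error as (unmapped + misaligned) / total are all assumptions made for illustration; the abstract does not state the exact formula.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SimulatedRead:
    """Hypothetical record: a simulated read, its known origin, and the aligner's answer."""
    name: str
    true_chrom: str
    true_pos: int                 # 1-based start the read was simulated from
    mapped_chrom: Optional[str]   # None when the aligner left the read unmapped
    mapped_pos: Optional[int]


def mapping_error(reads, tolerance=5):
    """Return (unmapped, misaligned, error_percent) for a list of SimulatedRead.

    Assumption of this sketch: a read is misaligned when it maps to a
    different chromosome or more than `tolerance` bases from its origin.
    """
    unmapped = 0
    misaligned = 0
    for read in reads:
        if read.mapped_chrom is None or read.mapped_pos is None:
            unmapped += 1
        elif (read.mapped_chrom != read.true_chrom
              or abs(read.mapped_pos - read.true_pos) > tolerance):
            misaligned += 1
    total = len(reads)
    # Assumed error definition: unmapped plus misaligned reads over all reads.
    error_percent = 100.0 * (unmapped + misaligned) / total if total else 0.0
    return unmapped, misaligned, error_percent


# Toy input (invented values, not data from the study).
reads = [
    SimulatedRead("r1", "chr20", 1000, "chr20", 1002),  # within tolerance
    SimulatedRead("r2", "chr20", 5000, "chrY", 730),    # wrong chromosome
    SimulatedRead("r3", "chrY", 250, None, None),       # unmapped
    SimulatedRead("r4", "chrY", 900, "chrY", 900),      # exact match
]
print(mapping_error(reads))  # (1, 1, 50.0)
```

Running the same function once per aligner and per reference (whole genome, single chromosome, CTBR) would give comparable error percentages of the kind the abstract reports, provided the reference coordinates are translated back to a common coordinate system first.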
format Online
Article
Text
id pubmed-8043119
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Wolters Kluwer - Medknow
record_format MEDLINE/PubMed
spelling pubmed-8043119 2021-05-21 Selection of Optimal Bioinformatic Tools and Proper Reference for Reducing the Alignment Error in Targeted Sequencing Data Nodehi, Hannane Mohammadi Tabatabaiefar, Mohammad Amin Sehhati, Mohammadreza J Med Signals Sens Original Article
Wolters Kluwer - Medknow 2021-01-30 /pmc/articles/PMC8043119/ /pubmed/34026589 http://dx.doi.org/10.4103/jmss.JMSS_7_20 Text en Copyright: © 2021 Journal of Medical Signals & Sensors https://creativecommons.org/licenses/by-nc-sa/4.0/ This is an open access journal, and articles are distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License, which allows others to remix, tweak, and build upon the work non-commercially, as long as appropriate credit is given and the new creations are licensed under the identical terms.
spellingShingle Original Article
Nodehi, Hannane Mohammadi
Tabatabaiefar, Mohammad Amin
Sehhati, Mohammadreza
Selection of Optimal Bioinformatic Tools and Proper Reference for Reducing the Alignment Error in Targeted Sequencing Data
title Selection of Optimal Bioinformatic Tools and Proper Reference for Reducing the Alignment Error in Targeted Sequencing Data
title_full Selection of Optimal Bioinformatic Tools and Proper Reference for Reducing the Alignment Error in Targeted Sequencing Data
title_fullStr Selection of Optimal Bioinformatic Tools and Proper Reference for Reducing the Alignment Error in Targeted Sequencing Data
title_full_unstemmed Selection of Optimal Bioinformatic Tools and Proper Reference for Reducing the Alignment Error in Targeted Sequencing Data
title_short Selection of Optimal Bioinformatic Tools and Proper Reference for Reducing the Alignment Error in Targeted Sequencing Data
title_sort selection of optimal bioinformatic tools and proper reference for reducing the alignment error in targeted sequencing data
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8043119/
https://www.ncbi.nlm.nih.gov/pubmed/34026589
http://dx.doi.org/10.4103/jmss.JMSS_7_20