K-Mer Spectrum-Based Error Correction Algorithm for Next-Generation Sequencing Data

In the mid-1970s, the first-generation sequencing technique (Sanger) was created. It used Advanced BioSystems sequencing devices and Beckman's GeXP genetic testing technology. The second-generation sequencing (2GS) technique arrived just several years after the first human genome was published...

Bibliographic Details
Main Authors: AlEisa, Hussah N., Hamad, Safwat, Elhadad, Ahmed
Format: Online Article Text
Language: English
Published: Hindawi 2022
Subjects: Research Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9303089/
https://www.ncbi.nlm.nih.gov/pubmed/35875730
http://dx.doi.org/10.1155/2022/8077664
collection PubMed
description In the mid-1970s, the first-generation sequencing technique (Sanger) was created. It used Advanced BioSystems sequencing devices and Beckman's GeXP genetic testing technology. The second-generation sequencing (2GS) technique arrived just several years after the first human genome was published in 2003. 2GS devices are much faster than Sanger sequencing equipment, with considerably lower manufacturing costs and far higher throughput in the form of short reads. The third-generation sequencing (3GS) method, initially introduced in 2005, offers further reduced manufacturing costs and higher throughput. Although sequencing technology has progressed through several generations, it is still error-prone given the large number of reads it produces. The study of this massive amount of data will aid in decoding the secrets of life, detecting infections, developing improved crops, and improving quality of life, among other things. This is a challenging task, complicated not just by the large number of reads but also by the occurrence of sequencing errors. As a result, error correction is a crucial task in data processing; it entails identifying and correcting errors in reads. As this work demonstrates, the performance of k-spectrum-based error correction algorithms can be influenced by characteristics such as coverage depth, read length, and genome size. Consequently, time and effort must be put into selecting a suitable error correction approach for a given NGS data set.
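The k-spectrum idea named in the description can be illustrated with a minimal sketch. It is not the authors' algorithm: the function names, the `min_count` threshold separating "solid" from "weak" k-mers, and the restriction to single-base substitution errors are all assumptions made for illustration. The sketch counts every k-mer across the reads, then repeatedly applies the one substitution that most reduces the number of weak k-mers in a read.

```python
from collections import Counter


def kmer_spectrum(reads, k):
    """Count every k-mer occurring in the reads (the k-spectrum)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def weak_kmers(read, counts, k, min_count):
    """Number of k-mers in the read seen fewer than min_count times."""
    return sum(
        1
        for i in range(len(read) - k + 1)
        if counts[read[i:i + k]] < min_count
    )


def correct_read(read, counts, k, min_count, max_rounds=3):
    """Apply, per round, the single substitution that removes the most
    weak k-mers. Hypothetical helper: substitutions only, no indels."""
    current = read
    current_weak = weak_kmers(current, counts, k, min_count)
    for _ in range(max_rounds):
        if current_weak == 0:
            break
        best, best_weak = current, current_weak
        for pos in range(len(current)):
            for base in "ACGT":
                if base == current[pos]:
                    continue
                candidate = current[:pos] + base + current[pos + 1:]
                w = weak_kmers(candidate, counts, k, min_count)
                if w < best_weak:
                    best, best_weak = candidate, w
        if best_weak == current_weak:
            break  # no substitution helps; leave the read unchanged
        current, current_weak = best, best_weak
    return current


# Toy data (made up for illustration): four error-free reads and one
# read with a single substitution at index 9 (G -> T).
reference = "ACGTTGCATGCCGATAGGCT"
noisy = "ACGTTGCATTCCGATAGGCT"
reads = [reference] * 4 + [noisy]
spectrum = kmer_spectrum(reads, k=5)
print(correct_read(noisy, spectrum, k=5, min_count=2))
```

In this toy setting the five k-mers that cover the substituted base each occur once, so they fall below `min_count` and the read is restored. On real data the choice of `k` and `min_count` would depend on the coverage depth, read length, and genome size that the description names as influential.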
format Online
Article
Text
id pubmed-9303089
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-9303089 2022-07-22 Comput Intell Neurosci Research Article Hindawi 2022-07-14 /pmc/articles/PMC9303089/ /pubmed/35875730 http://dx.doi.org/10.1155/2022/8077664 Text en Copyright © 2022 Hussah N. AlEisa et al.
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title K-Mer Spectrum-Based Error Correction Algorithm for Next-Generation Sequencing Data
topic Research Article