New evaluation methods of read mapping by 17 aligners on simulated and empirical NGS data: an updated comparison of DNA- and RNA-Seq data from Illumina and Ion Torrent technologies
During the last (15) years, improved omics sequencing technologies have expanded the scale and resolution of various biological applications, generating high-throughput datasets that require carefully chosen software tools to be processed. Therefore, following the sequencing development, bioinformat...
Main authors: | Donato, Luigi; Scimone, Concetta; Rinaldi, Carmela; D’Angelo, Rosalia; Sidoti, Antonina |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer London 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8208613/ https://www.ncbi.nlm.nih.gov/pubmed/34155424 http://dx.doi.org/10.1007/s00521-021-06188-z |
_version_ | 1783708959640649728 |
---|---|
author | Donato, Luigi Scimone, Concetta Rinaldi, Carmela D’Angelo, Rosalia Sidoti, Antonina |
author_sort | Donato, Luigi |
collection | PubMed |
description | During the last (15) years, improved omics sequencing technologies have expanded the scale and resolution of various biological applications, generating high-throughput datasets that require carefully chosen software tools to be processed. Therefore, following the sequencing development, bioinformatics researchers have been challenged to implement alignment algorithms for next-generation sequencing reads. However, nowadays selection of aligners based on genome characteristics is poorly studied, so our benchmarking study extended the “state of art” comparing 17 different aligners. The chosen tools were assessed on empirical human DNA- and RNA-Seq data, as well as on simulated datasets in human and mouse, evaluating a set of parameters previously not considered in such kind of benchmarks. As expected, we found that each tool was the best in specific conditions. For Ion Torrent single-end RNA-Seq samples, the most suitable aligners were CLC and BWA-MEM, which reached the best results in terms of efficiency, accuracy, duplication rate, saturation profile and running time. About Illumina paired-end osteomyelitis transcriptomics data, instead, the best performer algorithm, together with the already cited CLC, resulted Novoalign, which excelled in accuracy and saturation analyses. Segemehl and DNASTAR performed the best on both DNA-Seq data, with Segemehl particularly suitable for exome data. In conclusion, our study could guide users in the selection of a suitable aligner based on genome and transcriptome characteristics. However, several other aspects, emerged from our work, should be considered in the evolution of alignment research area, such as the involvement of artificial intelligence to support cloud computing and mapping to multiple genomes. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s00521-021-06188-z. |
format | Online Article Text |
id | pubmed-8208613 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Springer London |
record_format | MEDLINE/PubMed |
spelling | pubmed-8208613 2021-06-17 New evaluation methods of read mapping by 17 aligners on simulated and empirical NGS data: an updated comparison of DNA- and RNA-Seq data from Illumina and Ion Torrent technologies Donato, Luigi Scimone, Concetta Rinaldi, Carmela D’Angelo, Rosalia Sidoti, Antonina Neural Comput Appl Original Article During the last (15) years, improved omics sequencing technologies have expanded the scale and resolution of various biological applications, generating high-throughput datasets that require carefully chosen software tools to be processed. Therefore, following the sequencing development, bioinformatics researchers have been challenged to implement alignment algorithms for next-generation sequencing reads. However, nowadays selection of aligners based on genome characteristics is poorly studied, so our benchmarking study extended the “state of art” comparing 17 different aligners. The chosen tools were assessed on empirical human DNA- and RNA-Seq data, as well as on simulated datasets in human and mouse, evaluating a set of parameters previously not considered in such kind of benchmarks. As expected, we found that each tool was the best in specific conditions. For Ion Torrent single-end RNA-Seq samples, the most suitable aligners were CLC and BWA-MEM, which reached the best results in terms of efficiency, accuracy, duplication rate, saturation profile and running time. About Illumina paired-end osteomyelitis transcriptomics data, instead, the best performer algorithm, together with the already cited CLC, resulted Novoalign, which excelled in accuracy and saturation analyses. Segemehl and DNASTAR performed the best on both DNA-Seq data, with Segemehl particularly suitable for exome data. In conclusion, our study could guide users in the selection of a suitable aligner based on genome and transcriptome characteristics. However, several other aspects, emerged from our work, should be considered in the evolution of alignment research area, such as the involvement of artificial intelligence to support cloud computing and mapping to multiple genomes. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s00521-021-06188-z. Springer London 2021-06-16 2021 /pmc/articles/PMC8208613/ /pubmed/34155424 http://dx.doi.org/10.1007/s00521-021-06188-z Text en © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2021 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
title | New evaluation methods of read mapping by 17 aligners on simulated and empirical NGS data: an updated comparison of DNA- and RNA-Seq data from Illumina and Ion Torrent technologies |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8208613/ https://www.ncbi.nlm.nih.gov/pubmed/34155424 http://dx.doi.org/10.1007/s00521-021-06188-z |