
The multiple alignments of very short sequences

Multiple sequence alignment (MSA) is an increasingly important task in bioinformatics, as we must deal with constantly growing gene and protein sequence databases. MSA is applied in phylogenetic analysis, in discovering conserved protein domains, in the assignment of secondary and t...

Bibliographic Details
Main Authors: Takács, Kristóf; Grolmusz, Vince
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2021
Subjects: Methods
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8255854/
https://www.ncbi.nlm.nih.gov/pubmed/34258521
http://dx.doi.org/10.1096/fba.2020-00118
_version_ 1783717996053659648
author Takács, Kristóf
Grolmusz, Vince
collection PubMed
description Multiple sequence alignment (MSA) is an increasingly important task in bioinformatics, as we must deal with constantly growing gene and protein sequence databases. MSA is applied in phylogenetic analysis, in discovering conserved protein domains, in assigning secondary and tertiary structural features in proteins, and in metagenomic sample analysis and gene discovery. The focus is usually on the MSA of long sequences, since these tasks arise most frequently in practice. However, the rigorous analysis of the optimal MSA of short sequences is a neglected area, and findings there may contribute to better and faster algorithms for the multiple alignment of long sequences. In the present contribution, we examine length‐1 sequences under an arbitrary metric and length‐2 sequences under the unit metric, and we show that the optimum of the MSA problem can be achieved by the trivial alignment in both cases.
format Online
Article
Text
id pubmed-8255854
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-8255854 2021-07-12 The multiple alignments of very short sequences Takács, Kristóf Grolmusz, Vince FASEB Bioadv Methods John Wiley and Sons Inc. 2021-04-29 /pmc/articles/PMC8255854/ /pubmed/34258521 http://dx.doi.org/10.1096/fba.2020-00118 Text en © 2021 The Authors. FASEB BioAdvances published by the Federation of American Societies for Experimental Biology. This is an open access article under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title The multiple alignments of very short sequences
topic Methods
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8255854/
https://www.ncbi.nlm.nih.gov/pubmed/34258521
http://dx.doi.org/10.1096/fba.2020-00118