Post-Alignment Adjustment and Its Automation

Bibliographic Details
Main author: Xia, Xuhua
Format: Online Article Text
Language: English
Published: MDPI 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8623120/
https://www.ncbi.nlm.nih.gov/pubmed/34828415
http://dx.doi.org/10.3390/genes12111809
collection PubMed
description Multiple sequence alignment (MSA) is the basis for almost all sequence comparison and molecular phylogenetic inferences. Large-scale genomic analyses are typically associated with automated progressive MSA without subsequent manual adjustment, which itself is often error-prone because of the lack of a consistent and explicit criterion. Here, I outlined several commonly encountered alignment errors that cannot be avoided by progressive MSA for nucleotide, amino acid, and codon sequences. Methods that could be automated to fix such alignment errors were then presented. I emphasized the utility of position weight matrix as a new tool for MSA refinement and illustrated its usage by refining the MSA of nucleotide and amino acid sequences. The main advantages of the position weight matrix approach include (1) its use of information from all sequences, in contrast to other commonly used methods based on pairwise alignment scores and inconsistency measures, and (2) its speedy computation, making it suitable for a large number of long viral genomic sequences.
id pubmed-8623120
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-8623120 2021-11-27 Post-Alignment Adjustment and Its Automation. Xia, Xuhua. Genes (Basel). Article. MDPI 2021-11-18. /pmc/articles/PMC8623120/ /pubmed/34828415 http://dx.doi.org/10.3390/genes12111809 Text en. © 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).