
Revisiting gap locations in amino acid sequence alignments and a proposal for a method to improve them by introducing solvent accessibility

In comparative modeling, the quality of amino acid sequence alignment still constitutes a major bottleneck in the generation of high quality models of protein three-dimensional (3D) structures. Substantial efforts have been made to improve alignment quality by revising the substitution matrix, intro...

Bibliographic Details
Main Authors: Hijikata, Atsushi; Yura, Kei; Noguti, Tosiyuki; Go, Mitiko
Format: Online Article Text
Language: English
Published: Wiley Subscription Services, Inc., A Wiley Company 2011
Subjects: Research Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3110861/
https://www.ncbi.nlm.nih.gov/pubmed/21465562
http://dx.doi.org/10.1002/prot.23011
author Hijikata, Atsushi
Yura, Kei
Noguti, Tosiyuki
Go, Mitiko
collection PubMed
description In comparative modeling, the quality of amino acid sequence alignment still constitutes a major bottleneck in the generation of high quality models of protein three-dimensional (3D) structures. Substantial efforts have been made to improve alignment quality by revising the substitution matrix, introducing multiple sequences, replacing dynamic programming with hidden Markov models, and incorporating 3D structure information. Improvements in the gap penalty have not been a major focus, however, following the development of the affine gap penalty and of the secondary structure dependent gap penalty. We revisited the correlation between protein 3D structure and gap location in a large protein 3D structure data set, and found that the frequency of gap locations approximated to an exponential function of the solvent accessibility of the inserted residues. The nonlinearity of the gap frequency as a function of accessibility corresponded well to the relationship between residue mutation pattern and residue accessibility. By introducing this relationship into the gap penalty calculation for pairwise alignment between template and target amino acid sequences, we were able to obtain a sequence alignment much closer to the structural alignment. The quality of the alignments was substantially improved on a pair of sequences with identity in the “twilight zone” between 20 and 40%. The relocation of gaps by our new method made a significant improvement in comparative modeling, exemplified here by the Bacillus subtilis yitF protein. The method was implemented in a computer program, ALAdeGAP (ALignment with Accessibility dependent GAp Penalty), which is available at http://cib.cf.ocha.ac.jp/target_protein/. Proteins 2011; © 2011 Wiley-Liss, Inc.
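The idea in the description above, a gap penalty that depends on the solvent accessibility of the template residues, can be illustrated with a small sketch. This is not the ALAdeGAP scoring scheme, which the record does not give. It assumes that the penalty is the negative logarithm of a gap frequency growing as exp(k * accessibility), which makes the penalty fall linearly with accessibility. The constants `base`, `k`, and `floor`, the function names, the identity match/mismatch scores, and the use of a simple per-residue (non-affine) gap cost are all hypothetical choices made for brevity.

```python
def gap_penalty(acc, base=8.0, k=4.0, floor=1.0):
    """Hypothetical penalty for a gap at a template residue.

    acc is a relative solvent accessibility in [0, 1]. If gap frequency
    is assumed to grow as exp(k * acc), its negative log falls linearly.
    """
    return max(floor, base - k * acc)


def align(template, target, acc, match=2.0, mismatch=-1.0):
    """Global alignment with a position-specific, per-residue gap cost."""
    n, m = len(template), len(target)
    pen = [gap_penalty(a) for a in acc]  # one penalty per template residue
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    move = [[None] * (m + 1) for _ in range(n + 1)]

    for i in range(1, n + 1):  # leading template residues opposite gaps
        score[i][0] = score[i - 1][0] - pen[i - 1]
        move[i][0] = "up"
    for j in range(1, m + 1):  # leading target residues: reuse pen[0]
        score[0][j] = score[0][j - 1] - pen[0]
        move[0][j] = "left"

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if template[i - 1] == target[j - 1] else mismatch
            diag = score[i - 1][j - 1] + s
            up = score[i - 1][j] - pen[i - 1]    # template residue i vs. gap
            left = score[i][j - 1] - pen[i - 1]  # target residue inserted after i
            best = max(diag, up, left)
            score[i][j] = best
            move[i][j] = "diag" if best == diag else ("up" if best == up else "left")

    # Trace back from the bottom-right corner to recover the alignment.
    a, b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        mv = move[i][j]
        if mv == "diag":
            a.append(template[i - 1]); b.append(target[j - 1]); i -= 1; j -= 1
        elif mv == "up":
            a.append(template[i - 1]); b.append("-"); i -= 1
        else:
            a.append("-"); b.append(target[j - 1]); j -= 1
    return "".join(reversed(a)), "".join(reversed(b)), score[n][m]


if __name__ == "__main__":
    # Three identical residues: sequence alone cannot decide where the gap goes,
    # so the (made-up) accessibility values decide it.
    print(align("GAAAG", "GAAG", [0.0, 0.0, 0.0, 1.0, 0.0]))
    print(align("GAAAG", "GAAG", [0.0, 1.0, 0.0, 0.0, 0.0]))
```

In the toy example the sequences alone leave the gap position ambiguous, and the gap moves to whichever template residue is marked as exposed. A realistic implementation would combine such a term with a substitution matrix and affine gap extension.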
format Online
Article
Text
id pubmed-3110861
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Wiley Subscription Services, Inc., A Wiley Company
record_format MEDLINE/PubMed
spelling pubmed-3110861 2011-06-14 Proteins Research Article Wiley Subscription Services, Inc., A Wiley Company 2011-06 2011-02-10 /pmc/articles/PMC3110861/ /pubmed/21465562 http://dx.doi.org/10.1002/prot.23011 Text en Copyright © 2011 Wiley-Liss, Inc., A Wiley Company http://creativecommons.org/licenses/by/2.5/ Re-use of this article is permitted in accordance with the Creative Commons Deed, Attribution 2.5, which does not permit commercial exploitation.
title Revisiting gap locations in amino acid sequence alignments and a proposal for a method to improve them by introducing solvent accessibility
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3110861/
https://www.ncbi.nlm.nih.gov/pubmed/21465562
http://dx.doi.org/10.1002/prot.23011