
Amino acid substitution matrices from an information theoretic perspective

Protein sequence alignments have become an important tool for molecular biologists. Local alignments are frequently constructed with the aid of a “substitution score matrix” that specifies a score for aligning each pair of amino acid residues. Over the years, many different substitution matrices hav...


Bibliographic Details
Main author: Altschul, Stephen F.
Format: Online Article Text
Language: English
Published: Published by Elsevier Ltd. 1991
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7130686/
https://www.ncbi.nlm.nih.gov/pubmed/2051488
http://dx.doi.org/10.1016/0022-2836(91)90193-A
_version_ 1783517066909712384
author Altschul, Stephen F.
collection PubMed
description Protein sequence alignments have become an important tool for molecular biologists. Local alignments are frequently constructed with the aid of a “substitution score matrix” that specifies a score for aligning each pair of amino acid residues. Over the years, many different substitution matrices have been proposed, based on a wide variety of rationales. Statistical results, however, demonstrate that any such matrix is implicitly a “log-odds” matrix, with a specific target distribution for aligned pairs of amino acid residues. In the light of information theory, it is possible to express the scores of a substitution matrix in bits and to see that different matrices are better adapted to different purposes. The most widely used matrix for protein sequence comparison has been the PAM-250 matrix. It is argued that for database searches the PAM-120 matrix generally is more appropriate, while for comparing two specific proteins with suspected homology the PAM-200 matrix is indicated. Examples discussed include the lipocalins, human α(1)B-glycoprotein, the cystic fibrosis transmembrane conductance regulator and the globins.
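The log-odds reading of a substitution matrix mentioned in the description can be illustrated with a minimal sketch. It assumes the standard form s(a,b) = log2(q(a,b) / (p(a) p(b))), where q is the target distribution for aligned pairs and p the background residue frequencies. The two-letter alphabet, the numbers, and the function names below are hypothetical and chosen only for illustration; they are not taken from the article or from any PAM matrix.

```python
import math

def log_odds_bits(q, p):
    """Return s[a][b] = log2(q[a][b] / (p[a] * p[b])) for every residue pair."""
    return {a: {b: math.log2(q[a][b] / (p[a] * p[b])) for b in q[a]} for a in q}

def relative_entropy_bits(q, s):
    """Average score per aligned pair under the target distribution q, in bits."""
    return sum(q[a][b] * s[a][b] for a in q for b in q[a])

# Toy two-letter alphabet (illustrative numbers only, not real amino acid data).
p = {"A": 0.5, "B": 0.5}                      # background frequencies, sum to 1
q = {"A": {"A": 0.4, "B": 0.1},               # target pair frequencies, sum to 1
     "B": {"A": 0.1, "B": 0.4}}

s = log_odds_bits(q, p)
h = relative_entropy_bits(q, s)

for a in s:
    for b in s[a]:
        print(f"s({a},{b}) = {s[a][b]:+.3f} bits")
print(f"relative entropy = {h:.3f} bits per aligned pair")
```

In this toy case, pairs that are more frequent in the target distribution than expected by chance receive positive scores and the others negative scores. The relative entropy is the average information per aligned position available to distinguish a true alignment from a chance one.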
format Online
Article
Text
id pubmed-7130686
institution National Center for Biotechnology Information
language English
publishDate 1991
publisher Published by Elsevier Ltd.
record_format MEDLINE/PubMed
spelling pubmed-7130686 2020-04-08 Amino acid substitution matrices from an information theoretic perspective Altschul, Stephen F. J Mol Biol Article Published by Elsevier Ltd. 1991-06-05 2005-03-09 /pmc/articles/PMC7130686/ /pubmed/2051488 http://dx.doi.org/10.1016/0022-2836(91)90193-A Text en Copyright © 1991 Published by Elsevier Ltd. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website.
Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
title Amino acid substitution matrices from an information theoretic perspective
topic Article