Amino acid substitution matrices from an information theoretic perspective
Protein sequence alignments have become an important tool for molecular biologists. Local alignments are frequently constructed with the aid of a “substitution score matrix” that specifies a score for aligning each pair of amino acid residues. Over the years, many different substitution matrices hav...
Main author: | Altschul, Stephen F. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Published by Elsevier Ltd. 1991 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7130686/ https://www.ncbi.nlm.nih.gov/pubmed/2051488 http://dx.doi.org/10.1016/0022-2836(91)90193-A |
_version_ | 1783517066909712384 |
---|---|
author | Altschul, Stephen F. |
author_facet | Altschul, Stephen F. |
author_sort | Altschul, Stephen F. |
collection | PubMed |
description | Protein sequence alignments have become an important tool for molecular biologists. Local alignments are frequently constructed with the aid of a “substitution score matrix” that specifies a score for aligning each pair of amino acid residues. Over the years, many different substitution matrices have been proposed, based on a wide variety of rationales. Statistical results, however, demonstrate that any such matrix is implicitly a “log-odds” matrix, with a specific target distribution for aligned pairs of amino acid residues. In the light of information theory, it is possible to express the scores of a substitution matrix in bits and to see that different matrices are better adapted to different purposes. The most widely used matrix for protein sequence comparison has been the PAM-250 matrix. It is argued that for database searches the PAM-120 matrix generally is more appropriate, while for comparing two specific proteins with suspected homology the PAM-200 matrix is indicated. Examples discussed include the lipocalins, human α(1)B-glycoprotein, the cystic fibrosis transmembrane conductance regulator and the globins. |
format | Online Article Text |
id | pubmed-7130686 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 1991 |
publisher | Published by Elsevier Ltd. |
record_format | MEDLINE/PubMed |
spelling | pubmed-7130686 2020-04-08 Amino acid substitution matrices from an information theoretic perspective Altschul, Stephen F. J Mol Biol Article Published by Elsevier Ltd. 1991-06-05 2005-03-09 /pmc/articles/PMC7130686/ /pubmed/2051488 http://dx.doi.org/10.1016/0022-2836(91)90193-A Text en Copyright © 1991 Published by Elsevier Ltd. |
spellingShingle | Article Altschul, Stephen F. Amino acid substitution matrices from an information theoretic perspective |
title | Amino acid substitution matrices from an information theoretic perspective |
title_sort | amino acid substitution matrices from an information theoretic perspective |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7130686/ https://www.ncbi.nlm.nih.gov/pubmed/2051488 http://dx.doi.org/10.1016/0022-2836(91)90193-A |
work_keys_str_mv | AT altschulstephenf aminoacidsubstitutionmatricesfromaninformationtheoreticperspective |
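
The description above states that any substitution matrix is implicitly a "log-odds" matrix with a specific target distribution, and that its scores can be expressed in bits. A minimal sketch of that idea follows, assuming a hypothetical two-letter alphabet with made-up background frequencies `p` and target pair frequencies `q`; the function names are illustrative and do not come from the paper. A score in bits is taken as log2(q_ab / (p_a * p_b)), and the average score per aligned pair under the target distribution is the relative entropy of the target and background distributions.

```python
import math

def log_odds_bits(q, p):
    """Return scores s[a][b] = log2(q[a][b] / (p[a] * p[b])), in bits."""
    return {a: {b: math.log2(q[a][b] / (p[a] * p[b])) for b in p} for a in p}

def relative_entropy_bits(q, s):
    """Average score per aligned pair under the target distribution q."""
    return sum(q[a][b] * s[a][b] for a in q for b in q[a])

def implied_targets(s, p):
    """Invert the log-odds form: q[a][b] = p[a] * p[b] * 2**s[a][b]."""
    return {a: {b: p[a] * p[b] * 2 ** s[a][b] for b in p} for a in p}

# Hypothetical toy data (assumption, not taken from the paper):
# background frequencies p and target frequencies q for aligned pairs.
p = {"A": 0.5, "B": 0.5}
q = {"A": {"A": 0.4, "B": 0.1},
     "B": {"A": 0.1, "B": 0.4}}

scores = log_odds_bits(q, p)
H = relative_entropy_bits(q, scores)
# Expected score for a random (unrelated) pair drawn from the background.
random_expectation = sum(p[a] * p[b] * scores[a][b] for a in p for b in p)

print(scores)
print(H, random_expectation)
```

In this toy case identities score positive and mismatches negative, the average score is positive for pairs drawn from the target distribution and negative for pairs drawn from the background, and `implied_targets` recovers `q` from the scores. Real PAM matrices use a 20-letter alphabet and are usually rounded and rescaled, so recovering their target frequencies would also require solving for the scale factor, which this sketch fixes at bits by assumption.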