Amino acid substitution matrices from an information theoretic perspective

Bibliographic Details
Main author:	Altschul, Stephen F.
Format:	Online Article Text
Language:	English
Published:	Elsevier Ltd. 1991
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7130686/ https://www.ncbi.nlm.nih.gov/pubmed/2051488 http://dx.doi.org/10.1016/0022-2836(91)90193-A

Description
Summary:	Protein sequence alignments have become an important tool for molecular biologists. Local alignments are frequently constructed with the aid of a “substitution score matrix” that specifies a score for aligning each pair of amino acid residues. Over the years, many different substitution matrices have been proposed, based on a wide variety of rationales. Statistical results, however, demonstrate that any such matrix is implicitly a “log-odds” matrix, with a specific target distribution for aligned pairs of amino acid residues. In the light of information theory, it is possible to express the scores of a substitution matrix in bits and to see that different matrices are better adapted to different purposes. The most widely used matrix for protein sequence comparison has been the PAM-250 matrix. It is argued that for database searches the PAM-120 matrix generally is more appropriate, while for comparing two specific proteins with suspected homology the PAM-200 matrix is indicated. Examples discussed include the lipocalins, human α1B-glycoprotein, the cystic fibrosis transmembrane conductance regulator and the globins.
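
The "log-odds" idea in the summary can be illustrated with a small sketch. It assumes the standard form of a log-odds score in bits, log2(q_ij / (p_i * p_j)), where q_ij is the target frequency of an aligned pair and p_i is a background frequency. The two-letter alphabet, the frequencies, and the function names below are hypothetical and chosen only for illustration; they are not taken from the article or from any PAM matrix.

```python
import math

def log_odds_bits(q, p):
    """Return log-odds scores in bits for each aligned pair.

    q: dict mapping (a, b) -> target frequency of the aligned pair (sums to 1)
    p: dict mapping a -> background frequency of residue a (sums to 1)
    """
    return {(a, b): math.log2(q_ab / (p[a] * p[b])) for (a, b), q_ab in q.items()}

def relative_entropy_bits(q, scores):
    """Average score per aligned pair under the target distribution, in bits."""
    return sum(q[pair] * scores[pair] for pair in q)

# Hypothetical toy alphabet with made-up frequencies (not real amino acid data).
p = {"A": 0.5, "B": 0.5}
q = {("A", "A"): 0.4, ("A", "B"): 0.1, ("B", "A"): 0.1, ("B", "B"): 0.4}

scores = log_odds_bits(q, p)
H = relative_entropy_bits(q, scores)

for pair, s in sorted(scores.items()):
    print(pair, round(s, 3))
print("relative entropy (bits per aligned pair):", round(H, 3))
```

In this toy case, pairs that occur more often in the target distribution than chance would predict get positive scores and the others get negative scores. The relative entropy gives the average information per aligned position, which is one way to compare how well different matrices suit different purposes.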
