FLU, an amino acid substitution model for influenza proteins

BACKGROUND: The amino acid substitution model is the core component of many protein analysis systems such as sequence similarity search, sequence alignment, and phylogenetic inference. Although several general amino acid substitution models have been estimated from large and diverse protein database...

Bibliographic Details
Main authors: Dang, Cuong Cao; Le, Quang Si; Gascuel, Olivier; Le, Vinh Sy
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2873421/
https://www.ncbi.nlm.nih.gov/pubmed/20384985
http://dx.doi.org/10.1186/1471-2148-10-99
author Dang, Cuong Cao
Le, Quang Si
Gascuel, Olivier
Le, Vinh Sy
collection PubMed
description BACKGROUND: The amino acid substitution model is the core component of many protein analysis systems such as sequence similarity search, sequence alignment, and phylogenetic inference. Although several general amino acid substitution models have been estimated from large and diverse protein databases, they remain inappropriate for analyzing specific species, e.g., viruses. Emerging epidemics of influenza viruses raise the need for comprehensive studies of these dangerous viruses. We propose an influenza-specific amino acid substitution model to enhance the understanding of the evolution of influenza viruses. RESULTS: A maximum likelihood approach was applied to estimate an amino acid substitution model (FLU) from ~113,000 influenza protein sequences, consisting of ~20 million residues. FLU outperforms 14 widely used models in constructing maximum likelihood phylogenetic trees for the majority of influenza protein alignments. On average, FLU gains ~42 log likelihood points with an alignment of 300 sites. Moreover, topologies of trees constructed using FLU and other models are frequently different. FLU does indeed have an impact on likelihood improvement as well as tree topologies. It was implemented in PhyML and can be downloaded from ftp://ftp.sanger.ac.uk/pub/1000genomes/lsq/FLU or used through the PhyML 3.0 server at http://www.atgc-montpellier.fr/phyml/. CONCLUSIONS: FLU should be useful for any influenza protein analysis system which requires an accurate description of amino acid substitutions.
format Text
id pubmed-2873421
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2873421 2010-05-20 FLU, an amino acid substitution model for influenza proteins Dang, Cuong Cao; Le, Quang Si; Gascuel, Olivier; Le, Vinh Sy BMC Evol Biol Research article BioMed Central 2010-04-12 /pmc/articles/PMC2873421/ /pubmed/20384985 http://dx.doi.org/10.1186/1471-2148-10-99 Text en Copyright ©2010 Dang et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title FLU, an amino acid substitution model for influenza proteins
topic Research article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2873421/
https://www.ncbi.nlm.nih.gov/pubmed/20384985
http://dx.doi.org/10.1186/1471-2148-10-99