Powerful Sequence Similarity Search Methods and In-Depth Manual Analyses Can Identify Remote Homologs in Many Apparently “Orphan” Viral Proteins

The genome sequences of new viruses often contain many “orphan” or “taxon-specific” proteins apparently lacking homologs. However, because viral proteins evolve very fast, commonly used sequence similarity detection methods such as BLAST may overlook homologs. We analyzed a data set of proteins from...

Bibliographic Details
Main Authors: Kuchibhatla, Durga B., Sherman, Westley A., Chung, Betty Y. W., Cook, Shelley, Schneider, Georg, Eisenhaber, Birgit, Karlin, David G.
Format: Online Article Text
Language: English
Published: American Society for Microbiology 2014
Subjects: Genetic Diversity and Evolution
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3911697/
https://www.ncbi.nlm.nih.gov/pubmed/24155369
http://dx.doi.org/10.1128/JVI.02595-13
_version_ 1782302000613949440
author Kuchibhatla, Durga B.
Sherman, Westley A.
Chung, Betty Y. W.
Cook, Shelley
Schneider, Georg
Eisenhaber, Birgit
Karlin, David G.
collection PubMed
description The genome sequences of new viruses often contain many “orphan” or “taxon-specific” proteins apparently lacking homologs. However, because viral proteins evolve very fast, commonly used sequence similarity detection methods such as BLAST may overlook homologs. We analyzed a data set of proteins from RNA viruses characterized as “genus specific” by BLAST. More powerful methods developed recently, such as HHblits or HHpred (available through web-based, user-friendly interfaces), could detect distant homologs of a quarter of these proteins, suggesting that these methods should be used to annotate viral genomes. In-depth manual analyses of a subset of the remaining sequences, guided by contextual information such as taxonomy, gene order, or domain cooccurrence, identified distant homologs of another third. Thus, a combination of powerful automated methods and manual analyses can uncover distant homologs of many proteins thought to be orphans. We expect these methodological results to be also applicable to cellular organisms, since they generally evolve much more slowly than RNA viruses. As an application, we reanalyzed the genome of a bee pathogen, Chronic bee paralysis virus (CBPV). We could identify homologs of most of its proteins thought to be orphans; in each case, identifying homologs provided functional clues. We discovered that CBPV encodes a domain homologous to the Alphavirus methyltransferase-guanylyltransferase; a putative membrane protein, SP24, with homologs in unrelated insect viruses and insect-transmitted plant viruses having different morphologies (cileviruses, higreviruses, blunerviruses, negeviruses); and a putative virion glycoprotein, ORF2, also found in negeviruses. SP24 and ORF2 are probably major structural components of the virions.
format Online
Article
Text
id pubmed-3911697
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher American Society for Microbiology
record_format MEDLINE/PubMed
spelling pubmed-3911697 2014-02-05 Powerful Sequence Similarity Search Methods and In-Depth Manual Analyses Can Identify Remote Homologs in Many Apparently “Orphan” Viral Proteins Kuchibhatla, Durga B. Sherman, Westley A. Chung, Betty Y. W. Cook, Shelley Schneider, Georg Eisenhaber, Birgit Karlin, David G. J Virol Genetic Diversity and Evolution The genome sequences of new viruses often contain many “orphan” or “taxon-specific” proteins apparently lacking homologs. However, because viral proteins evolve very fast, commonly used sequence similarity detection methods such as BLAST may overlook homologs. We analyzed a data set of proteins from RNA viruses characterized as “genus specific” by BLAST. More powerful methods developed recently, such as HHblits or HHpred (available through web-based, user-friendly interfaces), could detect distant homologs of a quarter of these proteins, suggesting that these methods should be used to annotate viral genomes. In-depth manual analyses of a subset of the remaining sequences, guided by contextual information such as taxonomy, gene order, or domain cooccurrence, identified distant homologs of another third. Thus, a combination of powerful automated methods and manual analyses can uncover distant homologs of many proteins thought to be orphans. We expect these methodological results to be also applicable to cellular organisms, since they generally evolve much more slowly than RNA viruses. As an application, we reanalyzed the genome of a bee pathogen, Chronic bee paralysis virus (CBPV). We could identify homologs of most of its proteins thought to be orphans; in each case, identifying homologs provided functional clues. We discovered that CBPV encodes a domain homologous to the Alphavirus methyltransferase-guanylyltransferase; a putative membrane protein, SP24, with homologs in unrelated insect viruses and insect-transmitted plant viruses having different morphologies (cileviruses, higreviruses, blunerviruses, negeviruses); and a putative virion glycoprotein, ORF2, also found in negeviruses. SP24 and ORF2 are probably major structural components of the virions. American Society for Microbiology 2014-01 /pmc/articles/PMC3911697/ /pubmed/24155369 http://dx.doi.org/10.1128/JVI.02595-13 Text en Copyright © 2014 Kuchibhatla et al. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported license (http://creativecommons.org/licenses/by/3.0/).
title Powerful Sequence Similarity Search Methods and In-Depth Manual Analyses Can Identify Remote Homologs in Many Apparently “Orphan” Viral Proteins
topic Genetic Diversity and Evolution
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3911697/
https://www.ncbi.nlm.nih.gov/pubmed/24155369
http://dx.doi.org/10.1128/JVI.02595-13