
D-graph clusters flaviviruses and β-coronaviruses according to their hosts, disease type and human cell receptors

MOTIVATION: There is a need for rapid, easy-to-use, alignment-free methods to cluster large groups of protein sequence data. Commonly used alignment-based phylogenetic trees can visualize only a limited number of protein sequences. DGraph, introduced here, is a dynamic programming...

Bibliographic Details
Main authors: Braun, Benjamin A., Schein, Catherine H., Braun, Werner
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7430575/
https://www.ncbi.nlm.nih.gov/pubmed/32817945
http://dx.doi.org/10.1101/2020.08.13.249649
author Braun, Benjamin A.
Schein, Catherine H.
Braun, Werner
collection PubMed
description MOTIVATION: There is a need for rapid, easy-to-use, alignment-free methods to cluster large groups of protein sequence data. Commonly used alignment-based phylogenetic trees can visualize only a limited number of protein sequences. DGraph, introduced here, is a dynamic programming application developed to generate 2D maps based on similarity scores for sequences. The program automatically calculates and graphically displays property distance (PD) scores based on physico-chemical property (PCP) similarities from an unaligned list of FASTA files. Such “PD-graphs” show the interrelatedness of the sequences, whereby clusters can reveal deeper connectivities. RESULTS: PD-graphs generated for flavivirus (FV), enterovirus (EV), and coronavirus (CoV) sequences from complete polyproteins or individual proteins are consistent with biological data on vector types, hosts, cellular receptors and disease phenotypes. PD-graphs separate the tick-borne from the mosquito-borne FV and cluster viruses that infect bats, camels, seabirds and humans separately; the clusters correlate with disease phenotype. The PD method segregates the β-CoV spike proteins of SARS, SARS-CoV-2, and MERS sequences from those of other human-pathogenic CoV, with clustering consistent with cellular receptor usage. The graphs also suggest evolutionary relationships that may be difficult to determine with conventional bootstrapping methods that require postulating an ancestral sequence.
format Online
Article
Text
id pubmed-7430575
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-7430575 2020-08-18 D-graph clusters flaviviruses and β-coronaviruses according to their hosts, disease type and human cell receptors Braun, Benjamin A. Schein, Catherine H. Braun, Werner bioRxiv Article MOTIVATION: There is a need for rapid, easy-to-use, alignment-free methods to cluster large groups of protein sequence data. Commonly used alignment-based phylogenetic trees can visualize only a limited number of protein sequences. DGraph, introduced here, is a dynamic programming application developed to generate 2D maps based on similarity scores for sequences. The program automatically calculates and graphically displays property distance (PD) scores based on physico-chemical property (PCP) similarities from an unaligned list of FASTA files. Such “PD-graphs” show the interrelatedness of the sequences, whereby clusters can reveal deeper connectivities. RESULTS: PD-graphs generated for flavivirus (FV), enterovirus (EV), and coronavirus (CoV) sequences from complete polyproteins or individual proteins are consistent with biological data on vector types, hosts, cellular receptors and disease phenotypes. PD-graphs separate the tick-borne from the mosquito-borne FV and cluster viruses that infect bats, camels, seabirds and humans separately; the clusters correlate with disease phenotype. The PD method segregates the β-CoV spike proteins of SARS, SARS-CoV-2, and MERS sequences from those of other human-pathogenic CoV, with clustering consistent with cellular receptor usage. The graphs also suggest evolutionary relationships that may be difficult to determine with conventional bootstrapping methods that require postulating an ancestral sequence. Cold Spring Harbor Laboratory 2020-08-14 /pmc/articles/PMC7430575/ /pubmed/32817945 http://dx.doi.org/10.1101/2020.08.13.249649 Text en http://creativecommons.org/licenses/by-nc-nd/4.0/ It is made available under a CC-BY-NC-ND 4.0 International license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title D-graph clusters flaviviruses and β-coronaviruses according to their hosts, disease type and human cell receptors
topic Article