
Functional inference by ProtoNet family tree: the uncharacterized proteome of Daphnia pulex

BACKGROUND: Daphnia pulex (water flea) is the first crustacean whose genome has been fully sequenced. Crustaceans and insects diverged from a common ancestor. Daphnia is a model organism for studying the molecular makeup that underlies coping with environmental challenges. The complete proteome comprises 30,550 p...


Bibliographic Details
Main authors: Rappoport, Nadav, Linial, Michal
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3584848/
https://www.ncbi.nlm.nih.gov/pubmed/23514195
http://dx.doi.org/10.1186/1471-2105-14-S3-S11
_version_ 1782261070887387136
author Rappoport, Nadav
Linial, Michal
author_facet Rappoport, Nadav
Linial, Michal
author_sort Rappoport, Nadav
collection PubMed
description BACKGROUND: Daphnia pulex (water flea) is the first crustacean whose genome has been fully sequenced. Crustaceans and insects diverged from a common ancestor. Daphnia is a model organism for studying the molecular makeup that underlies coping with environmental challenges. The complete proteome comprises 30,550 putative proteins, of which about 10,000 have no known homologues. Currently, UniProtKB reports 95% of the Daphnia proteins as putative and uncharacterized. RESULTS: We applied ProtoNet, an unsupervised hierarchical protein clustering method that covers about 10 million sequences, to the automatic annotation of the Daphnia proteome. 98.7% (26,625) of the Daphnia full-length proteins were successfully mapped to 13,880 ProtoNet stable clusters, and only 1.3% remained unmapped. We compared the properties of the Daphnia protein families with those of the mouse and fruit fly proteomes. Functional annotations were successfully assigned to 86% of the proteins. Most proteins (61%) were mapped to only 2,953 clusters that contain duplicated Daphnia genes. We focused on the functionality of maximally amplified paralogs. Cuticle structure components and a variety of ion channel protein families were associated with a maximal level of gene amplification. We focused on gene amplification as a leading strategy of Daphnia in coping with environmental toxicity. CONCLUSIONS: Automatic inference is achieved by mapping sequences to the protein family tree of ProtoNet 6.0. Applying a careful inference protocol resulted in functional assignments for over 86% of the complete proteome. We conclude that the ProtoNet scaffold can be used as an alignment-free protocol for large-scale annotation of uncharacterized proteomes.
format Online
Article
Text
id pubmed-3584848
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3584848 2013-03-11 Functional inference by ProtoNet family tree: the uncharacterized proteome of Daphnia pulex Rappoport, Nadav Linial, Michal BMC Bioinformatics Proceedings BACKGROUND: Daphnia pulex (water flea) is the first crustacean whose genome has been fully sequenced. Crustaceans and insects diverged from a common ancestor. Daphnia is a model organism for studying the molecular makeup that underlies coping with environmental challenges. The complete proteome comprises 30,550 putative proteins, of which about 10,000 have no known homologues. Currently, UniProtKB reports 95% of the Daphnia proteins as putative and uncharacterized. RESULTS: We applied ProtoNet, an unsupervised hierarchical protein clustering method that covers about 10 million sequences, to the automatic annotation of the Daphnia proteome. 98.7% (26,625) of the Daphnia full-length proteins were successfully mapped to 13,880 ProtoNet stable clusters, and only 1.3% remained unmapped. We compared the properties of the Daphnia protein families with those of the mouse and fruit fly proteomes. Functional annotations were successfully assigned to 86% of the proteins. Most proteins (61%) were mapped to only 2,953 clusters that contain duplicated Daphnia genes. We focused on the functionality of maximally amplified paralogs. Cuticle structure components and a variety of ion channel protein families were associated with a maximal level of gene amplification. We focused on gene amplification as a leading strategy of Daphnia in coping with environmental toxicity. CONCLUSIONS: Automatic inference is achieved by mapping sequences to the protein family tree of ProtoNet 6.0. Applying a careful inference protocol resulted in functional assignments for over 86% of the complete proteome. We conclude that the ProtoNet scaffold can be used as an alignment-free protocol for large-scale annotation of uncharacterized proteomes.
BioMed Central 2013-02-28 /pmc/articles/PMC3584848/ /pubmed/23514195 http://dx.doi.org/10.1186/1471-2105-14-S3-S11 Text en Copyright ©2013 Rappoport and Linial; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Proceedings
Rappoport, Nadav
Linial, Michal
Functional inference by ProtoNet family tree: the uncharacterized proteome of Daphnia pulex
title Functional inference by ProtoNet family tree: the uncharacterized proteome of Daphnia pulex
title_full Functional inference by ProtoNet family tree: the uncharacterized proteome of Daphnia pulex
title_fullStr Functional inference by ProtoNet family tree: the uncharacterized proteome of Daphnia pulex
title_full_unstemmed Functional inference by ProtoNet family tree: the uncharacterized proteome of Daphnia pulex
title_short Functional inference by ProtoNet family tree: the uncharacterized proteome of Daphnia pulex
title_sort functional inference by protonet family tree: the uncharacterized proteome of daphnia pulex
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3584848/
https://www.ncbi.nlm.nih.gov/pubmed/23514195
http://dx.doi.org/10.1186/1471-2105-14-S3-S11
work_keys_str_mv AT rappoportnadav functionalinferencebyprotonetfamilytreetheuncharacterizedproteomeofdaphniapulex
AT linialmichal functionalinferencebyprotonetfamilytreetheuncharacterizedproteomeofdaphniapulex