
ppiTrim: constructing non-redundant and up-to-date interactomes

Robust advances in interactome analysis demand comprehensive, non-redundant and consistently annotated data sets. By non-redundant, we mean that the accounting of evidence for every interaction should be faithful: each independent experimental support is counted exactly once, no more, no less. While...


Bibliographic Details
Main authors: Stojmirović, Aleksandar, Yu, Yi-Kuo
Format: Online Article Text
Language: English
Published: Oxford University Press 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3162744/
https://www.ncbi.nlm.nih.gov/pubmed/21873645
http://dx.doi.org/10.1093/database/bar036
_version_ 1782210860545998848
author Stojmirović, Aleksandar
Yu, Yi-Kuo
author_facet Stojmirović, Aleksandar
Yu, Yi-Kuo
author_sort Stojmirović, Aleksandar
collection PubMed
description Robust advances in interactome analysis demand comprehensive, non-redundant and consistently annotated data sets. By non-redundant, we mean that the accounting of evidence for every interaction should be faithful: each independent experimental support is counted exactly once, no more, no less. While many interactions are shared among public repositories, none of them contains the complete known interactome for any model organism. In addition, the annotations of the same experimental result by different repositories often disagree. This raises the question of which annotation to keep when consolidating identical pieces of evidence. The iRefIndex database, which includes interactions from most popular repositories with a standardized protein nomenclature, represents a significant advance in all aspects, especially in comprehensiveness. However, iRefIndex aims to maintain all information and annotation from the original sources, and requires users to perform additional processing to fully achieve the aforementioned goals. Another issue concerns protein complexes. Some databases represent experimentally observed complexes as interactions with more than two participants, while others expand them into binary interactions using the spoke or matrix model. To avoid accumulating untested interaction information, it is preferable to replace the expanded protein complexes, whether from spoke or matrix models, with a flat list of complex members. To address these issues and to achieve our goals, we have developed ppiTrim, a script that processes iRefIndex to produce non-redundant, consistently annotated data sets of physical interactions. Our script proceeds in three stages: mapping all interactants to gene identifiers and removing all undesired raw interactions, deflating potentially expanded complexes, and reconciling, for each interaction, the annotation labels from different source databases. As an illustration, we have processed the three largest organismal data sets: yeast, human and fruit fly. While ppiTrim can resolve most apparent conflicts between different labelings, we also discovered some unresolvable disagreements, mostly resulting from different annotation policies among repositories. Database URL: http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads/ppiTrim.html
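The spoke and matrix models mentioned in the description, and the idea of deflating expanded complexes back into a flat member list, can be illustrated with a minimal Python sketch. This is not ppiTrim's actual code or algorithm. The record keys (`source`, `pmid`, `group`, `a`, `b`), the function names and the rule of grouping by a shared key are all assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def spoke_expand(bait, preys):
    """Spoke model: pair the bait with each prey (n - 1 pairs for n members)."""
    return [(bait, prey) for prey in preys]


def matrix_expand(members):
    """Matrix model: pair every member with every other (n * (n - 1) / 2 pairs)."""
    return list(combinations(sorted(members), 2))


def deflate(records):
    """Collapse binary records that share a (source, pmid, group) key
    into one sorted, flat list of members per key.

    The grouping key is a hypothetical stand-in for whatever signals
    that several binary records came from one observed complex.
    """
    groups = defaultdict(set)
    for rec in records:
        key = (rec["source"], rec["pmid"], rec["group"])
        groups[key].update((rec["a"], rec["b"]))
    return {key: sorted(members) for key, members in groups.items()}


# Hypothetical example: one complex of four genes reported by one source.
pairs = spoke_expand("g1", ["g2", "g3", "g4"])
records = [
    {"source": "dbX", "pmid": "PMID_PLACEHOLDER", "group": "c1", "a": a, "b": b}
    for a, b in pairs
]
flat = deflate(records)
```

Under these assumptions, a four-member complex yields three pairs under the spoke model and six under the matrix model, while the deflated form keeps a single entry listing the four members, so no untested pairwise interaction is implied.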
format Online
Article
Text
id pubmed-3162744
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3162744 2011-08-29 ppiTrim: constructing non-redundant and up-to-date interactomes Stojmirović, Aleksandar Yu, Yi-Kuo Database (Oxford) Database Tool Oxford University Press 2011-08-27 /pmc/articles/PMC3162744/ /pubmed/21873645 http://dx.doi.org/10.1093/database/bar036 Text en Published by Oxford University Press 2011. http://creativecommons.org/licenses/by-nc/2.5 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Database Tool
Stojmirović, Aleksandar
Yu, Yi-Kuo
ppiTrim: constructing non-redundant and up-to-date interactomes
title ppiTrim: constructing non-redundant and up-to-date interactomes
title_full ppiTrim: constructing non-redundant and up-to-date interactomes
title_fullStr ppiTrim: constructing non-redundant and up-to-date interactomes
title_full_unstemmed ppiTrim: constructing non-redundant and up-to-date interactomes
title_short ppiTrim: constructing non-redundant and up-to-date interactomes
title_sort ppitrim: constructing non-redundant and up-to-date interactomes
topic Database Tool
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3162744/
https://www.ncbi.nlm.nih.gov/pubmed/21873645
http://dx.doi.org/10.1093/database/bar036
work_keys_str_mv AT stojmirovicaleksandar ppitrimconstructingnonredundantanduptodateinteractomes
AT yuyikuo ppitrimconstructingnonredundantanduptodateinteractomes