
Discovering collectively informative descriptors from high-throughput experiments

BACKGROUND: Improvements in high-throughput technology and its increasing use have led to the generation of many highly complex datasets that often address similar biological questions. Combining information from these studies can increase the reliability and generalizability of results and also yie...


Bibliographic Details
Main Authors:	Jeffries, Clark D; Ward, William O; Perkins, Diana O; Wright, Fred A
Format:	Text
Language:	English
Published:	BioMed Central 2009
Subjects:	Methodology article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2813853/ https://www.ncbi.nlm.nih.gov/pubmed/20021653 http://dx.doi.org/10.1186/1471-2105-10-431

author	Jeffries, Clark D; Ward, William O; Perkins, Diana O; Wright, Fred A
collection	PubMed
description	BACKGROUND: Improvements in high-throughput technology and its increasing use have led to the generation of many highly complex datasets that often address similar biological questions. Combining information from these studies can increase the reliability and generalizability of results and also yield new insights that guide future research. RESULTS: This paper describes a novel algorithm called BLANKET for symmetric analysis of two experiments that assess informativeness of descriptors. The experiments are required to be related only in that their descriptor sets intersect substantially and their definitions of case and control are consistent. From resulting lists of n descriptors ranked by informativeness, BLANKET determines shortlists of descriptors from each experiment, generally of different lengths p and q. For any pair of shortlists, four numbers are evident: the numbers of descriptors appearing in both shortlists, in only the first shortlist, in only the second shortlist, or in neither shortlist. From the associated contingency table, BLANKET computes Right Fisher Exact Test (RFET) values used as scores over a plane of possible pairs of shortlist lengths [1,2]. BLANKET then chooses a pair or pairs with RFET score less than a threshold; the threshold depends upon n and shortlist length limits and represents a quality of intersection achieved by less than 5% of random lists. CONCLUSIONS: Researchers seek within a universe of descriptors some minimal subset that collectively and efficiently predicts experimental outcomes. Ideally, any smaller subset should be insufficient for reliable prediction and any larger subset should have little additional accuracy. As a method, BLANKET is easy to conceptualize and presents only moderate computational complexity. Many existing databases could be mined using BLANKET to suggest optimal sets of predictive descriptors.
format	Text
id	pubmed-2813853
institution	National Center for Biotechnology Information
language	English
publishDate	2009
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-2813853 2010-01-30 BMC Bioinformatics Methodology article BioMed Central 2009-12-18 /pmc/articles/PMC2813853/ /pubmed/20021653 http://dx.doi.org/10.1186/1471-2105-10-431 Text en Copyright ©2009 Jeffries et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Discovering collectively informative descriptors from high-throughput experiments
topic	Methodology article
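
The scoring step described in the abstract can be illustrated with a minimal sketch. It is not the authors' BLANKET implementation: it assumes both experiments rank the same n descriptors, it takes the RFET score to be the right-tailed hypergeometric probability of an overlap at least as large as the one observed, and it omits the calibration of the threshold against random lists. The function names `contingency_table`, `rfet`, and `score_plane` are hypothetical.

```python
from math import comb


def contingency_table(rank_a, rank_b, p, q):
    """Counts for the top-p shortlist of rank_a and the top-q shortlist of rank_b.

    Assumes rank_a and rank_b are rankings of the same n descriptors.
    Returns (both, only_a, only_b, neither).
    """
    top_a = set(rank_a[:p])
    top_b = set(rank_b[:q])
    both = len(top_a & top_b)
    only_a = p - both
    only_b = q - both
    neither = len(rank_a) - both - only_a - only_b
    return both, only_a, only_b, neither


def rfet(both, only_a, only_b, neither):
    """Right-tailed Fisher exact p-value: P(overlap >= both) under random lists."""
    n = both + only_a + only_b + neither
    p = both + only_a
    q = both + only_b
    # Hypergeometric tail; math.comb returns 0 when the lower argument is too large.
    tail = sum(comb(p, i) * comb(n - p, q - i) for i in range(both, min(p, q) + 1))
    return tail / comb(n, q)


def score_plane(rank_a, rank_b, max_len):
    """RFET score for every pair of shortlist lengths (p, q) up to max_len."""
    return {
        (p, q): rfet(*contingency_table(rank_a, rank_b, p, q))
        for p in range(1, max_len + 1)
        for q in range(1, max_len + 1)
    }


if __name__ == "__main__":
    # Toy rankings of ten descriptors (illustrative data, not from the paper).
    rank_a = list("abcdefghij")
    rank_b = list("bacjihgfed")
    scores = score_plane(rank_a, rank_b, max_len=3)
    best = min(scores, key=scores.get)
    print(best, scores[best])
```

In this sketch the pair with the smallest score is simply reported; the abstract instead describes selecting pairs whose score falls below a threshold that depends on n and on the shortlist length limits.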
