
Adaptive informatics for multi-factorial and high content biological data

Whereas genomic data are universally machine-readable, data arising from imaging, multiplex biochemistry, flow cytometry and other cell- and tissue-based assays usually reside in loosely organized files of poorly documented provenance. This arises because the relational databases used in genomic res...


Bibliographic Details
Main Authors: Millard, Bjorn L; Niepel, Mario; Menden, Michael P; Muhlich, Jeremy L; Sorger, Peter K
Format: Text
Language: English
Published: 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3105758/
https://www.ncbi.nlm.nih.gov/pubmed/21516115
http://dx.doi.org/10.1038/nmeth.1600
author Millard, Bjorn L
Niepel, Mario
Menden, Michael P
Muhlich, Jeremy L
Sorger, Peter K
collection PubMed
description Whereas genomic data are universally machine-readable, data arising from imaging, multiplex biochemistry, flow cytometry and other cell- and tissue-based assays usually reside in loosely organized files of poorly documented provenance. This arises because the relational databases used in genomic research are difficult to adapt to rapidly evolving experimental designs, data formats and analytic algorithms. Here we describe an adaptive approach to managing experimental data based on semantically-typed data hypercubes (SDCubes) that combine Hierarchical Data Format 5 (HDF5) and Extensible Markup Language (XML) file types. We demonstrate the application of SDCube-based storage using ImageRail, a software package for high-throughput microscopy. Experimental design and its day-to-day evolution, not rigid standards, determine how ImageRail data are organized in SDCubes. We apply ImageRail to the collection and analysis of drug dose-response landscapes in human cell lines at the single-cell level.
format Text
id pubmed-3105758
institution National Center for Biotechnology Information
language English
publishDate 2011
record_format MEDLINE/PubMed
spelling pubmed-3105758 2011-12-01 Adaptive informatics for multi-factorial and high content biological data Millard, Bjorn L Niepel, Mario Menden, Michael P Muhlich, Jeremy L Sorger, Peter K Nat Methods Article Whereas genomic data are universally machine-readable, data arising from imaging, multiplex biochemistry, flow cytometry and other cell- and tissue-based assays usually reside in loosely organized files of poorly documented provenance. This arises because the relational databases used in genomic research are difficult to adapt to rapidly evolving experimental designs, data formats and analytic algorithms. Here we describe an adaptive approach to managing experimental data based on semantically-typed data hypercubes (SDCubes) that combine Hierarchical Data Format 5 (HDF5) and Extensible Markup Language (XML) file types. We demonstrate the application of SDCube-based storage using ImageRail, a software package for high-throughput microscopy. Experimental design and its day-to-day evolution, not rigid standards, determine how ImageRail data are organized in SDCubes. We apply ImageRail to the collection and analysis of drug dose-response landscapes in human cell lines at the single-cell level. 2011-04-24 2011-06 /pmc/articles/PMC3105758/ /pubmed/21516115 http://dx.doi.org/10.1038/nmeth.1600 Text en Users may view, print, copy, download and text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms
title Adaptive informatics for multi-factorial and high content biological data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3105758/
https://www.ncbi.nlm.nih.gov/pubmed/21516115
http://dx.doi.org/10.1038/nmeth.1600