
Database integration of 4923 publicly-available samples of breast cancer molecular and clinical data


Bibliographic Details
Main Authors: Planey, Catherine R., Butte, Atul J.
Format: Online Article Text
Language: English
Published: American Medical Informatics Association 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3814460/
https://www.ncbi.nlm.nih.gov/pubmed/24303324
Description
Summary: We outline a paradigm for meta-microarray database creation and integration with clinical variables. We use as our implementation example a breast cancer database linking RNA expression measurements (by microarray) and clinical variables, such as survival metrics and tumor size. Such an endeavor involves integrating across different microarray datasets as well as clinical parameters. To this end, we created a data curation and processing pipeline, formal database ontology, and SQL schema to optimally query, analyze and visualize data from over 30 publicly available breast cancer microarray studies listed in the Gene Expression Omnibus (GEO). We demonstrate several pilot examples using this database. This methodology serves as a model for future meta-analyses of complex public clinical datasets, in particular those in the field of cancer.
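The summary describes an SQL schema that links expression measurements to clinical variables across studies. The following is a minimal sketch of that general idea, not the authors' published schema: every table name, column name, accession placeholder, and numeric value here is a hypothetical assumption chosen for illustration, and SQLite stands in for whatever database engine the authors actually used.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not the schema published by the authors.
SCHEMA = """
CREATE TABLE study (
    gse_id TEXT PRIMARY KEY              -- placeholder for a GEO series accession
);
CREATE TABLE sample (
    gsm_id TEXT PRIMARY KEY,             -- placeholder for a GEO sample accession
    gse_id TEXT NOT NULL REFERENCES study(gse_id),
    tumor_size_cm REAL,                  -- clinical variable, may be missing
    survival_months REAL                 -- clinical variable, may be missing
);
CREATE TABLE expression (
    gsm_id TEXT NOT NULL REFERENCES sample(gsm_id),
    gene_symbol TEXT NOT NULL,
    value REAL NOT NULL,
    PRIMARY KEY (gsm_id, gene_symbol)
);
"""


def build_demo_db():
    """Create an in-memory database filled with made-up toy rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO study VALUES (?)", [("GSE_A",), ("GSE_B",)])
    conn.executemany(
        "INSERT INTO sample VALUES (?, ?, ?, ?)",
        [
            ("GSM_1", "GSE_A", 2.1, 60.0),
            ("GSM_2", "GSE_A", 3.4, 24.0),
            ("GSM_3", "GSE_B", None, 48.0),  # tumor size not reported
        ],
    )
    conn.executemany(
        "INSERT INTO expression VALUES (?, ?, ?)",
        [
            ("GSM_1", "ESR1", 9.2),
            ("GSM_2", "ESR1", 4.1),
            ("GSM_3", "ESR1", 8.7),
        ],
    )
    return conn


def expression_with_clinical(conn, gene):
    """Return one gene's expression joined to clinical fields, across studies."""
    query = """
        SELECT s.gse_id, s.gsm_id, e.value, s.tumor_size_cm, s.survival_months
        FROM expression AS e
        JOIN sample AS s ON s.gsm_id = e.gsm_id
        WHERE e.gene_symbol = ?
        ORDER BY s.gsm_id
    """
    return conn.execute(query, (gene,)).fetchall()


if __name__ == "__main__":
    for row in expression_with_clinical(build_demo_db(), "ESR1"):
        print(row)
```

Keeping clinical fields on the sample row and expression values in a separate long-format table is one common design choice for this kind of integration; a single join then pulls a gene's values alongside survival and tumor size regardless of which study a sample came from.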