Sustainable data and metadata management at the BD2K-LINCS Data Coordination and Integration Center

The NIH-funded LINCS Consortium is creating an extensive reference library of cell-based perturbation response signatures and sophisticated informatics tools incorporating a large number of perturbagens, model systems, and assays. To date, more than 350 datasets have been generated including transcr...

Bibliographic Details
Main authors: Stathias, Vasileios; Koleti, Amar; Vidović, Dušica; Cooper, Daniel J.; Jagodnik, Kathleen M.; Terryn, Raymond; Forlin, Michele; Chung, Caty; Torre, Denis; Ayad, Nagi; Medvedovic, Mario; Ma'ayan, Avi; Pillai, Ajay; Schürer, Stephan C.
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6007090/
https://www.ncbi.nlm.nih.gov/pubmed/29917015
http://dx.doi.org/10.1038/sdata.2018.117
_version_ 1783332970834165760
author Stathias, Vasileios
Koleti, Amar
Vidović, Dušica
Cooper, Daniel J.
Jagodnik, Kathleen M.
Terryn, Raymond
Forlin, Michele
Chung, Caty
Torre, Denis
Ayad, Nagi
Medvedovic, Mario
Ma'ayan, Avi
Pillai, Ajay
Schürer, Stephan C.
author_sort Stathias, Vasileios
collection PubMed
description The NIH-funded LINCS Consortium is creating an extensive reference library of cell-based perturbation response signatures and sophisticated informatics tools incorporating a large number of perturbagens, model systems, and assays. To date, more than 350 datasets have been generated including transcriptomics, proteomics, epigenomics, cell phenotype and competitive binding profiling assays. The large volume and variety of data necessitate rigorous data standards and effective data management including modular data processing pipelines and end-user interfaces to facilitate accurate and reliable data exchange, curation, validation, standardization, aggregation, integration, and end user access. Deep metadata annotations and the use of qualified data standards enable integration with many external resources. Here we describe the end-to-end data processing and management at the DCIC to generate a high-quality and persistent product. Our data management and stewardship solutions enable a functioning Consortium and make LINCS a valuable scientific resource that aligns with big data initiatives such as the BD2K NIH Program and concords with emerging data science best practices including the findable, accessible, interoperable, and reusable (FAIR) principles.
format Online
Article
Text
id pubmed-6007090
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-60070902018-06-27 Sustainable data and metadata management at the BD2K-LINCS Data Coordination and Integration Center Stathias, Vasileios Koleti, Amar Vidović, Dušica Cooper, Daniel J. Jagodnik, Kathleen M. Terryn, Raymond Forlin, Michele Chung, Caty Torre, Denis Ayad, Nagi Medvedovic, Mario Ma'ayan, Avi Pillai, Ajay Schürer, Stephan C. Sci Data Article The NIH-funded LINCS Consortium is creating an extensive reference library of cell-based perturbation response signatures and sophisticated informatics tools incorporating a large number of perturbagens, model systems, and assays. To date, more than 350 datasets have been generated including transcriptomics, proteomics, epigenomics, cell phenotype and competitive binding profiling assays. The large volume and variety of data necessitate rigorous data standards and effective data management including modular data processing pipelines and end-user interfaces to facilitate accurate and reliable data exchange, curation, validation, standardization, aggregation, integration, and end user access. Deep metadata annotations and the use of qualified data standards enable integration with many external resources. Here we describe the end-to-end data processing and management at the DCIC to generate a high-quality and persistent product. Our data management and stewardship solutions enable a functioning Consortium and make LINCS a valuable scientific resource that aligns with big data initiatives such as the BD2K NIH Program and concords with emerging data science best practices including the findable, accessible, interoperable, and reusable (FAIR) principles. 
Nature Publishing Group 2018-06-19 /pmc/articles/PMC6007090/ /pubmed/29917015 http://dx.doi.org/10.1038/sdata.2018.117 Text en Copyright © 2018, The Author(s) http://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
spellingShingle Article
Stathias, Vasileios
Koleti, Amar
Vidović, Dušica
Cooper, Daniel J.
Jagodnik, Kathleen M.
Terryn, Raymond
Forlin, Michele
Chung, Caty
Torre, Denis
Ayad, Nagi
Medvedovic, Mario
Ma'ayan, Avi
Pillai, Ajay
Schürer, Stephan C.
Sustainable data and metadata management at the BD2K-LINCS Data Coordination and Integration Center
title Sustainable data and metadata management at the BD2K-LINCS Data Coordination and Integration Center
title_sort sustainable data and metadata management at the bd2k-lincs data coordination and integration center
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6007090/
https://www.ncbi.nlm.nih.gov/pubmed/29917015
http://dx.doi.org/10.1038/sdata.2018.117