Matrix and analysis metadata standards (MAMS) to facilitate harmonization and reproducibility of single-cell data
A large number of genomic and imaging datasets are being produced by consortia that seek to characterize healthy and disease tissues at single-cell resolution. While much effort has been devoted to capturing information related to biospecimen information and experimental procedures, the metadata sta...
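Main authors: | Wang, Yichen; Sarfraz, Irzam; Teh, Wei Kheng; Sokolov, Artem; Herb, Brian R.; Creasy, Heather H.; Virshup, Isaac; Dries, Ruben; Degatano, Kylee; Mahurkar, Anup; Schnell, Daniel J; Madrigal, Pedro; Hilton, Jason; Gehlenborg, Nils; Tickle, Timothy; Campbell, Joshua D. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10028847/ https://www.ncbi.nlm.nih.gov/pubmed/36945543 http://dx.doi.org/10.1101/2023.03.06.531314 |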
_version_ | 1784910030888239104 |
---|---|
author | Wang, Yichen; Sarfraz, Irzam; Teh, Wei Kheng; Sokolov, Artem; Herb, Brian R.; Creasy, Heather H.; Virshup, Isaac; Dries, Ruben; Degatano, Kylee; Mahurkar, Anup; Schnell, Daniel J; Madrigal, Pedro; Hilton, Jason; Gehlenborg, Nils; Tickle, Timothy; Campbell, Joshua D. |
author_sort | Wang, Yichen |
collection | PubMed |
description | A large number of genomic and imaging datasets are being produced by consortia that seek to characterize healthy and disease tissues at single-cell resolution. While much effort has been devoted to capturing information related to biospecimen information and experimental procedures, the metadata standards that describe data matrices and the analysis workflows that produced them are relatively lacking. Detailed metadata schema related to data analysis are needed to facilitate sharing and interoperability across groups and to promote data provenance for reproducibility. To address this need, we developed the Matrix and Analysis Metadata Standards (MAMS) to serve as a resource for data coordinating centers and tool developers. We first curated several simple and complex “use cases” to characterize the types of feature-observation matrices (FOMs), annotations, and analysis metadata produced in different workflows. Based on these use cases, metadata fields were defined to describe the data contained within each matrix including those related to processing, modality, and subsets. Suggested terms were created for the majority of fields to aid in harmonization of metadata terms across groups. Additional provenance metadata fields were also defined to describe the software and workflows that produced each FOM. Finally, we developed a simple list-like schema that can be used to store MAMS information and implemented in multiple formats. Overall, MAMS can be used as a guide to harmonize analysis-related metadata which will ultimately facilitate integration of datasets across tools and consortia. MAMS specifications, use cases, and examples can be found at https://github.com/single-cell-mams/mams/. |
format | Online Article Text |
id | pubmed-10028847 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Cold Spring Harbor Laboratory |
record_format | MEDLINE/PubMed |
spelling | pubmed-10028847 2023-03-22 bioRxiv Article Cold Spring Harbor Laboratory 2023-03-07 /pmc/articles/PMC10028847/ /pubmed/36945543 http://dx.doi.org/10.1101/2023.03.06.531314 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use. |
title | Matrix and analysis metadata standards (MAMS) to facilitate harmonization and reproducibility of single-cell data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10028847/ https://www.ncbi.nlm.nih.gov/pubmed/36945543 http://dx.doi.org/10.1101/2023.03.06.531314 |