Improving Metadata Infrastructure for Complex Surveys: Insights from the Fragile Families Challenge

Researchers rely on metadata systems to prepare data for analysis. As the complexity of data sets increases and the breadth of data analysis practices grow, existing metadata systems can limit the efficiency and quality of data preparation. This article describes the redesign of a metadata system su...

Bibliographic Details
Main authors: Kindel, Alexander T., Bansal, Vineet, Catena, Kristin D., Hartshorne, Thomas H., Jaeger, Kate, Koffman, Dawn, McLanahan, Sara, Phillips, Maya, Rouhani, Shiva, Vinh, Ryan, Salganik, Matthew J.
Format: Online Article Text
Language: English
Published: 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10198672/
https://www.ncbi.nlm.nih.gov/pubmed/37214352
http://dx.doi.org/10.1177/2378023118817378
_version_ 1785044783663677440
author Kindel, Alexander T.
Bansal, Vineet
Catena, Kristin D.
Hartshorne, Thomas H.
Jaeger, Kate
Koffman, Dawn
McLanahan, Sara
Phillips, Maya
Rouhani, Shiva
Vinh, Ryan
Salganik, Matthew J.
author_facet Kindel, Alexander T.
Bansal, Vineet
Catena, Kristin D.
Hartshorne, Thomas H.
Jaeger, Kate
Koffman, Dawn
McLanahan, Sara
Phillips, Maya
Rouhani, Shiva
Vinh, Ryan
Salganik, Matthew J.
author_sort Kindel, Alexander T.
collection PubMed
description Researchers rely on metadata systems to prepare data for analysis. As the complexity of data sets increases and the breadth of data analysis practices grow, existing metadata systems can limit the efficiency and quality of data preparation. This article describes the redesign of a metadata system supporting the Fragile Families and Child Wellbeing Study on the basis of the experiences of participants in the Fragile Families Challenge. The authors demonstrate how treating metadata as data (i.e., releasing comprehensive information about variables in a format amenable to both automated and manual processing) can make the task of data preparation less arduous and less error prone for all types of data analysis. The authors hope that their work will facilitate new applications of machine-learning methods to longitudinal surveys and inspire research on data preparation in the social sciences. The authors have open-sourced the tools they created so that others can use and improve them.
format Online
Article
Text
id pubmed-10198672
institution National Center for Biotechnology Information
language English
publishDate 2019
record_format MEDLINE/PubMed
spelling pubmed-10198672 2023-05-19 Improving Metadata Infrastructure for Complex Surveys: Insights from the Fragile Families Challenge Kindel, Alexander T. Bansal, Vineet Catena, Kristin D. Hartshorne, Thomas H. Jaeger, Kate Koffman, Dawn McLanahan, Sara Phillips, Maya Rouhani, Shiva Vinh, Ryan Salganik, Matthew J. Socius Article Researchers rely on metadata systems to prepare data for analysis. As the complexity of data sets increases and the breadth of data analysis practices grow, existing metadata systems can limit the efficiency and quality of data preparation. This article describes the redesign of a metadata system supporting the Fragile Families and Child Wellbeing Study on the basis of the experiences of participants in the Fragile Families Challenge. The authors demonstrate how treating metadata as data (i.e., releasing comprehensive information about variables in a format amenable to both automated and manual processing) can make the task of data preparation less arduous and less error prone for all types of data analysis. The authors hope that their work will facilitate new applications of machine-learning methods to longitudinal surveys and inspire research on data preparation in the social sciences. The authors have open-sourced the tools they created so that others can use and improve them.
2019 2019-09-10 /pmc/articles/PMC10198672/ /pubmed/37214352 http://dx.doi.org/10.1177/2378023118817378 Text en Article reuse guidelines: sagepub.com/journals-permissions (https://us.sagepub.com/en-us/journals-permissions) https://creativecommons.org/licenses/by-nc/4.0/ Creative Commons Non Commercial CC BY-NC: This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).
spellingShingle Article
Kindel, Alexander T.
Bansal, Vineet
Catena, Kristin D.
Hartshorne, Thomas H.
Jaeger, Kate
Koffman, Dawn
McLanahan, Sara
Phillips, Maya
Rouhani, Shiva
Vinh, Ryan
Salganik, Matthew J.
Improving Metadata Infrastructure for Complex Surveys: Insights from the Fragile Families Challenge
title Improving Metadata Infrastructure for Complex Surveys: Insights from the Fragile Families Challenge
title_full Improving Metadata Infrastructure for Complex Surveys: Insights from the Fragile Families Challenge
title_fullStr Improving Metadata Infrastructure for Complex Surveys: Insights from the Fragile Families Challenge
title_full_unstemmed Improving Metadata Infrastructure for Complex Surveys: Insights from the Fragile Families Challenge
title_short Improving Metadata Infrastructure for Complex Surveys: Insights from the Fragile Families Challenge
title_sort improving metadata infrastructure for complex surveys: insights from the fragile families challenge
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10198672/
https://www.ncbi.nlm.nih.gov/pubmed/37214352
http://dx.doi.org/10.1177/2378023118817378
work_keys_str_mv AT kindelalexandert improvingmetadatainfrastructureforcomplexsurveysinsightsfromthefragilefamilieschallenge
AT bansalvineet improvingmetadatainfrastructureforcomplexsurveysinsightsfromthefragilefamilieschallenge
AT catenakristind improvingmetadatainfrastructureforcomplexsurveysinsightsfromthefragilefamilieschallenge
AT hartshornethomash improvingmetadatainfrastructureforcomplexsurveysinsightsfromthefragilefamilieschallenge
AT jaegerkate improvingmetadatainfrastructureforcomplexsurveysinsightsfromthefragilefamilieschallenge
AT koffmandawn improvingmetadatainfrastructureforcomplexsurveysinsightsfromthefragilefamilieschallenge
AT mclanahansara improvingmetadatainfrastructureforcomplexsurveysinsightsfromthefragilefamilieschallenge
AT phillipsmaya improvingmetadatainfrastructureforcomplexsurveysinsightsfromthefragilefamilieschallenge
AT rouhanishiva improvingmetadatainfrastructureforcomplexsurveysinsightsfromthefragilefamilieschallenge
AT vinhryan improvingmetadatainfrastructureforcomplexsurveysinsightsfromthefragilefamilieschallenge
AT salganikmatthewj improvingmetadatainfrastructureforcomplexsurveysinsightsfromthefragilefamilieschallenge