tableone: An open source Python package for producing summary statistics for research papers
OBJECTIVES: In quantitative research, understanding basic parameters of the study population is key for interpretation of the results. As a result, it is typical for the first table (“Table 1”) of a research paper to include summary statistics for the study data. Our objectives are 2-fold. First, we...
Main authors: | Pollard, Tom J; Johnson, Alistair E W; Raffa, Jesse D; Mark, Roger G |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2018 |
Subjects: | Application Notes |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6951995/ https://www.ncbi.nlm.nih.gov/pubmed/31984317 http://dx.doi.org/10.1093/jamiaopen/ooy012 |
_version_ | 1783486370935734272 |
---|---|
author | Pollard, Tom J Johnson, Alistair E W Raffa, Jesse D Mark, Roger G |
author_facet | Pollard, Tom J Johnson, Alistair E W Raffa, Jesse D Mark, Roger G |
author_sort | Pollard, Tom J |
collection | PubMed |
description | OBJECTIVES: In quantitative research, understanding basic parameters of the study population is key for interpretation of the results. As a result, it is typical for the first table (“Table 1”) of a research paper to include summary statistics for the study data. Our objectives are 2-fold. First, we seek to provide a simple, reproducible method for providing summary statistics for research papers in the Python programming language. Second, we seek to use the package to improve the quality of summary statistics reported in research papers. MATERIALS AND METHODS: The tableone package is developed following good practice guidelines for scientific computing and all code is made available under a permissive MIT License. A testing framework runs on a continuous integration server, helping to maintain code stability. Issues are tracked openly and public contributions are encouraged. RESULTS: The tableone software package automatically compiles summary statistics into publishable formats such as CSV, HTML, and LaTeX. An executable Jupyter Notebook demonstrates application of the package to a subset of data from the MIMIC-III database. Tests such as Tukey’s rule for outlier detection and Hartigan’s Dip Test for modality are computed to highlight potential issues in summarizing the data. DISCUSSION AND CONCLUSION: We present open source software for researchers to facilitate carrying out reproducible studies in Python, an increasingly popular language in scientific research. The toolkit is intended to mature over time with community feedback and input. Development of a common tool for summarizing data may help to promote good practice when used as a supplement to existing guidelines and recommendations. We encourage use of tableone alongside other methods of descriptive statistics and, in particular, visualization to ensure appropriate data handling. We also suggest seeking guidance from a statistician when using tableone for a research study, especially prior to submitting the study for publication. |
format | Online Article Text |
id | pubmed-6951995 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-6951995 2020-01-24 tableone: An open source Python package for producing summary statistics for research papers Pollard, Tom J Johnson, Alistair E W Raffa, Jesse D Mark, Roger G JAMIA Open Application Notes Oxford University Press 2018-05-23 /pmc/articles/PMC6951995/ /pubmed/31984317 http://dx.doi.org/10.1093/jamiaopen/ooy012 Text en © The Author(s) 2018. Published by Oxford University Press on behalf of the American Medical Informatics Association. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Application Notes Pollard, Tom J Johnson, Alistair E W Raffa, Jesse D Mark, Roger G tableone: An open source Python package for producing summary statistics for research papers |
title | tableone: An open source Python package for producing summary statistics for research papers |
title_sort | tableone: an open source python package for producing summary statistics for research papers |
topic | Application Notes |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6951995/ https://www.ncbi.nlm.nih.gov/pubmed/31984317 http://dx.doi.org/10.1093/jamiaopen/ooy012 |
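
The description in this record mentions Tukey’s rule for outlier detection as one of the checks used to flag potential problems in summarized data. As a rough illustration of that rule only, the sketch below marks values lying more than 1.5 interquartile ranges outside the first and third quartiles. It is not taken from the tableone source code: the function name `tukey_outliers`, the sample `ages` list, and the choice of the "inclusive" quartile method are all assumptions made for this example, and a different quartile convention may give slightly different fences.

```python
from statistics import quantiles


def tukey_outliers(values, k=1.5):
    """Return the values lying outside Tukey's fences.

    Hypothetical helper for illustration; not part of tableone.
    The quartile convention ("inclusive") is an assumption.
    """
    # Quartiles of the sample: Q1, median (unused), Q3.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Fences sit k interquartile ranges beyond Q1 and Q3.
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Made-up sample data: one implausible age among otherwise typical values.
ages = [34, 36, 38, 40, 41, 43, 45, 47, 120]
print(tukey_outliers(ages))  # prints [120]
```

A flagged value is a prompt to inspect the data, not proof of an error, which fits the record's advice to use summary tables alongside visualization and statistical guidance.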