Single-molecule dataset (SMD): a generalized storage format for raw and processed single-molecule data

BACKGROUND: Single-molecule techniques have emerged as incisive approaches for addressing a wide range of questions arising in contemporary biological research [Trends Biochem Sci 38:30–37, 2013; Nat Rev Genet 14:9–22, 2013; Curr Opin Struct Biol 2014, 28C:112–121; Annu Rev Biophys 43:19–39, 2014]....

Bibliographic Details
Main authors: Greenfeld, Max, van de Meent, Jan-Willem, Pavlichin, Dmitri S, Mabuchi, Hideo, Wiggins, Chris H, Gonzalez, Ruben L, Herschlag, Daniel
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4384321/
https://www.ncbi.nlm.nih.gov/pubmed/25591752
http://dx.doi.org/10.1186/s12859-014-0429-4
author Greenfeld, Max
van de Meent, Jan-Willem
Pavlichin, Dmitri S
Mabuchi, Hideo
Wiggins, Chris H
Gonzalez, Ruben L
Herschlag, Daniel
author_sort Greenfeld, Max
collection PubMed
description BACKGROUND: Single-molecule techniques have emerged as incisive approaches for addressing a wide range of questions arising in contemporary biological research [Trends Biochem Sci 38:30–37, 2013; Nat Rev Genet 14:9–22, 2013; Curr Opin Struct Biol 2014, 28C:112–121; Annu Rev Biophys 43:19–39, 2014]. The analysis and interpretation of raw single-molecule data benefits greatly from the ongoing development of sophisticated statistical analysis tools that enable accurate inference at the low signal-to-noise ratios frequently associated with these measurements. While a number of groups have released analysis toolkits as open source software [J Phys Chem B 114:5386–5403, 2010; Biophys J 79:1915–1927, 2000; Biophys J 91:1941–1951, 2006; Biophys J 79:1928–1944, 2000; Biophys J 86:4015–4029, 2004; Biophys J 97:3196–3205, 2009; PLoS One 7:e30024, 2012; BMC Bioinformatics 288 11(8):S2, 2010; Biophys J 106:1327–1337, 2014; Proc Int Conf Mach Learn 28:361–369, 2013], it remains difficult to compare analysis for experiments performed in different labs due to a lack of standardization. RESULTS: Here we propose a standardized single-molecule dataset (SMD) file format. SMD is designed to accommodate a wide variety of computer programming languages, single-molecule techniques, and analysis strategies. To facilitate adoption of this format we have made two existing data analysis packages that are used for single-molecule analysis compatible with this format. CONCLUSION: Adoption of a common, standard data file format for sharing raw single-molecule data and analysis outcomes is a critical step for the emerging and powerful single-molecule field, which will benefit both sophisticated users and non-specialists by allowing standardized, transparent, and reproducible analysis practices. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-014-0429-4) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4384321
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4384321 2015-04-04 BMC Bioinformatics Software BioMed Central 2015-01-16 /pmc/articles/PMC4384321/ /pubmed/25591752 http://dx.doi.org/10.1186/s12859-014-0429-4 Text en © Greenfeld et al.; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Single-molecule dataset (SMD): a generalized storage format for raw and processed single-molecule data
topic Software