Visual programming for next-generation sequencing data analytics

Bibliographic Details
Main Authors: Milicchio, Franco; Rose, Rebecca; Bian, Jiang; Min, Jae; Prosperi, Mattia
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4848821/
https://www.ncbi.nlm.nih.gov/pubmed/27127540
http://dx.doi.org/10.1186/s13040-016-0095-3
author Milicchio, Franco
Rose, Rebecca
Bian, Jiang
Min, Jae
Prosperi, Mattia
author_sort Milicchio, Franco
collection PubMed
description BACKGROUND: High-throughput or next-generation sequencing (NGS) technologies have become an established and affordable experimental framework in biological and medical sciences for all basic and translational research. Processing and analyzing NGS data is challenging. NGS data are big, heterogeneous, sparse, and error prone. Although a plethora of tools for NGS data analysis has emerged in the past decade, (i) software development is still lagging behind data generation capabilities, and (ii) there is a ‘cultural’ gap between the end user and the developer. TEXT: Generic software template libraries specifically developed for NGS can help in dealing with the former problem, whilst coupling template libraries with visual programming may help with the latter. Here we scrutinize the state-of-the-art low-level software libraries implemented specifically for NGS and graphical tools for NGS analytics. An ideal developing environment for NGS should be modular (with a native library interface), scalable in computational methods (i.e. serial, multithread, distributed), transparent (platform-independent), interoperable (with external software interface), and usable (via an intuitive graphical user interface). These characteristics should facilitate both the run of standardized NGS pipelines and the development of new workflows based on technological advancements or users’ needs. We discuss in detail the potential of a computational framework blending generic template programming and visual programming that addresses all of the current limitations. CONCLUSION: In the long term, a proper, well-developed (although not necessarily unique) software framework will bridge the current gap between data generation and hypothesis testing. This will eventually facilitate the development of novel diagnostic tools embedded in routine healthcare.
format Online
Article
Text
id pubmed-4848821
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4848821 2016-04-29 Visual programming for next-generation sequencing data analytics Milicchio, Franco Rose, Rebecca Bian, Jiang Min, Jae Prosperi, Mattia BioData Min Review BioMed Central 2016-04-27 /pmc/articles/PMC4848821/ /pubmed/27127540 http://dx.doi.org/10.1186/s13040-016-0095-3 Text en © Milicchio et al. 2016 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Visual programming for next-generation sequencing data analytics
topic Review