
DNAscan: personal computer compatible NGS analysis, annotation and visualisation

BACKGROUND: Next Generation Sequencing (NGS) is a commonly used technology for studying the genetic basis of biological processes and it underpins the aspirations of precision medicine. However, there are significant challenges when dealing with NGS data. Firstly, a huge number of bioinformatics too...

Bibliographic Details
Main authors: Iacoangeli, A., Al Khleifat, A., Sproviero, W., Shatunov, A., Jones, A. R., Morgan, S. L., Pittman, A., Dobson, R. J., Newhouse, S. J., Al-Chalabi, A.
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6487045/
https://www.ncbi.nlm.nih.gov/pubmed/31029080
http://dx.doi.org/10.1186/s12859-019-2791-8
author Iacoangeli, A.
Al Khleifat, A.
Sproviero, W.
Shatunov, A.
Jones, A. R.
Morgan, S. L.
Pittman, A.
Dobson, R. J.
Newhouse, S. J.
Al-Chalabi, A.
collection PubMed
description BACKGROUND: Next Generation Sequencing (NGS) is a commonly used technology for studying the genetic basis of biological processes and it underpins the aspirations of precision medicine. However, there are significant challenges when dealing with NGS data. Firstly, a huge number of bioinformatics tools for a wide range of uses exist, therefore it is challenging to design an analysis pipeline. Secondly, NGS analysis is computationally intensive, requiring expensive infrastructure, and many medical and research centres do not have adequate high performance computing facilities and cloud computing is not always an option due to privacy and ownership issues. Finally, the interpretation of the results is not trivial and most available pipelines lack the utilities to favour this crucial step.
RESULTS: We have therefore developed a fast and efficient bioinformatics pipeline that allows for the analysis of DNA sequencing data, while requiring little computational effort and memory usage. DNAscan can analyse a whole exome sequencing sample in 1 h and a 40x whole genome sequencing sample in 13 h, on a midrange computer. The pipeline can look for single nucleotide variants, small indels, structural variants, repeat expansions and viral genetic material (or any other organism). Its results are annotated using a customisable variety of databases and are available for an on-the-fly visualisation with a local deployment of the gene.iobio platform. DNAscan is implemented in Python. Its code and documentation are available on GitHub: https://github.com/KHP-Informatics/DNAscan. Instructions for an easy and fast deployment with Docker and Singularity are also provided on GitHub.
CONCLUSIONS: DNAscan is an extremely fast and computationally efficient pipeline for analysis, visualization and interpretation of NGS data. It is designed to provide a powerful and easy-to-use tool for applications in biomedical research and diagnostic medicine, at minimal computational cost. Its comprehensive approach will maximise the potential audience of users, bringing such analyses within the reach of non-specialist laboratories, and those from centres with limited funding available.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2791-8) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6487045
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6487045 2019-05-06 DNAscan: personal computer compatible NGS analysis, annotation and visualisation BMC Bioinformatics Methodology Article BioMed Central 2019-04-27 /pmc/articles/PMC6487045/ /pubmed/31029080 http://dx.doi.org/10.1186/s12859-019-2791-8 Text en
© The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title DNAscan: personal computer compatible NGS analysis, annotation and visualisation
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6487045/
https://www.ncbi.nlm.nih.gov/pubmed/31029080
http://dx.doi.org/10.1186/s12859-019-2791-8