NGSpop: A desktop software that supports population studies by identifying sequence variations from next-generation sequencing data
Next-generation sequencing (NGS) is widely used in all areas of genetic research, such as genetic disease diagnosis and breeding, and it can produce massive amounts of data. The identification of sequence variants is an important step when processing large NGS datasets; however, currently, the proce...
Main authors: | Lee, Dong-Jun; Kwon, Taesoo; Lee, Hye-Jin; Oh, Yun-Ho; Kim, Jin-Hyun; Lee, Tae-Ho |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9671411/ https://www.ncbi.nlm.nih.gov/pubmed/36395265 http://dx.doi.org/10.1371/journal.pone.0260908 |
_version_ | 1784832537743327232 |
---|---|
author | Lee, Dong-Jun Kwon, Taesoo Lee, Hye-Jin Oh, Yun-Ho Kim, Jin-Hyun Lee, Tae-Ho |
author_facet | Lee, Dong-Jun Kwon, Taesoo Lee, Hye-Jin Oh, Yun-Ho Kim, Jin-Hyun Lee, Tae-Ho |
author_sort | Lee, Dong-Jun |
collection | PubMed |
description | Next-generation sequencing (NGS) is widely used in all areas of genetic research, such as genetic disease diagnosis and breeding, and it can produce massive amounts of data. The identification of sequence variants is an important step when processing large NGS datasets; however, currently, the process is complicated, repetitive, and requires concentration, which can be taxing on the researcher. Therefore, to support researchers who are not familiar enough with bioinformatics to identify sequence variations regularly from large datasets, we have developed a fully automated desktop software, NGSpop. NGSpop includes functionalities for all the variant calling and visualization procedures used when processing NGS data, such as quality control, mapping, filtering details, and variant calling. In the variant calling step, the user can select the GATK or DeepVariant algorithm for variant calling. These algorithms can be executed using pre-set pipelines and options or customized with the user-specified options. NGSpop is implemented using JavaFX (version 1.8) and can thus be run on Unix-like operating systems such as Ubuntu Linux (version 16.04, 18.0.4). Although several pipelines and visualization tools are available for NGS data analysis, most integrated environments do not support batch processes; thus, variant detection cannot be automated for population-level studies. The NGSpop software developed in this study has an easy-to-use interface and helps in rapid analysis of multiple NGS data from population studies. According to a benchmark test, it effectively reduced the carbon footprint in bioinformatics analysis by expending the least central processing unit heat and power. Additionally, this software makes it possible to use the GATK and DeepVariant algorithms more flexibly and efficiently than other programs by allowing users to choose between the algorithms. As a limitation, NGSpop currently supports only the sequencing reads in fastq format produced by the Illumina platform. NGSpop is freely available at https://sourceforge.net/projects/ngspop/. |
format | Online Article Text |
id | pubmed-9671411 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-9671411 2022-11-18 NGSpop: A desktop software that supports population studies by identifying sequence variations from next-generation sequencing data Lee, Dong-Jun Kwon, Taesoo Lee, Hye-Jin Oh, Yun-Ho Kim, Jin-Hyun Lee, Tae-Ho PLoS One Research Article Next-generation sequencing (NGS) is widely used in all areas of genetic research, such as genetic disease diagnosis and breeding, and it can produce massive amounts of data. The identification of sequence variants is an important step when processing large NGS datasets; however, currently, the process is complicated, repetitive, and requires concentration, which can be taxing on the researcher. Therefore, to support researchers who are not familiar enough with bioinformatics to identify sequence variations regularly from large datasets, we have developed a fully automated desktop software, NGSpop. NGSpop includes functionalities for all the variant calling and visualization procedures used when processing NGS data, such as quality control, mapping, filtering details, and variant calling. In the variant calling step, the user can select the GATK or DeepVariant algorithm for variant calling. These algorithms can be executed using pre-set pipelines and options or customized with the user-specified options. NGSpop is implemented using JavaFX (version 1.8) and can thus be run on Unix-like operating systems such as Ubuntu Linux (version 16.04, 18.0.4). Although several pipelines and visualization tools are available for NGS data analysis, most integrated environments do not support batch processes; thus, variant detection cannot be automated for population-level studies. The NGSpop software developed in this study has an easy-to-use interface and helps in rapid analysis of multiple NGS data from population studies. According to a benchmark test, it effectively reduced the carbon footprint in bioinformatics analysis by expending the least central processing unit heat and power. Additionally, this software makes it possible to use the GATK and DeepVariant algorithms more flexibly and efficiently than other programs by allowing users to choose between the algorithms. As a limitation, NGSpop currently supports only the sequencing reads in fastq format produced by the Illumina platform. NGSpop is freely available at https://sourceforge.net/projects/ngspop/. Public Library of Science 2022-11-17 /pmc/articles/PMC9671411/ /pubmed/36395265 http://dx.doi.org/10.1371/journal.pone.0260908 Text en © 2022 Lee et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Lee, Dong-Jun Kwon, Taesoo Lee, Hye-Jin Oh, Yun-Ho Kim, Jin-Hyun Lee, Tae-Ho NGSpop: A desktop software that supports population studies by identifying sequence variations from next-generation sequencing data |
title | NGSpop: A desktop software that supports population studies by identifying sequence variations from next-generation sequencing data |
title_full | NGSpop: A desktop software that supports population studies by identifying sequence variations from next-generation sequencing data |
title_fullStr | NGSpop: A desktop software that supports population studies by identifying sequence variations from next-generation sequencing data |
title_full_unstemmed | NGSpop: A desktop software that supports population studies by identifying sequence variations from next-generation sequencing data |
title_short | NGSpop: A desktop software that supports population studies by identifying sequence variations from next-generation sequencing data |
title_sort | ngspop: a desktop software that supports population studies by identifying sequence variations from next-generation sequencing data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9671411/ https://www.ncbi.nlm.nih.gov/pubmed/36395265 http://dx.doi.org/10.1371/journal.pone.0260908 |
work_keys_str_mv | AT leedongjun ngspopadesktopsoftwarethatsupportspopulationstudiesbyidentifyingsequencevariationsfromnextgenerationsequencingdata AT kwontaesoo ngspopadesktopsoftwarethatsupportspopulationstudiesbyidentifyingsequencevariationsfromnextgenerationsequencingdata AT leehyejin ngspopadesktopsoftwarethatsupportspopulationstudiesbyidentifyingsequencevariationsfromnextgenerationsequencingdata AT ohyunho ngspopadesktopsoftwarethatsupportspopulationstudiesbyidentifyingsequencevariationsfromnextgenerationsequencingdata AT kimjinhyun ngspopadesktopsoftwarethatsupportspopulationstudiesbyidentifyingsequencevariationsfromnextgenerationsequencingdata AT leetaeho ngspopadesktopsoftwarethatsupportspopulationstudiesbyidentifyingsequencevariationsfromnextgenerationsequencingdata |