NGSpop: A desktop software that supports population studies by identifying sequence variations from next-generation sequencing data
Main authors: | , , , , , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9671411/ https://www.ncbi.nlm.nih.gov/pubmed/36395265 http://dx.doi.org/10.1371/journal.pone.0260908 |
Summary: | Next-generation sequencing (NGS) is widely used in all areas of genetic research, such as genetic disease diagnosis and breeding, and it can produce massive amounts of data. The identification of sequence variants is an important step when processing large NGS datasets; however, currently, the process is complicated, repetitive, and requires concentration, which can be taxing on the researcher. Therefore, to support researchers who are not familiar enough with bioinformatics to identify sequence variations regularly from large datasets, we have developed a fully automated desktop software, NGSpop. NGSpop includes functionalities for all the variant calling and visualization procedures used when processing NGS data, such as quality control, mapping, filtering details, and variant calling. In the variant calling step, the user can select the GATK or DeepVariant algorithm for variant calling. These algorithms can be executed using pre-set pipelines and options or customized with the user-specified options. NGSpop is implemented using JavaFX (version 1.8) and can thus be run on Unix-like operating systems such as Ubuntu Linux (version 16.04, 18.0.4). Although several pipelines and visualization tools are available for NGS data analysis, most integrated environments do not support batch processes; thus, variant detection cannot be automated for population-level studies. The NGSpop software developed in this study has an easy-to-use interface and helps in rapid analysis of multiple NGS data from population studies. According to a benchmark test, it effectively reduced the carbon footprint in bioinformatics analysis by expending the least central processing unit heat and power. Additionally, this software makes it possible to use the GATK and DeepVariant algorithms more flexibly and efficiently than other programs by allowing users to choose between the algorithms. 
As a limitation, NGSpop currently supports only the sequencing reads in fastq format produced by the Illumina platform. NGSpop is freely available at https://sourceforge.net/projects/ngspop/. |
---|
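
The batch workflow described in the summary (quality control, mapping, filtering, then variant calling with a user-selected caller, repeated for every sample in a population) can be illustrated with a small planning sketch. This is not NGSpop's code: the function names, the `SampleJob` structure, the step ordering, and the `<sample>_R1.fastq` / `<sample>_R2.fastq` file-naming scheme are all assumptions made for illustration, and no external tool is actually invoked.

```python
from dataclasses import dataclass
from typing import Dict, List

# Step names come from the summary; treating them as a fixed order is an assumption.
STEPS = ["quality_control", "mapping", "filtering", "variant_calling"]
# The two variant callers named in the summary.
CALLERS = {"GATK", "DeepVariant"}


@dataclass
class SampleJob:
    """Hypothetical description of the work planned for one sample."""
    sample: str
    fastq_files: List[str]
    caller: str
    steps: List[str]


def group_fastq_by_sample(paths: List[str]) -> Dict[str, List[str]]:
    """Group FASTQ paths by sample name.

    Assumes a hypothetical naming scheme: '<sample>_R1.fastq' and
    '<sample>_R2.fastq'. Files that do not match are ignored.
    """
    groups: Dict[str, List[str]] = {}
    for path in sorted(paths):
        name = path.rsplit("/", 1)[-1]
        for suffix in ("_R1.fastq", "_R2.fastq"):
            if name.endswith(suffix):
                groups.setdefault(name[: -len(suffix)], []).append(path)
                break
    return groups


def plan_batch(paths: List[str], caller: str = "GATK") -> List[SampleJob]:
    """Build one job per sample, all using the same selected caller."""
    if caller not in CALLERS:
        raise ValueError(f"unsupported caller: {caller}")
    return [
        SampleJob(sample, files, caller, list(STEPS))
        for sample, files in group_fastq_by_sample(paths).items()
    ]


if __name__ == "__main__":
    demo = ["data/s1_R1.fastq", "data/s1_R2.fastq", "data/s2_R1.fastq"]
    for job in plan_batch(demo, caller="DeepVariant"):
        print(job.sample, job.caller, job.fastq_files)
```

Planning the jobs separately from running them keeps the caller choice in one place, which is one way a tool could apply the same pre-set or customized options to every sample in a population study.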