ngsJulia: population genetic analysis of next-generation DNA sequencing data with Julia language
A sound analysis of DNA sequencing data is important to extract meaningful information and infer quantities of interest. Sequencing and mapping errors coupled with low and variable coverage hamper the identification of genotypes and variants and the estimation of population genetic parameters. Metho...
Main authors: | Mas-Sandoval, Alex; Jin, Chenyu; Fracassetti, Marco; Fumagalli, Matteo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | F1000 Research Limited, 2023 |
Subjects: | Software Tool Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10514575/ https://www.ncbi.nlm.nih.gov/pubmed/37745626 http://dx.doi.org/10.12688/f1000research.104368.3 |
_version_ | 1785108754282315776 |
---|---|
author | Mas-Sandoval, Alex Jin, Chenyu Fracassetti, Marco Fumagalli, Matteo |
author_facet | Mas-Sandoval, Alex Jin, Chenyu Fracassetti, Marco Fumagalli, Matteo |
author_sort | Mas-Sandoval, Alex |
collection | PubMed |
description | A sound analysis of DNA sequencing data is important to extract meaningful information and infer quantities of interest. Sequencing and mapping errors coupled with low and variable coverage hamper the identification of genotypes and variants and the estimation of population genetic parameters. Methods and implementations to estimate population genetic parameters from sequencing data available nowadays either are suitable for the analysis of genomes from model organisms only, require moderate sequencing coverage, or are not easily adaptable to specific applications. To address these issues, we introduce ngsJulia, a collection of templates and functions in the Julia language to process short-read sequencing data for population genetic analysis. We further describe two implementations, ngsPool and ngsPloidy, for the analysis of pooled sequencing data and polyploid genomes, respectively. Through simulations, we illustrate the performance of estimating various population genetic parameters using these implementations, using both established and novel statistical methods. These results inform on optimal experimental design and demonstrate the applicability of methods in ngsJulia to estimate parameters of interest even from low coverage sequencing data. ngsJulia provides users with a flexible and efficient framework for ad hoc analysis of sequencing data. ngsJulia is available from: https://github.com/mfumagalli/ngsJulia |
format | Online Article Text |
id | pubmed-10514575 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | F1000 Research Limited |
record_format | MEDLINE/PubMed |
spelling | pubmed-10514575 2023-09-23 ngsJulia: population genetic analysis of next-generation DNA sequencing data with Julia language Mas-Sandoval, Alex Jin, Chenyu Fracassetti, Marco Fumagalli, Matteo F1000Res Software Tool Article A sound analysis of DNA sequencing data is important to extract meaningful information and infer quantities of interest. Sequencing and mapping errors coupled with low and variable coverage hamper the identification of genotypes and variants and the estimation of population genetic parameters. Methods and implementations to estimate population genetic parameters from sequencing data available nowadays either are suitable for the analysis of genomes from model organisms only, require moderate sequencing coverage, or are not easily adaptable to specific applications. To address these issues, we introduce ngsJulia, a collection of templates and functions in the Julia language to process short-read sequencing data for population genetic analysis. We further describe two implementations, ngsPool and ngsPloidy, for the analysis of pooled sequencing data and polyploid genomes, respectively. Through simulations, we illustrate the performance of estimating various population genetic parameters using these implementations, using both established and novel statistical methods. These results inform on optimal experimental design and demonstrate the applicability of methods in ngsJulia to estimate parameters of interest even from low coverage sequencing data. ngsJulia provides users with a flexible and efficient framework for ad hoc analysis of sequencing data. ngsJulia is available from: https://github.com/mfumagalli/ngsJulia F1000 Research Limited 2023-07-14 /pmc/articles/PMC10514575/ /pubmed/37745626 http://dx.doi.org/10.12688/f1000research.104368.3 Text en Copyright: © 2023 Mas-Sandoval A et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Software Tool Article Mas-Sandoval, Alex Jin, Chenyu Fracassetti, Marco Fumagalli, Matteo ngsJulia: population genetic analysis of next-generation DNA sequencing data with Julia language |
title | ngsJulia: population genetic analysis of next-generation DNA sequencing data with Julia language |
title_full | ngsJulia: population genetic analysis of next-generation DNA sequencing data with Julia language |
title_fullStr | ngsJulia: population genetic analysis of next-generation DNA sequencing data with Julia language |
title_full_unstemmed | ngsJulia: population genetic analysis of next-generation DNA sequencing data with Julia language |
title_short | ngsJulia: population genetic analysis of next-generation DNA sequencing data with Julia language |
title_sort | ngsjulia: population genetic analysis of next-generation dna sequencing data with julia language |
topic | Software Tool Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10514575/ https://www.ncbi.nlm.nih.gov/pubmed/37745626 http://dx.doi.org/10.12688/f1000research.104368.3 |
work_keys_str_mv | AT massandovalalex ngsjuliapopulationgeneticanalysisofnextgenerationdnasequencingdatawithjulialanguage AT jinchenyu ngsjuliapopulationgeneticanalysisofnextgenerationdnasequencingdatawithjulialanguage AT fracassettimarco ngsjuliapopulationgeneticanalysisofnextgenerationdnasequencingdatawithjulialanguage AT fumagallimatteo ngsjuliapopulationgeneticanalysisofnextgenerationdnasequencingdatawithjulialanguage |