ngsJulia: population genetic analysis of next-generation DNA sequencing data with Julia language

Bibliographic Details
Main authors: Mas-Sandoval, Alex, Jin, Chenyu, Fracassetti, Marco, Fumagalli, Matteo
Format: Online Article Text
Language: English
Published: F1000 Research Limited 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10514575/
https://www.ncbi.nlm.nih.gov/pubmed/37745626
http://dx.doi.org/10.12688/f1000research.104368.3
author Mas-Sandoval, Alex
Jin, Chenyu
Fracassetti, Marco
Fumagalli, Matteo
collection PubMed
description A sound analysis of DNA sequencing data is important to extract meaningful information and infer quantities of interest. Sequencing and mapping errors coupled with low and variable coverage hamper the identification of genotypes and variants and the estimation of population genetic parameters. Methods and implementations to estimate population genetic parameters from sequencing data available nowadays either are suitable for the analysis of genomes from model organisms only, require moderate sequencing coverage, or are not easily adaptable to specific applications. To address these issues, we introduce ngsJulia, a collection of templates and functions in Julia language to process short-read sequencing data for population genetic analysis. We further describe two implementations, ngsPool and ngsPloidy, for the analysis of pooled sequencing data and polyploid genomes, respectively. Through simulations, we illustrate the performance of estimating various population genetic parameters using these implementations, using both established and novel statistical methods. These results inform on optimal experimental design and demonstrate the applicability of methods in ngsJulia to estimate parameters of interest even from low coverage sequencing data. ngsJulia provides users with a flexible and efficient framework for ad hoc analysis of sequencing data. ngsJulia is available from: https://github.com/mfumagalli/ngsJulia
format Online
Article
Text
id pubmed-10514575
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher F1000 Research Limited
record_format MEDLINE/PubMed
spelling pubmed-10514575 2023-09-23 ngsJulia: population genetic analysis of next-generation DNA sequencing data with Julia language Mas-Sandoval, Alex Jin, Chenyu Fracassetti, Marco Fumagalli, Matteo F1000Res Software Tool Article A sound analysis of DNA sequencing data is important to extract meaningful information and infer quantities of interest. Sequencing and mapping errors coupled with low and variable coverage hamper the identification of genotypes and variants and the estimation of population genetic parameters. Methods and implementations to estimate population genetic parameters from sequencing data available nowadays either are suitable for the analysis of genomes from model organisms only, require moderate sequencing coverage, or are not easily adaptable to specific applications. To address these issues, we introduce ngsJulia, a collection of templates and functions in Julia language to process short-read sequencing data for population genetic analysis. We further describe two implementations, ngsPool and ngsPloidy, for the analysis of pooled sequencing data and polyploid genomes, respectively. Through simulations, we illustrate the performance of estimating various population genetic parameters using these implementations, using both established and novel statistical methods. These results inform on optimal experimental design and demonstrate the applicability of methods in ngsJulia to estimate parameters of interest even from low coverage sequencing data. ngsJulia provides users with a flexible and efficient framework for ad hoc analysis of sequencing data. ngsJulia is available from: https://github.com/mfumagalli/ngsJulia F1000 Research Limited 2023-07-14 /pmc/articles/PMC10514575/ /pubmed/37745626 http://dx.doi.org/10.12688/f1000research.104368.3 Text en Copyright: © 2023 Mas-Sandoval A et al.
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title ngsJulia: population genetic analysis of next-generation DNA sequencing data with Julia language
topic Software Tool Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10514575/
https://www.ncbi.nlm.nih.gov/pubmed/37745626
http://dx.doi.org/10.12688/f1000research.104368.3