SNPtotree—Resolving the Phylogeny of SNPs on Non-Recombining DNA

Genetic variants on non-recombining DNA and the hierarchical order in which they accumulate are commonly of interest. This variant hierarchy can be established and combined with information on the population and geographic origin of the individuals carrying the variants to find population structures...

Bibliographic Details
Main authors: Köksal, Zehra; Børsting, Claus; Gusmão, Leonor; Pereira, Vania
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10606150/
https://www.ncbi.nlm.nih.gov/pubmed/37895186
http://dx.doi.org/10.3390/genes14101837
author Köksal, Zehra
Børsting, Claus
Gusmão, Leonor
Pereira, Vania
author_sort Köksal, Zehra
collection PubMed
description Genetic variants on non-recombining DNA and the hierarchical order in which they accumulate are commonly of interest. This variant hierarchy can be established and combined with information on the population and geographic origin of the individuals carrying the variants to find population structures and infer migration patterns. Further, individuals can be assigned to the characterized populations, which is relevant in forensic genetics, genetic genealogy, and epidemiologic studies. However, there is currently no straightforward method to obtain such a variant hierarchy. Here, we introduce the software SNPtotree v1.0, which uniquely determines the hierarchical order of variants on non-recombining DNA without error-prone manual sorting. The algorithm uses pairwise variant comparisons to infer their relationships and integrates the combined information into a phylogenetic tree. Variants that have contradictory pairwise relationships or ambiguous positions in the tree are removed by the software. When benchmarked using two human Y-chromosomal massively parallel sequencing datasets, SNPtotree outperforms traditional methods in the accuracy of phylogenetic trees for sequencing data with high amounts of missing information. The phylogenetic trees of variants created using SNPtotree can be used to establish and maintain publicly available phylogeny databases to further explore genetic epidemiology and genealogy, as well as population and forensic genetics.
format Online
Article
Text
id pubmed-10606150
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10606150 2023-10-28 SNPtotree—Resolving the Phylogeny of SNPs on Non-Recombining DNA Köksal, Zehra; Børsting, Claus; Gusmão, Leonor; Pereira, Vania Genes (Basel) Article MDPI 2023-09-22 /pmc/articles/PMC10606150/ /pubmed/37895186 http://dx.doi.org/10.3390/genes14101837 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title SNPtotree—Resolving the Phylogeny of SNPs on Non-Recombining DNA
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10606150/
https://www.ncbi.nlm.nih.gov/pubmed/37895186
http://dx.doi.org/10.3390/genes14101837
work_keys_str_mv AT koksalzehra snptotreeresolvingthephylogenyofsnpsonnonrecombiningdna
AT børstingclaus snptotreeresolvingthephylogenyofsnpsonnonrecombiningdna
AT gusmaoleonor snptotreeresolvingthephylogenyofsnpsonnonrecombiningdna
AT pereiravania snptotreeresolvingthephylogenyofsnpsonnonrecombiningdna
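
The description above says the algorithm infers relationships from pairwise variant comparisons and removes variants with contradictory relationships. The record gives no implementation details, so the following is only a minimal sketch of that general idea, not SNPtotree's actual code. The genotype encoding (1 = derived, 0 = ancestral, None = missing), the relation labels, and the function names `relation` and `contradictory_variants` are all assumptions made for illustration.

```python
from itertools import combinations

# Assumed encoding (not from the record): each variant maps sample -> call,
# where 1 = derived allele, 0 = ancestral allele, None = missing call.

def relation(a, b):
    """Classify variant a relative to variant b using only samples called for both."""
    both = only_a = only_b = 0
    for sample in a.keys() & b.keys():
        x, y = a[sample], b[sample]
        if x is None or y is None:
            continue  # a missing call carries no information for this pair
        if x == 1 and y == 1:
            both += 1
        elif x == 1:
            only_a += 1
        elif y == 1:
            only_b += 1
    if both and only_a and only_b:
        return "contradictory"  # impossible on a single non-recombining tree
    if both and only_a:
        return "upstream"       # carriers of b are a subset of carriers of a
    if both and only_b:
        return "downstream"     # carriers of a are a subset of carriers of b
    if both:
        return "equivalent"     # same carriers among the comparable samples
    if only_a and only_b:
        return "parallel"       # disjoint carriers, separate branches
    return "ambiguous"          # too little overlapping data to decide

def contradictory_variants(genotypes):
    """Return the names of all variants involved in a contradictory pair."""
    flagged = set()
    for (name_a, a), (name_b, b) in combinations(genotypes.items(), 2):
        if relation(a, b) == "contradictory":
            flagged.update((name_a, name_b))
    return flagged

# Toy data, invented for this example.
genotypes = {
    "V1": {"s1": 1, "s2": 1, "s3": 1, "s4": 0},
    "V2": {"s1": 1, "s2": 1, "s3": 0, "s4": 0},
    "V3": {"s1": 0, "s2": 0, "s3": None, "s4": 1},
    "V4": {"s1": 1, "s2": 0, "s3": 1, "s4": 0},
}

print(relation(genotypes["V1"], genotypes["V2"]))  # upstream
print(sorted(contradictory_variants(genotypes)))   # ['V2', 'V4']
```

In this toy data V2 and V4 share one carrier while each also has a carrier the other lacks, so no single tree can hold both and the sketch flags them. Building the tree from the remaining upstream, downstream, and parallel relations is left out here because the record does not describe how that step is performed.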