Systematic Review on Local Ancestor Inference From a Mathematical and Algorithmic Perspective

Genotypic data provide deep insights into the population history and medical genetics. The local ancestry inference (LAI) (also termed local ancestry deconvolution) method uses the hidden Markov model (HMM) to solve the mathematical problem of ancestry reconstruction based on genomic data. HMM is co...

Bibliographic Details
Main Authors: Wu, Jie; Liu, Yangxiu; Zhao, Yiqiang
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8181461/
https://www.ncbi.nlm.nih.gov/pubmed/34108987
http://dx.doi.org/10.3389/fgene.2021.639877
author Wu, Jie
Liu, Yangxiu
Zhao, Yiqiang
collection PubMed
description Genotypic data provide deep insights into the population history and medical genetics. The local ancestry inference (LAI) (also termed local ancestry deconvolution) method uses the hidden Markov model (HMM) to solve the mathematical problem of ancestry reconstruction based on genomic data. HMM is combined with other statistical models and machine learning techniques for particular genetic tasks in a series of computer tools. In this article, we surveyed the mathematical structure, application characteristics, historical development, and benchmark analysis of the LAI method in detail, which will help researchers better understand and further develop LAI methods. Firstly, we extensively explore the mathematical structure of each model and its characteristic applications. Next, we use bibliometrics to show detailed model application fields and list articles to elaborate on the historical development. LAI publications had experienced a peak period during 2006–2016 and had kept on moving in the following years. The efficiency, accuracy, and stability of the existing models were evaluated by the benchmark. We find that phased data had higher accuracy in comparison with unphased data. We summarize these models with their distinct advantages and disadvantages. The Loter model uses dynamic programming to obtain a globally optimal solution with its parameter-free advantage. Aligned bases can be used directly in the Seqmix model if the genotype is hard to call. This research may help model developers to realize current challenges, develop more advanced models, and enable scholars to select appropriate models according to given populations and datasets.
format Online
Article
Text
id pubmed-8181461
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8181461 2021-06-08 Systematic Review on Local Ancestor Inference From a Mathematical and Algorithmic Perspective Wu, Jie Liu, Yangxiu Zhao, Yiqiang Front Genet Genetics Frontiers Media S.A. 2021-05-24 /pmc/articles/PMC8181461/ /pubmed/34108987 http://dx.doi.org/10.3389/fgene.2021.639877 Text en
Copyright © 2021 Wu, Liu and Zhao. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Systematic Review on Local Ancestor Inference From a Mathematical and Algorithmic Perspective
topic Genetics