A Pipeline for Phasing and Genotype Imputation on Mixed Human Data (Parents-Offspring Trios and Unrelated Subjects) by Reviewing Current Methods and Software

Genotype imputation has become an essential prerequisite when performing association analysis. It is a computational technique that allows us to infer genetic markers that have not been directly genotyped, thereby increasing statistical power in subsequent association studies, which consequently has...

Bibliographic Details
Main authors: Baldrighi, Giulia Nicole; Nova, Andrea; Bernardinelli, Luisa; Fazia, Teresa
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9781110/
https://www.ncbi.nlm.nih.gov/pubmed/36556394
http://dx.doi.org/10.3390/life12122030
author Baldrighi, Giulia Nicole
Nova, Andrea
Bernardinelli, Luisa
Fazia, Teresa
collection PubMed
description Genotype imputation has become an essential prerequisite when performing association analysis. It is a computational technique that allows us to infer genetic markers that have not been directly genotyped, thereby increasing statistical power in subsequent association studies, which consequently has a crucial impact on the identification of causal variants. Many features need to be considered when choosing the proper algorithm for imputation, including the target sample on which it is performed, i.e., related individuals, unrelated individuals, or both. Problems could arise when dealing with a target sample made up of mixed data, composed of both related and unrelated individuals, especially since the scientific literature on this topic is not sufficiently clear. To shed light on this issue, we examined existing algorithms and software for performing phasing and imputation on mixed human data from SNP arrays, specifically when related subjects belong to trios. By discussing the advantages and limitations of the current algorithms, we identified LD-based methods as being the most suitable for reconstruction of haplotypes in this specific context, and we proposed a feasible pipeline that can be used for imputing genotypes in both phased and unphased human data.
format Online
Article
Text
id pubmed-9781110
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9781110 2022-12-24 Life (Basel) Review MDPI 2022-12-05 /pmc/articles/PMC9781110/ /pubmed/36556394 http://dx.doi.org/10.3390/life12122030 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title A Pipeline for Phasing and Genotype Imputation on Mixed Human Data (Parents-Offspring Trios and Unrelated Subjects) by Reviewing Current Methods and Software
topic Review