Comparison of multiple imputation algorithms and verification using whole-genome sequencing in the CMUH genetic biobank
A genome-wide association study (GWAS) can be conducted to systematically analyze the contributions of genetic factors to a wide variety of complex diseases. Nevertheless, existing GWASs have provided highly ethnic specific data. Accordingly, to provide data specific to Taiwan, we established a larg...
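Main authors: | Liu, Ting-Yuan; Lin, Chih-Fan; Wu, Hsing-Tsung; Wu, Ya-Lun; Chen, Yu-Chia; Liao, Chi-Chou; Chou, Yu-Pao; Chao, Dysan; Chang, Ya-Sian; Lu, Hsing-Fang; Chang, Jan-Gowth; Hsu, Kai-Cheng; Tsai, Fuu-Jen |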
---|---|
Format: | Online Article Text |
Language: | English |
Published: | China Medical University 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8823485/ https://www.ncbi.nlm.nih.gov/pubmed/35223420 http://dx.doi.org/10.37796/2211-8039.1302 |
_version_ | 1784646811738177536 |
---|---|
author | Liu, Ting-Yuan; Lin, Chih-Fan; Wu, Hsing-Tsung; Wu, Ya-Lun; Chen, Yu-Chia; Liao, Chi-Chou; Chou, Yu-Pao; Chao, Dysan; Chang, Ya-Sian; Lu, Hsing-Fang; Chang, Jan-Gowth; Hsu, Kai-Cheng; Tsai, Fuu-Jen |
author_facet | Liu, Ting-Yuan; Lin, Chih-Fan; Wu, Hsing-Tsung; Wu, Ya-Lun; Chen, Yu-Chia; Liao, Chi-Chou; Chou, Yu-Pao; Chao, Dysan; Chang, Ya-Sian; Lu, Hsing-Fang; Chang, Jan-Gowth; Hsu, Kai-Cheng; Tsai, Fuu-Jen |
author_sort | Liu, Ting-Yuan |
collection | PubMed |
description | A genome-wide association study (GWAS) can be conducted to systematically analyze the contributions of genetic factors to a wide variety of complex diseases. Nevertheless, existing GWASs have provided highly ethnic specific data. Accordingly, to provide data specific to Taiwan, we established a large-scale genetic database in a single medical institution at the China Medical University Hospital. With current technological limitations, microarray analysis can detect only a limited number of single-nucleotide polymorphisms (SNPs) with a minor allele frequency of >1%. Nevertheless, imputation represents a useful alternative means of expanding data. In this study, we compared four imputation algorithms in terms of various metrics. We observed that among the compared algorithms, Beagle5.2 achieved the fastest calculation speed, smallest storage space, highest specificity, and highest number of high-quality variants. We obtained 15,277,414 high-quality variants in 175,871 people by using Beagle5.2. In our internal verification process, Beagle5.2 exhibited an accuracy rate of up to 98.75%. We also conducted external verification. Our imputed variants had a 79.91% mapping rate and 90.41% accuracy. These results will be combined with clinical data in future research. We have made the results available for researchers to use in formulating imputation algorithms, in addition to establishing a complete SNP database for GWAS and PRS researchers. We believe that these data can help improve overall medical capabilities, particularly precision medicine, in Taiwan. |
format | Online Article Text |
id | pubmed-8823485 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | China Medical University |
record_format | MEDLINE/PubMed |
spelling | pubmed-8823485 2022-02-25 Comparison of multiple imputation algorithms and verification using whole-genome sequencing in the CMUH genetic biobank Liu, Ting-Yuan Lin, Chih-Fan Wu, Hsing-Tsung Wu, Ya-Lun Chen, Yu-Chia Liao, Chi-Chou Chou, Yu-Pao Chao, Dysan Chang, Ya-Sian Lu, Hsing-Fang Chang, Jan-Gowth Hsu, Kai-Cheng Tsai, Fuu-Jen Biomedicine (Taipei) Original Article A genome-wide association study (GWAS) can be conducted to systematically analyze the contributions of genetic factors to a wide variety of complex diseases. Nevertheless, existing GWASs have provided highly ethnic specific data. Accordingly, to provide data specific to Taiwan, we established a large-scale genetic database in a single medical institution at the China Medical University Hospital. With current technological limitations, microarray analysis can detect only a limited number of single-nucleotide polymorphisms (SNPs) with a minor allele frequency of >1%. Nevertheless, imputation represents a useful alternative means of expanding data. In this study, we compared four imputation algorithms in terms of various metrics. We observed that among the compared algorithms, Beagle5.2 achieved the fastest calculation speed, smallest storage space, highest specificity, and highest number of high-quality variants. We obtained 15,277,414 high-quality variants in 175,871 people by using Beagle5.2. In our internal verification process, Beagle5.2 exhibited an accuracy rate of up to 98.75%. We also conducted external verification. Our imputed variants had a 79.91% mapping rate and 90.41% accuracy. These results will be combined with clinical data in future research. We have made the results available for researchers to use in formulating imputation algorithms, in addition to establishing a complete SNP database for GWAS and PRS researchers. We believe that these data can help improve overall medical capabilities, particularly precision medicine, in Taiwan. China Medical University 2021-12-01 /pmc/articles/PMC8823485/ /pubmed/35223420 http://dx.doi.org/10.37796/2211-8039.1302 Text en © the Author(s) https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Original Article Liu, Ting-Yuan Lin, Chih-Fan Wu, Hsing-Tsung Wu, Ya-Lun Chen, Yu-Chia Liao, Chi-Chou Chou, Yu-Pao Chao, Dysan Chang, Ya-Sian Lu, Hsing-Fang Chang, Jan-Gowth Hsu, Kai-Cheng Tsai, Fuu-Jen Comparison of multiple imputation algorithms and verification using whole-genome sequencing in the CMUH genetic biobank |
title | Comparison of multiple imputation algorithms and verification using whole-genome sequencing in the CMUH genetic biobank |
title_full | Comparison of multiple imputation algorithms and verification using whole-genome sequencing in the CMUH genetic biobank |
title_fullStr | Comparison of multiple imputation algorithms and verification using whole-genome sequencing in the CMUH genetic biobank |
title_full_unstemmed | Comparison of multiple imputation algorithms and verification using whole-genome sequencing in the CMUH genetic biobank |
title_short | Comparison of multiple imputation algorithms and verification using whole-genome sequencing in the CMUH genetic biobank |
title_sort | comparison of multiple imputation algorithms and verification using whole-genome sequencing in the cmuh genetic biobank |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8823485/ https://www.ncbi.nlm.nih.gov/pubmed/35223420 http://dx.doi.org/10.37796/2211-8039.1302 |