
Machine Learning Models for Genetic Risk Assessment of Infants with Non-syndromic Orofacial Cleft

The isolated type of orofacial cleft, termed non-syndromic cleft lip with or without cleft palate (NSCL/P), is the second most common birth defect in China, with Asians having the highest incidence in the world. NSCL/P involves multiple genes and complex interactions between genetic and environmenta...


Bibliographic Details
Main Authors: Zhang, Shi-Jian, Meng, Peiqi, Zhang, Jieni, Jia, Peizeng, Lin, Jiuxiang, Wang, Xiangfeng, Chen, Feng, Wei, Xiaoxing
Format: Online Article Text
Language: English
Published: Elsevier 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6364041/
https://www.ncbi.nlm.nih.gov/pubmed/30578914
http://dx.doi.org/10.1016/j.gpb.2018.07.005
author Zhang, Shi-Jian
Meng, Peiqi
Zhang, Jieni
Jia, Peizeng
Lin, Jiuxiang
Wang, Xiangfeng
Chen, Feng
Wei, Xiaoxing
collection PubMed
description The isolated type of orofacial cleft, termed non-syndromic cleft lip with or without cleft palate (NSCL/P), is the second most common birth defect in China, with Asians having the highest incidence in the world. NSCL/P involves multiple genes and complex interactions between genetic and environmental factors, which complicates the genetic assessment of an unborn fetus carrying multiple NSCL/P-susceptible variants. Although genome-wide association studies (GWAS) have uncovered dozens of single nucleotide polymorphism (SNP) loci in different ethnic populations, the genetic diagnostic effectiveness of these SNPs requires further experimental validation in Chinese populations before a diagnostic panel or a predictive model covering multiple SNPs can be built. In this study, we collected blood samples from control and NSCL/P infants in Han and Uyghur Chinese populations to validate the diagnostic effectiveness of 43 candidate SNPs previously detected using GWAS. We then built predictive models with the validated SNPs using different machine learning algorithms and evaluated their prediction performance. Our results showed that logistic regression had the best performance for risk assessment according to the area under the curve. Notably, defective variants in MTHFR and RBP4, two genes involved in folic acid and vitamin A biosynthesis, were found to have high contributions to NSCL/P incidence based on feature importance evaluation with logistic regression. This is consistent with the notion that folic acid and vitamin A are both essential nutritional supplements for pregnant women to reduce the risk of conceiving an NSCL/P baby. Moreover, we observed a lower predictive power in Uyghur than in Han cases, likely due to differences in genetic background between these two ethnic populations. Thus, our study highlights the urgent need to generate the HapMap for the Uyghur population and perform resequencing-based screening of Uyghur-specific NSCL/P markers.
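The abstract's workflow (fit a logistic regression on SNP genotypes, score it by area under the curve, then inspect which SNPs contribute most) can be illustrated with a minimal sketch. This record contains no code or data from the study, so everything below is an assumption made for illustration: the additive 0/1/2 genotype coding, the synthetic genotypes and effect sizes, the plain gradient-descent fit, the use of absolute coefficient size as a stand-in for feature importance, and all function and variable names.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=500, l2=0.001):
    """Fit weights and bias by batch gradient descent on the L2-penalised log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(X, w, b):
    # Predicted probability of being a case for each genotype vector.
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def auc(y_true, scores):
    """AUC as the chance that a random case outscores a random control (ties count 0.5)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def simulate(n, effects, rng):
    """Synthetic risk-allele counts (0/1/2) and case labels; NOT data from the study."""
    intercept = -sum(effects)  # keeps cases and controls roughly balanced
    X, y = [], []
    for _ in range(n):
        g = [rng.choice((0, 1, 2)) for _ in effects]
        p = sigmoid(intercept + sum(e * gi for e, gi in zip(effects, g)))
        X.append(g)
        y.append(1 if rng.random() < p else 0)
    return X, y


def main():
    rng = random.Random(0)
    effects = [0.8, 0.6, 0.0, 0.0, 0.0, 0.0]  # hypothetical per-SNP log-odds
    X_train, y_train = simulate(300, effects, rng)
    X_test, y_test = simulate(200, effects, rng)
    w, b = fit_logistic(X_train, y_train)
    print("held-out AUC:", round(auc(y_test, predict_proba(X_test, w, b)), 3))
    # Rank SNPs by absolute coefficient as a rough importance proxy.
    for j in sorted(range(len(w)), key=lambda k: abs(w[k]), reverse=True):
        print(f"SNP_{j}: coefficient {w[j]:+.3f}")


if __name__ == "__main__":
    main()
```

Evaluating on a held-out set rather than the training samples keeps the AUC from being inflated by overfitting. Comparing coefficient magnitudes is only meaningful here because every SNP shares the same 0/1/2 scale.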
format Online
Article
Text
id pubmed-6364041
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-6364041 2019-02-15 Genomics Proteomics Bioinformatics Method Elsevier 2018-10 2018-12-19 /pmc/articles/PMC6364041/ /pubmed/30578914 http://dx.doi.org/10.1016/j.gpb.2018.07.005 Text en © 2018 The Authors. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Machine Learning Models for Genetic Risk Assessment of Infants with Non-syndromic Orofacial Cleft
topic Method