Eye-color and Type-2 diabetes phenotype prediction from genotype data using deep learning methods

BACKGROUND: Genotype–phenotype predictions are of great importance in genetics. These predictions can help to find genetic mutations causing variations in human beings. There are many approaches for finding the association which can be broadly categorized into two classes, statistical techniques, an...

Bibliographic Details
Main Authors:	Muneeb, Muhammad; Henschel, Andreas
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2021
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8056510/ https://www.ncbi.nlm.nih.gov/pubmed/33874881 http://dx.doi.org/10.1186/s12859-021-04077-9

_version_	1783680661708603392
author	Muneeb, Muhammad Henschel, Andreas
collection	PubMed
description	BACKGROUND: Genotype–phenotype predictions are of great importance in genetics. These predictions can help to find the genetic mutations that cause variation among human beings. The many approaches for finding such associations can be broadly categorized into two classes: statistical techniques and machine learning. Statistical techniques are good for finding the actual SNPs causing variation, whereas machine learning techniques are good when the goal is only to classify people into different categories. In this article, we examined the eye-color and Type-2 diabetes phenotypes. The proposed technique is a hybrid approach that takes some parts from statistical techniques and the rest from machine learning. RESULTS: The main dataset for the eye-color phenotype consists of 806 people: 404 have Blue-Green eyes and 402 have Brown eyes. After preprocessing, we generated 8 different datasets containing different numbers of SNPs, using the mutation difference and thresholding at each individual SNP. We calculated three types of mutation at each SNP: no mutation, partial mutation, and full mutation. The data is then transformed for the machine learning algorithms. We used nine classifiers (RandomForest, Extreme Gradient boosting, ANN, LSTM, GRU, BILSTM, 1DCNN, ensembles of ANN, and ensembles of LSTM), which gave best accuracies of 0.91, 0.9286, 0.945, 0.94, 0.94, 0.92, 0.95, and 0.96, respectively. Stacked ensembles of LSTM outperformed the other algorithms for 1560 SNPs, with an overall accuracy of 0.96, AUC = 0.98 for Brown eyes, and AUC = 0.97 for Blue-Green eyes. The main dataset for Type-2 diabetes consists of 107 people, of whom 30 are classified as cases and 74 as controls. We used different linear thresholds to find the optimal number of SNPs for classification. The final model gave an accuracy of 0.97. CONCLUSION: Genotype–phenotype predictions are very useful, especially in forensics. These predictions can help to identify SNP variant associations with traits and diseases. Given more datasets, the predictions of machine learning models can be improved. Moreover, the non-linearity of the machine learning model and the combination of SNP mutations during training improve the prediction. We considered binary classification problems, but the proposed approach can be extended to multi-class classification.
format	Online Article Text
id	pubmed-8056510
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-8056510 2021-04-20 Eye-color and Type-2 diabetes phenotype prediction from genotype data using deep learning methods Muneeb, Muhammad Henschel, Andreas BMC Bioinformatics Research Article BioMed Central 2021-04-19 /pmc/articles/PMC8056510/ /pubmed/33874881 http://dx.doi.org/10.1186/s12859-021-04077-9 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title	Eye-color and Type-2 diabetes phenotype prediction from genotype data using deep learning methods
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8056510/ https://www.ncbi.nlm.nih.gov/pubmed/33874881 http://dx.doi.org/10.1186/s12859-021-04077-9
work_keys_str_mv	AT muneebmuhammad eyecolorandtype2diabetesphenotypepredictionfromgenotypedatausingdeeplearningmethods AT henschelandreas eyecolorandtype2diabetesphenotypepredictionfromgenotypedatausingdeeplearningmethods
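
The abstract in this record mentions three mutation types per SNP (no, partial, and full mutation) and SNP selection by thresholding a mutation difference. The following is only a minimal sketch of how such a step could look. The 0/1/2 encoding against a reference allele, the use of per-class mean values as the "mutation difference", and the names encode_genotype and select_snps are all assumptions made for illustration; the record does not describe the authors' actual implementation.

```python
# Hypothetical illustration: the encoding and the selection rule below are
# assumptions, not the authors' published pipeline.

def encode_genotype(genotype, ref_allele):
    """Map a two-allele genotype string such as "AG" to 0, 1, or 2.

    0 = no mutation (both alleles match the reference),
    1 = partial mutation (one allele differs),
    2 = full mutation (both alleles differ).
    """
    if len(genotype) != 2:
        raise ValueError(f"expected two alleles, got {genotype!r}")
    return sum(1 for allele in genotype if allele != ref_allele)


def select_snps(encoded, labels, threshold):
    """Return indices of SNPs whose mean encoded value differs between
    the two classes by at least `threshold`.

    encoded: list of per-person lists of 0/1/2 values (same SNP order).
    labels:  list of 0/1 class labels, one per person.
    """
    group0 = [row for row, y in zip(encoded, labels) if y == 0]
    group1 = [row for row, y in zip(encoded, labels) if y == 1]
    if not group0 or not group1:
        raise ValueError("both classes need at least one sample")
    selected = []
    for j in range(len(encoded[0])):
        mean0 = sum(row[j] for row in group0) / len(group0)
        mean1 = sum(row[j] for row in group1) / len(group1)
        if abs(mean0 - mean1) >= threshold:
            selected.append(j)
    return selected


# Toy data (invented): four people, three SNPs, reference alleles A, C, G.
refs = ["A", "C", "G"]
people = [["AA", "CT", "GG"],
          ["AA", "CC", "GG"],
          ["GG", "CT", "GG"],
          ["AG", "CC", "GG"]]
labels = [0, 0, 1, 1]

encoded = [[encode_genotype(g, r) for g, r in zip(person, refs)]
           for person in people]
selected = select_snps(encoded, labels, threshold=1.0)
print(encoded)
print(selected)
```

In this toy example only the first SNP differs enough between the two groups to pass the threshold. Raising or lowering the threshold changes how many SNPs are kept, which is one plausible reading of how datasets with different numbers of SNPs could be produced.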

Similar Items