Bi-stream CNN Down Syndrome screening model based on genotyping array

Bibliographic Details
Main Authors: Feng, Bing, Hoskins, William, Zhang, Yan, Meng, Zibo, Samuels, David C., Wang, Jiandong, Xia, Ruofan, Liu, Chao, Tang, Jijun, Guo, Yan
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6245487/
https://www.ncbi.nlm.nih.gov/pubmed/30453947
http://dx.doi.org/10.1186/s12920-018-0416-0
author Feng, Bing
Hoskins, William
Zhang, Yan
Meng, Zibo
Samuels, David C.
Wang, Jiandong
Xia, Ruofan
Liu, Chao
Tang, Jijun
Guo, Yan
author_sort Feng, Bing
collection PubMed
description BACKGROUND: Human Down syndrome (DS) is usually caused by genomic micro-duplications and dosage imbalances of human chromosome 21. It is associated with many genomic and phenotypic abnormalities. Although DS occurs in about 1 per 1,000 births worldwide, which is a very high rate, no effective cure has been found. Currently, the most efficient means of DS prevention are screening and early detection. METHODS: In this study, we applied deep learning techniques to a set of Illumina genotyping array data. We built a bi-stream convolutional neural network (CNN) model to screen for and predict the occurrence of DS. First, we built image input data by converting the intensities of each SNP site into chromosome SNP maps. Next, we proposed a bi-stream CNN architecture with nine layers and two branch models. We merged the two CNN branch models into one model in the fourth convolutional layer and output the prediction in the last layer. RESULTS: Our bi-stream CNN model achieved an average accuracy of 99.3% with very low false-positive and false-negative rates, which is necessary for further applications in disease prediction and medical practice. We visualized the feature maps and learned filters from intermediate convolutional layers, which showed the genomic patterns and correlated SNP variations in human DS genomes. We also compared our method with other CNN and traditional machine learning models, and analyzed and discussed the characteristics and strengths of our bi-stream CNN model. CONCLUSIONS: Our bi-stream model used two CNN branch models to learn local genome features and regional patterns among adjacent genes and SNP sites from two chromosomes simultaneously. It achieved the best performance on all evaluation metrics when compared with two single-stream CNN models and three traditional machine-learning algorithms. The visualized feature maps also provide opportunities to study the genomic markers and pathway components associated with human DS, which provides insights for the development of gene therapy and genomic medicine.
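The abstract states that per-SNP intensities were converted into chromosome SNP maps used as image input, but it does not give the layout. The following is only a minimal sketch of one plausible conversion, not the authors' procedure. The function name `intensities_to_snp_map`, the `width` and `pad_value` parameters, the min-max scaling, and the row-by-row fill are all assumptions made for illustration.

```python
from typing import List, Sequence


def intensities_to_snp_map(
    intensities: Sequence[float], width: int, pad_value: float = 0.0
) -> List[List[float]]:
    """Arrange per-SNP intensities (in chromosome order) into a 2D map.

    Assumed layout (not from the paper): min-max scale the values to
    [0, 1], then fill rows of `width` SNPs left to right, padding the
    final row with `pad_value` so every row has the same length.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    if not intensities:
        return []
    lo, hi = min(intensities), max(intensities)
    span = hi - lo
    # A constant input would divide by zero, so map it to all zeros.
    scaled = [(x - lo) / span if span else 0.0 for x in intensities]
    # Pad so the flat vector fills a whole number of rows.
    remainder = len(scaled) % width
    if remainder:
        scaled.extend([pad_value] * (width - remainder))
    return [scaled[i:i + width] for i in range(0, len(scaled), width)]


# Hypothetical intensities for five SNP sites on one chromosome.
snp_map = intensities_to_snp_map([2.0, 4.0, 6.0, 8.0, 10.0], width=3)
```

In a bi-stream setup such as the one described, one map of this kind per chromosome would be fed to each CNN branch. Which two chromosomes are paired, and the actual map dimensions, are not stated in this record.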
format Online
Article
Text
id pubmed-6245487
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6245487 2018-11-26 BMC Med Genomics Research BioMed Central 2018-11-20 /pmc/articles/PMC6245487/ /pubmed/30453947 http://dx.doi.org/10.1186/s12920-018-0416-0 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Bi-stream CNN Down Syndrome screening model based on genotyping array
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6245487/
https://www.ncbi.nlm.nih.gov/pubmed/30453947
http://dx.doi.org/10.1186/s12920-018-0416-0