Identification of natural selection in genomic data with deep convolutional neural network
BACKGROUND: With the increase in the size of genomic datasets describing variability in populations, extracting relevant information becomes increasingly useful as well as complex. Recently, computational methodologies such as Supervised Machine Learning and specifically Convolutional Neural Network...
Main authors: | Nguembang Fadja, Arnaud; Riguzzi, Fabrizio; Bertorelle, Giorgio; Trucchi, Emiliano |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8642854/ https://www.ncbi.nlm.nih.gov/pubmed/34863217 http://dx.doi.org/10.1186/s13040-021-00280-9 |
_version_ | 1784609757609328640 |
---|---|
author | Nguembang Fadja, Arnaud; Riguzzi, Fabrizio; Bertorelle, Giorgio; Trucchi, Emiliano |
author_facet | Nguembang Fadja, Arnaud; Riguzzi, Fabrizio; Bertorelle, Giorgio; Trucchi, Emiliano |
author_sort | Nguembang Fadja, Arnaud |
collection | PubMed |
description | BACKGROUND: With the increase in the size of genomic datasets describing variability in populations, extracting relevant information becomes increasingly useful as well as complex. Recently, computational methodologies such as Supervised Machine Learning and specifically Convolutional Neural Networks have been proposed to make inferences on demographic and adaptive processes using genomic data. Even though it was already shown to be powerful and efficient in different fields of investigation, Supervised Machine Learning has still to be explored as to unfold its enormous potential in evolutionary genomics. RESULTS: The paper proposes a method based on Supervised Machine Learning for classifying genomic data, represented as windows of genomic sequences from a sample of individuals belonging to the same population. A Convolutional Neural Network is used to test whether a genomic window shows the signature of natural selection. Training performed on simulated data shows that the proposed model can accurately predict neutral and selection processes on portions of genomes taken from real populations with almost 90% accuracy. |
format | Online Article Text |
id | pubmed-8642854 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8642854 2021-12-06 Identification of natural selection in genomic data with deep convolutional neural network Nguembang Fadja, Arnaud Riguzzi, Fabrizio Bertorelle, Giorgio Trucchi, Emiliano BioData Min Research BACKGROUND: With the increase in the size of genomic datasets describing variability in populations, extracting relevant information becomes increasingly useful as well as complex. Recently, computational methodologies such as Supervised Machine Learning and specifically Convolutional Neural Networks have been proposed to make inferences on demographic and adaptive processes using genomic data. Even though it was already shown to be powerful and efficient in different fields of investigation, Supervised Machine Learning has still to be explored as to unfold its enormous potential in evolutionary genomics. RESULTS: The paper proposes a method based on Supervised Machine Learning for classifying genomic data, represented as windows of genomic sequences from a sample of individuals belonging to the same population. A Convolutional Neural Network is used to test whether a genomic window shows the signature of natural selection. Training performed on simulated data shows that the proposed model can accurately predict neutral and selection processes on portions of genomes taken from real populations with almost 90% accuracy. BioMed Central 2021-12-04 /pmc/articles/PMC8642854/ /pubmed/34863217 http://dx.doi.org/10.1186/s13040-021-00280-9 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. 
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Nguembang Fadja, Arnaud Riguzzi, Fabrizio Bertorelle, Giorgio Trucchi, Emiliano Identification of natural selection in genomic data with deep convolutional neural network |
title | Identification of natural selection in genomic data with deep convolutional neural network |
title_full | Identification of natural selection in genomic data with deep convolutional neural network |
title_fullStr | Identification of natural selection in genomic data with deep convolutional neural network |
title_full_unstemmed | Identification of natural selection in genomic data with deep convolutional neural network |
title_short | Identification of natural selection in genomic data with deep convolutional neural network |
title_sort | identification of natural selection in genomic data with deep convolutional neural network |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8642854/ https://www.ncbi.nlm.nih.gov/pubmed/34863217 http://dx.doi.org/10.1186/s13040-021-00280-9 |
work_keys_str_mv | AT nguembangfadjaarnaud identificationofnaturalselectioningenomicdatawithdeepconvolutionalneuralnetwork AT riguzzifabrizio identificationofnaturalselectioningenomicdatawithdeepconvolutionalneuralnetwork AT bertorellegiorgio identificationofnaturalselectioningenomicdatawithdeepconvolutionalneuralnetwork AT trucchiemiliano identificationofnaturalselectioningenomicdatawithdeepconvolutionalneuralnetwork |
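
The description field of this record says that windows of genomic sequences from a sample of individuals are classified by a Convolutional Neural Network as neutral or under selection. The sketch below only illustrates that general idea and is not the authors' architecture. It assumes a window is encoded as a 0/1 matrix (rows are individuals, columns are sites), which the record does not state. The function names, the single 3x3 filter, and the random, untrained weights are all hypothetical.

```python
import numpy as np


def conv2d_valid(window, kernel):
    """Slide `kernel` over `window` without padding (cross-correlation, as CNN layers do)."""
    kh, kw = kernel.shape
    h, w = window.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(window[i:i + kh, j:j + kw] * kernel)
    return out


def selection_probability(window, kernel, weight, bias):
    """One conv filter -> ReLU -> global max pooling -> sigmoid output unit."""
    feature_map = np.maximum(conv2d_valid(window, kernel), 0.0)  # ReLU
    pooled = feature_map.max()                                   # global max pooling
    return 1.0 / (1.0 + np.exp(-(weight * pooled + bias)))       # sigmoid


# Hypothetical input: 8 individuals x 16 sites, encoded as 0/1 (assumed encoding).
rng = np.random.default_rng(0)
window = rng.integers(0, 2, size=(8, 16)).astype(float)

# Untrained, random parameters; a real model would learn these from simulated data.
kernel = rng.normal(size=(3, 3))
p = selection_probability(window, kernel, weight=1.0, bias=0.0)
print(f"P(selection) for this window: {p:.3f}")
```

With untrained weights the printed probability carries no meaning. In the setting the description outlines, the parameters would be fitted on simulated neutral and selected windows before the model is applied to real data.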