Identification of natural selection in genomic data with deep convolutional neural network

BACKGROUND: With the increase in the size of genomic datasets describing variability in populations, extracting relevant information becomes increasingly useful as well as complex. Recently, computational methodologies such as Supervised Machine Learning and specifically Convolutional Neural Network...

Bibliographic Details
Main authors: Nguembang Fadja, Arnaud; Riguzzi, Fabrizio; Bertorelle, Giorgio; Trucchi, Emiliano
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8642854/
https://www.ncbi.nlm.nih.gov/pubmed/34863217
http://dx.doi.org/10.1186/s13040-021-00280-9
_version_ 1784609757609328640
author Nguembang Fadja, Arnaud
Riguzzi, Fabrizio
Bertorelle, Giorgio
Trucchi, Emiliano
author_facet Nguembang Fadja, Arnaud
Riguzzi, Fabrizio
Bertorelle, Giorgio
Trucchi, Emiliano
author_sort Nguembang Fadja, Arnaud
collection PubMed
description BACKGROUND: With the increase in the size of genomic datasets describing variability in populations, extracting relevant information becomes increasingly useful as well as complex. Recently, computational methodologies such as Supervised Machine Learning and specifically Convolutional Neural Networks have been proposed to make inferences on demographic and adaptive processes using genomic data. Even though it has already been shown to be powerful and efficient in different fields of investigation, Supervised Machine Learning has yet to be fully explored to unfold its enormous potential in evolutionary genomics. RESULTS: The paper proposes a method based on Supervised Machine Learning for classifying genomic data, represented as windows of genomic sequences from a sample of individuals belonging to the same population. A Convolutional Neural Network is used to test whether a genomic window shows the signature of natural selection. Training performed on simulated data shows that the proposed model can accurately predict neutral and selection processes on portions of genomes taken from real populations with almost 90% accuracy.
format Online
Article
Text
id pubmed-8642854
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8642854 2021-12-06 Identification of natural selection in genomic data with deep convolutional neural network Nguembang Fadja, Arnaud; Riguzzi, Fabrizio; Bertorelle, Giorgio; Trucchi, Emiliano BioData Min Research BACKGROUND: With the increase in the size of genomic datasets describing variability in populations, extracting relevant information becomes increasingly useful as well as complex. Recently, computational methodologies such as Supervised Machine Learning and specifically Convolutional Neural Networks have been proposed to make inferences on demographic and adaptive processes using genomic data. Even though it has already been shown to be powerful and efficient in different fields of investigation, Supervised Machine Learning has yet to be fully explored to unfold its enormous potential in evolutionary genomics. RESULTS: The paper proposes a method based on Supervised Machine Learning for classifying genomic data, represented as windows of genomic sequences from a sample of individuals belonging to the same population. A Convolutional Neural Network is used to test whether a genomic window shows the signature of natural selection. Training performed on simulated data shows that the proposed model can accurately predict neutral and selection processes on portions of genomes taken from real populations with almost 90% accuracy. BioMed Central 2021-12-04 /pmc/articles/PMC8642854/ /pubmed/34863217 http://dx.doi.org/10.1186/s13040-021-00280-9 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Nguembang Fadja, Arnaud
Riguzzi, Fabrizio
Bertorelle, Giorgio
Trucchi, Emiliano
Identification of natural selection in genomic data with deep convolutional neural network
title Identification of natural selection in genomic data with deep convolutional neural network
title_sort identification of natural selection in genomic data with deep convolutional neural network
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8642854/
https://www.ncbi.nlm.nih.gov/pubmed/34863217
http://dx.doi.org/10.1186/s13040-021-00280-9