Prediction of deleterious mutations in coding regions of mammals with transfer learning

The genomes of mammals contain thousands of deleterious mutations. It is important to be able to recognize them with high precision. In conservation biology, the small size of fragmented populations results in accumulation of damaging variants. Preserving animals with less damaged genomes could opti...

Bibliographic Details
Main authors: Plekhanova, Elena; Nuzhdin, Sergey V.; Utkin, Lev V.; Samsonova, Maria G.
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6304693/
https://www.ncbi.nlm.nih.gov/pubmed/30622632
http://dx.doi.org/10.1111/eva.12607
author Plekhanova, Elena
Nuzhdin, Sergey V.
Utkin, Lev V.
Samsonova, Maria G.
collection PubMed
description The genomes of mammals contain thousands of deleterious mutations. It is important to be able to recognize them with high precision. In conservation biology, the small size of fragmented populations results in accumulation of damaging variants. Preserving animals with less damaged genomes could optimize conservation efforts. In breeding of farm animals, trade-offs between farm performance and general fitness might be better avoided if deleterious mutations are well classified. In humans, the problem of such a precise classification has been successfully solved, in large part due to large databases of disease-causing mutations. However, this kind of information is very limited for other mammals. Here, we propose to better use information available on human mutations to enable classification of damaging mutations in other mammalian species. Specifically, we apply transfer learning, a family of machine learning methods that improves learning from a small dataset for a focal problem (recognizing damaging mutations in our companion and farm animals) by using the much larger datasets available for a related problem (recognizing damaging mutations in humans). We validate our tools using annotated mouse and dog datasets and obtain significantly better results than the SIFT classifier. Then, we apply them to predict deleterious mutations in a cattle genome-wide dataset.
format Online
Article
Text
id pubmed-6304693
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-6304693 2019-01-08 Prediction of deleterious mutations in coding regions of mammals with transfer learning Plekhanova, Elena Nuzhdin, Sergey V. Utkin, Lev V. Samsonova, Maria G. Evol Appl Special Issue Original Research Article The genomes of mammals contain thousands of deleterious mutations. It is important to be able to recognize them with high precision. In conservation biology, the small size of fragmented populations results in accumulation of damaging variants. Preserving animals with less damaged genomes could optimize conservation efforts. In breeding of farm animals, trade-offs between farm performance and general fitness might be better avoided if deleterious mutations are well classified. In humans, the problem of such a precise classification has been successfully solved, in large part due to large databases of disease-causing mutations. However, this kind of information is very limited for other mammals. Here, we propose to better use information available on human mutations to enable classification of damaging mutations in other mammalian species. Specifically, we apply transfer learning, a family of machine learning methods that improves learning from a small dataset for a focal problem (recognizing damaging mutations in our companion and farm animals) by using the much larger datasets available for a related problem (recognizing damaging mutations in humans). We validate our tools using annotated mouse and dog datasets and obtain significantly better results than the SIFT classifier. Then, we apply them to predict deleterious mutations in a cattle genome-wide dataset. John Wiley and Sons Inc. 2018-05-09 /pmc/articles/PMC6304693/ /pubmed/30622632 http://dx.doi.org/10.1111/eva.12607 Text en © 2018 The Authors. Evolutionary Applications published by John Wiley & Sons Ltd. This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title Prediction of deleterious mutations in coding regions of mammals with transfer learning
topic Special Issue Original Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6304693/
https://www.ncbi.nlm.nih.gov/pubmed/30622632
http://dx.doi.org/10.1111/eva.12607