Machine learning applied to transcriptomic data to identify genes associated with feed efficiency in pigs

BACKGROUND: To date, the molecular mechanisms that underlie residual feed intake (RFI) in pigs are unknown. Results from different genome-wide association studies and gene expression analyses are not always consistent. The aim of this research was to use machine learning to identify genes associated...

Bibliographic Details
Main authors: Piles, Miriam, Fernandez-Lozano, Carlos, Velasco-Galilea, María, González-Rodríguez, Olga, Sánchez, Juan Pablo, Torrallardona, David, Ballester, Maria, Quintanilla, Raquel
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6417084/
https://www.ncbi.nlm.nih.gov/pubmed/30866799
http://dx.doi.org/10.1186/s12711-019-0453-y
author Piles, Miriam
Fernandez-Lozano, Carlos
Velasco-Galilea, María
González-Rodríguez, Olga
Sánchez, Juan Pablo
Torrallardona, David
Ballester, Maria
Quintanilla, Raquel
author_sort Piles, Miriam
collection PubMed
description BACKGROUND: To date, the molecular mechanisms that underlie residual feed intake (RFI) in pigs are unknown. Results from different genome-wide association studies and gene expression analyses are not always consistent. The aim of this research was to use machine learning to identify genes associated with feed efficiency (FE) using transcriptomic (RNA-Seq) data from pigs that are phenotypically extreme for RFI. METHODS: RFI was computed by considering within-sex regression on mean metabolic body weight, average daily gain, and average backfat gain. RNA-Seq analyses were performed on liver and duodenum tissue from 32 high and 33 low RFI pigs collected at 153 d of age. Machine-learning algorithms were used to predict RFI class based on gene expression levels in liver and duodenum after adjusting for batch effects. Genes were ranked according to their contribution to the classification using the permutation accuracy importance score in an unbiased random forest (RF) algorithm based on conditional inference. Support vector machine, RF, elastic net (ENET) and nearest shrunken centroid algorithms were tested using different subsets of the top rank genes. Nested resampling for hyperparameter tuning was implemented with tenfold cross-validation in the outer and inner loops. RESULTS: The best classification was obtained with ENET using the expression of 200 genes in liver [area under the receiver operating characteristic curve (AUROC): 0.85; accuracy: 0.78] and 100 genes in duodenum (AUROC: 0.76; accuracy: 0.69). Canonical pathways and candidate genes that were previously reported as associated with FE in several species were identified. 
The most remarkable pathways and genes identified were NRF2-mediated oxidative stress response and aldosterone signalling in epithelial cells, the DNAJC6, DNAJC1, MAPK8, PRKD3 genes in duodenum, and melatonin degradation II, PPARα/RXRα activation, and GPCR-mediated nutrient sensing in enteroendocrine cells and SMOX, IL4I1, PRKAR2B, CLOCK and CCK genes in liver. CONCLUSIONS: ML algorithms and RNA-Seq expression data were found to provide good performance for classifying pigs into high or low RFI groups. Classification was better with gene expression data from liver than from duodenum. Genes associated with FE in liver and duodenum tissue that can be used as predictive biomarkers for this trait were identified. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12711-019-0453-y) contains supplementary material, which is available to authorized users.
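The nested resampling described in the METHODS part of the abstract (tenfold cross-validation in both the outer and inner loops) can be pictured with a minimal index-only sketch. It is not the authors' pipeline: the function names k_fold_indices and nested_cv_splits are hypothetical, the folds are plain shuffled splits without stratification by RFI class, and no model is fitted. The only figure taken from the record is the sample size (32 high plus 33 low RFI pigs).

```python
import random


def k_fold_indices(indices, k, seed):
    """Shuffle `indices` and split them into k folds of near-equal size."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_cv_splits(n_samples, outer_k=10, inner_k=10, seed=0):
    """Yield (outer_train, outer_test, inner_splits) for nested resampling.

    inner_splits is a list of (inner_train, inner_valid) pairs drawn only
    from outer_train, so the outer test fold is never used during tuning.
    """
    outer_folds = k_fold_indices(range(n_samples), outer_k, seed)
    for i, outer_test in enumerate(outer_folds):
        # Everything outside the current outer test fold is available for tuning.
        outer_train = [idx for j, fold in enumerate(outer_folds)
                       if j != i for idx in fold]
        inner_folds = k_fold_indices(outer_train, inner_k, seed + 1 + i)
        inner_splits = []
        for m, inner_valid in enumerate(inner_folds):
            inner_train = [idx for j, fold in enumerate(inner_folds)
                           if j != m for idx in fold]
            inner_splits.append((inner_train, inner_valid))
        yield outer_train, outer_test, inner_splits


N_PIGS = 32 + 33  # high and low RFI group sizes stated in the abstract
splits = list(nested_cv_splits(N_PIGS))
```

In a full analysis, hyperparameters would be chosen on the inner splits and the tuned model scored once on each outer test fold; keeping the outer test animals out of every inner split is what prevents the reported AUROC and accuracy from being optimistically biased by tuning.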
format Online
Article
Text
id pubmed-6417084
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6417084 2019-03-25 Machine learning applied to transcriptomic data to identify genes associated with feed efficiency in pigs Genet Sel Evol Research Article BioMed Central 2019-03-13 /pmc/articles/PMC6417084/ /pubmed/30866799 http://dx.doi.org/10.1186/s12711-019-0453-y Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Machine learning applied to transcriptomic data to identify genes associated with feed efficiency in pigs
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6417084/
https://www.ncbi.nlm.nih.gov/pubmed/30866799
http://dx.doi.org/10.1186/s12711-019-0453-y