Using Machine Learning to Identify Biomarkers Affecting Fat Deposition in Pigs by Integrating Multisource Transcriptome Information

Fat deposition in pigs is not only closely related to pig production efficiency and pork quality but also an ideal model for human obesity. Transcriptome sequencing is widely used to study fat deposition. However, due to small sample sizes, high false positive rates, and poor consi...

Bibliographic Details
Main authors: Liu, Huatao; Xing, Kai; Jiang, Yifan; Liu, Yibing; Wang, Chuduan; Ding, Xiangdong
Format: Online Article Text
Language: English
Published: American Chemical Society 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9413214/
https://www.ncbi.nlm.nih.gov/pubmed/35953074
http://dx.doi.org/10.1021/acs.jafc.2c03339
author Liu, Huatao
Xing, Kai
Jiang, Yifan
Liu, Yibing
Wang, Chuduan
Ding, Xiangdong
author_sort Liu, Huatao
collection PubMed
description Fat deposition in pigs is not only closely related to pig production efficiency and pork quality but also an ideal model for human obesity. Transcriptome sequencing is widely used to study fat deposition. However, due to small sample sizes, high false positive rates, and poor consistency of results from different studies, new strategies are urgently needed. Machine learning, a new analysis method, can effectively fit complex data and accurately identify samples and genes. In this study, 36 samples of adipose tissue, muscle tissue, and liver tissue were collected from Songliao black pigs and Landrace pigs, and the mRNA of all the samples was sequenced. In addition, we collected transcriptome data for 64 samples in the GEO database from four different sources. After standardization and imputation of missing values in the data set comprising 100 samples, traditional differential expression analysis was carried out, and different numbers of expressed genes were selected as features for the training model of eight machine learning methods. In the 1000 replications of fourfold cross validation with 100 samples, AdaBoost performed best, with an average prediction accuracy greater than 93% and the highest mean area under the curve in predicting the high- and low-fat content groups among the eight ML methods. According to their performance-based ranks inferred by AdaBoost, 12 genes related to fat deposition were identified; among them, FASN and APOD were specifically expressed in adipose tissue, and APOA1 was specifically expressed in the liver, which could be important candidate biomarkers affecting fat deposition.
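The evaluation protocol named in the description (repeated fourfold cross validation on 100 samples split into high- and low-fat groups) can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, only 10 replications are run instead of 1000, and a simple nearest-centroid classifier stands in for AdaBoost and the other methods. All function names (stratified_folds, fit_centroids, predict, repeated_cv_accuracy, make_synthetic) are hypothetical and chosen for illustration only.

```python
import random
from statistics import mean


def stratified_folds(labels, k, rng):
    """Split sample indices into k folds with similar class proportions."""
    folds = [[] for _ in range(k)]
    by_class = {}
    for i, label in enumerate(labels):
        by_class.setdefault(label, []).append(i)
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin across the folds.
        for pos, i in enumerate(indices):
            folds[pos % k].append(i)
    return folds


def fit_centroids(X, y):
    """Stand-in classifier (NOT AdaBoost): per-class mean feature vector."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict(centroids, row):
    """Assign the class whose centroid is closest in squared distance."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(row, centroids[c]))
    return min(centroids, key=dist)


def repeated_cv_accuracy(X, y, k=4, repeats=10, seed=0):
    """Mean held-out accuracy over `repeats` independent k-fold splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        for test_idx in stratified_folds(y, k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(y)) if i not in held_out]
            model = fit_centroids([X[i] for i in train_idx],
                                  [y[i] for i in train_idx])
            hits = sum(predict(model, X[i]) == y[i] for i in test_idx)
            scores.append(hits / len(test_idx))
    return mean(scores)


def make_synthetic(n=100, n_features=5, shift=2.0, seed=1):
    """Synthetic stand-in for expression data: two groups with shifted means."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # 0 = low-fat group, 1 = high-fat group (assumed coding)
        X.append([rng.gauss(shift * label, 1.0) for _ in range(n_features)])
        y.append(label)
    return X, y


X, y = make_synthetic()
accuracy = repeated_cv_accuracy(X, y, k=4, repeats=10)
print(f"mean accuracy over 10 x 4-fold CV: {accuracy:.3f}")
```

Averaging over many reshuffled splits reduces the dependence of the estimate on any single partition, which matters with only 100 samples. Any feature selection (for example, picking differentially expressed genes) would need to be repeated inside each training fold to avoid leaking information from the held-out samples; the abstract does not say how the authors handled this.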
format Online
Article
Text
id pubmed-9413214
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-9413214 2022-08-27 J Agric Food Chem
American Chemical Society 2022-08-11 2022-08-24 /pmc/articles/PMC9413214/ /pubmed/35953074 http://dx.doi.org/10.1021/acs.jafc.2c03339 Text en © 2022 The Authors. Published by American Chemical Society https://creativecommons.org/licenses/by-nc-nd/4.0/ Permits non-commercial access and re-use, provided that author attribution and integrity are maintained; but does not permit creation of adaptations or other derivative works (https://creativecommons.org/licenses/by-nc-nd/4.0/).
title Using Machine Learning to Identify Biomarkers Affecting Fat Deposition in Pigs by Integrating Multisource Transcriptome Information
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9413214/
https://www.ncbi.nlm.nih.gov/pubmed/35953074
http://dx.doi.org/10.1021/acs.jafc.2c03339