Using Machine Learning to Identify Biomarkers Affecting Fat Deposition in Pigs by Integrating Multisource Transcriptome Information
Fat deposition in pigs is not only closely related to pig production efficiency and pork quality but also an ideal model for human obesity. Transcriptome sequencing is widely used to study fat deposition. However, due to small sample sizes, high false positive rates, and poor consi...
Main Authors: | Liu, Huatao; Xing, Kai; Jiang, Yifan; Liu, Yibing; Wang, Chuduan; Ding, Xiangdong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Chemical Society 2022 |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9413214/ https://www.ncbi.nlm.nih.gov/pubmed/35953074 http://dx.doi.org/10.1021/acs.jafc.2c03339 |
_version_ | 1784775687365722112 |
---|---|
author | Liu, Huatao; Xing, Kai; Jiang, Yifan; Liu, Yibing; Wang, Chuduan; Ding, Xiangdong |
author_facet | Liu, Huatao; Xing, Kai; Jiang, Yifan; Liu, Yibing; Wang, Chuduan; Ding, Xiangdong |
author_sort | Liu, Huatao |
collection | PubMed |
description | Fat deposition in pigs is not only closely related to pig production efficiency and pork quality but also an ideal model for human obesity. Transcriptome sequencing is widely used to study fat deposition. However, due to small sample sizes, high false positive rates, and poor consistency of results from different studies, new strategies are urgently needed. Machine learning, a new analysis method, can effectively fit complex data and accurately identify samples and genes. In this study, 36 samples of adipose tissue, muscle tissue, and liver tissue were collected from Songliao black pigs and Landrace pigs, and the mRNA of all the samples was sequenced. In addition, we collected transcriptome data for 64 samples in the GEO database from four different sources. After standardization and imputation of missing values in the data set comprising 100 samples, traditional differential expression analysis was carried out, and different numbers of expressed genes were selected as features for the training model of eight machine learning methods. In the 1000 replications of fourfold cross validation with 100 samples, AdaBoost performed best, with an average prediction accuracy greater than 93% and the highest mean area under the curve in predicting the high- and low-fat content groups among the eight ML methods. According to their performance-based ranks inferred by AdaBoost, 12 genes related to fat deposition were identified; among them, FASN and APOD were specifically expressed in adipose tissue, and APOA1 was specifically expressed in the liver, which could be important candidate biomarkers affecting fat deposition. |
format | Online Article Text |
id | pubmed-9413214 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | American Chemical Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-9413214 2022-08-27 Using Machine Learning to Identify Biomarkers Affecting Fat Deposition in Pigs by Integrating Multisource Transcriptome Information Liu, Huatao Xing, Kai Jiang, Yifan Liu, Yibing Wang, Chuduan Ding, Xiangdong J Agric Food Chem American Chemical Society 2022-08-11 2022-08-24 /pmc/articles/PMC9413214/ /pubmed/35953074 http://dx.doi.org/10.1021/acs.jafc.2c03339 Text en © 2022 The Authors. Published by American Chemical Society https://creativecommons.org/licenses/by-nc-nd/4.0/ Permits non-commercial access and re-use, provided that author attribution and integrity are maintained; but does not permit creation of adaptations or other derivative works (https://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Liu, Huatao Xing, Kai Jiang, Yifan Liu, Yibing Wang, Chuduan Ding, Xiangdong Using Machine Learning to Identify Biomarkers Affecting Fat Deposition in Pigs by Integrating Multisource Transcriptome Information |
title | Using Machine Learning to Identify Biomarkers Affecting Fat Deposition in Pigs by Integrating Multisource Transcriptome Information |
title_full | Using Machine Learning to Identify Biomarkers Affecting Fat Deposition in Pigs by Integrating Multisource Transcriptome Information |
title_fullStr | Using Machine Learning to Identify Biomarkers Affecting Fat Deposition in Pigs by Integrating Multisource Transcriptome Information |
title_full_unstemmed | Using Machine Learning to Identify Biomarkers Affecting Fat Deposition in Pigs by Integrating Multisource Transcriptome Information |
title_short | Using Machine Learning to Identify Biomarkers Affecting Fat Deposition in Pigs by Integrating Multisource Transcriptome Information |
title_sort | using machine learning to identify biomarkers affecting fat deposition in pigs by integrating multisource transcriptome information |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9413214/ https://www.ncbi.nlm.nih.gov/pubmed/35953074 http://dx.doi.org/10.1021/acs.jafc.2c03339 |
work_keys_str_mv | AT liuhuatao usingmachinelearningtoidentifybiomarkersaffectingfatdepositioninpigsbyintegratingmultisourcetranscriptomeinformation AT xingkai usingmachinelearningtoidentifybiomarkersaffectingfatdepositioninpigsbyintegratingmultisourcetranscriptomeinformation AT jiangyifan usingmachinelearningtoidentifybiomarkersaffectingfatdepositioninpigsbyintegratingmultisourcetranscriptomeinformation AT liuyibing usingmachinelearningtoidentifybiomarkersaffectingfatdepositioninpigsbyintegratingmultisourcetranscriptomeinformation AT wangchuduan usingmachinelearningtoidentifybiomarkersaffectingfatdepositioninpigsbyintegratingmultisourcetranscriptomeinformation AT dingxiangdong usingmachinelearningtoidentifybiomarkersaffectingfatdepositioninpigsbyintegratingmultisourcetranscriptomeinformation |
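
The description above mentions repeated fourfold cross validation with AdaBoost and a performance-based ranking of genes. The following is only a minimal sketch of that general idea, not the authors' pipeline: it uses a plain-Python discrete AdaBoost with decision stumps, a synthetic toy data set in place of expression data, far fewer replications than the 1000 reported, and a ranking by summed stump weights, which is an assumption about how such a ranking could be derived. All function names (`fit_stump`, `adaboost_fit`, `repeated_kfold_accuracy`, `rank_features`, `make_toy_data`) are hypothetical.

```python
import math
import random


def stump_predict(row, feature, threshold, polarity):
    """Predict +1/-1 from one feature compared against a threshold."""
    return polarity if row[feature] > threshold else -polarity


def fit_stump(X, y, weights):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = (float("inf"), 0, 0.0, 1)
    for feature in range(len(X[0])):
        values = sorted({row[feature] for row in X})
        # Candidate thresholds are midpoints between neighbouring values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2.0
            for polarity in (1, -1):
                error = sum(
                    w for row, label, w in zip(X, y, weights)
                    if stump_predict(row, feature, threshold, polarity) != label
                )
                if error < best[0]:
                    best = (error, feature, threshold, polarity)
    return best


def adaboost_fit(X, y, rounds=5):
    """Discrete AdaBoost on labels in {-1, +1}; returns weighted stumps."""
    n = len(X)
    weights = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        error, feature, threshold, polarity = fit_stump(X, y, weights)
        error = min(max(error, 1e-10), 1.0 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1.0 - error) / error)
        model.append((alpha, feature, threshold, polarity))
        # Up-weight misclassified samples, then renormalise.
        weights = [
            w * math.exp(-alpha * label * stump_predict(row, feature, threshold, polarity))
            for row, label, w in zip(X, y, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def adaboost_predict(model, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(row, f, t, p) for a, f, t, p in model)
    return 1 if score >= 0 else -1


def repeated_kfold_accuracy(X, y, k=4, repeats=3, rounds=5, seed=0):
    """Mean test accuracy over `repeats` random k-fold splits."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    accuracies = []
    for _ in range(repeats):
        rng.shuffle(indices)
        for fold in range(k):
            test = indices[fold::k]
            held_out = set(test)
            train = [i for i in indices if i not in held_out]
            model = adaboost_fit([X[i] for i in train], [y[i] for i in train], rounds)
            hits = sum(adaboost_predict(model, X[i]) == y[i] for i in test)
            accuracies.append(hits / len(test))
    return sum(accuracies) / len(accuracies)


def rank_features(model, n_features):
    """Feature indices ordered by summed stump weight (an assumed ranking rule)."""
    importance = [0.0] * n_features
    for alpha, feature, _, _ in model:
        importance[feature] += alpha
    return sorted(range(n_features), key=lambda j: -importance[j])


def make_toy_data(n=100, n_features=5, seed=1):
    """Synthetic stand-in for expression data: only feature 0 carries signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] = label * (1.0 + rng.random())
        X.append(row)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    print("mean accuracy:", repeated_kfold_accuracy(X, y))
    print("feature ranking:", rank_features(adaboost_fit(X, y), len(X[0])))
```

On real expression data one would more likely rely on an established library implementation and stratified splits; the stdlib-only version here is chosen so that the sketch stays self-contained.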