
PrESOgenesis: A two-layer multi-label predictor for identifying fertility-related proteins using support vector machine and pseudo amino acid composition approach

Successful spermatogenesis and oogenesis are the two genetically independent processes preceding embryo development. To date, several fertility-related proteins have been described in mammalian species. Nevertheless, further studies are required to discover more proteins associated with the developm...


Bibliographic Details
Main Authors: Bakhtiarizadeh, Mohammad Reza, Rahimi, Maryam, Mohammadi-Sangcheshmeh, Abdollah, Shariati J, Vahid, Salami, Seyed Alireza
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5998058/
https://www.ncbi.nlm.nih.gov/pubmed/29899414
http://dx.doi.org/10.1038/s41598-018-27338-9
author Bakhtiarizadeh, Mohammad Reza
Rahimi, Maryam
Mohammadi-Sangcheshmeh, Abdollah
Shariati J, Vahid
Salami, Seyed Alireza
collection PubMed
description Successful spermatogenesis and oogenesis are the two genetically independent processes preceding embryo development. To date, several fertility-related proteins have been described in mammalian species. Nevertheless, further studies are required to discover more proteins associated with the development of germ cells and embryogenesis in order to shed more light on the processes. This work builds on our previous software (OOgenesis_Pred), mainly focusing on algorithms beyond what was previously done, in particular new fertility-related proteins and their classes (embryogenesis, spermatogenesis and oogenesis) based on the support vector machine according to the concept of Chou’s pseudo-amino acid composition features. The results of five-fold cross validation, as well as the independent test demonstrated that this method is capable of predicting the fertility-related proteins and their classes with accuracy of more than 80%. Moreover, by using feature selection methods, important properties of fertility-related proteins were identified that allowed for their accurate classification. Based on the proposed method, a two-layer classifier software, named as “PrESOgenesis” (https://github.com/mrb20045/PrESOgenesis) was developed. The tool identified a query sequence (protein or transcript) as fertility or non-fertility-related protein at the first layer and then classified the predicted fertility-related protein into different classes of embryogenesis, spermatogenesis or oogenesis at the second layer.
format Online
Article
Text
id pubmed-5998058
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-5998058 2018-06-21 Sci Rep Article
Nature Publishing Group UK 2018-06-13 /pmc/articles/PMC5998058/ /pubmed/29899414 http://dx.doi.org/10.1038/s41598-018-27338-9 Text en © The Author(s) 2018 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title PrESOgenesis: A two-layer multi-label predictor for identifying fertility-related proteins using support vector machine and pseudo amino acid composition approach
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5998058/
https://www.ncbi.nlm.nih.gov/pubmed/29899414
http://dx.doi.org/10.1038/s41598-018-27338-9
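
The two-layer scheme summarized in the description field can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PrESOgenesis implementation: the function names (`amino_acid_composition`, `two_layer_predict`) and the `layer1`/`layer2` callables are hypothetical stand-ins for trained support vector machines, and the feature vector covers only the plain 20-residue composition, whereas Chou's pseudo amino acid composition also adds sequence-order correlation terms.

```python
from collections import Counter
from typing import Callable, List

# The 20 standard amino acids, in a fixed order for the feature vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq: str) -> List[float]:
    """Return the fraction of each standard residue in seq.

    Assumption: this is only the composition part of a PseAAC-style
    vector; sequence-order correlation factors are omitted.
    """
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def two_layer_predict(
    seq: str,
    layer1: Callable[[List[float]], bool],
    layer2: Callable[[List[float]], List[str]],
) -> List[str]:
    """Hypothetical two-layer dispatch.

    layer1 decides fertility-related vs. not; layer2 assigns one or more
    classes (embryogenesis, spermatogenesis, oogenesis). Both callables
    are placeholders for trained classifiers.
    """
    features = amino_acid_composition(seq)
    if not layer1(features):
        return ["non-fertility-related"]
    return layer2(features)
```

The second layer is only consulted for sequences the first layer accepts, which mirrors the order of decisions given in the abstract; returning a list of labels is an assumption made here to reflect the "multi-label" wording of the title.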