PPA-GCN: A Efficient GCN Framework for Prokaryotic Pathways Assignment
With the rapid development of sequencing technology, completed genomes of microbes have explosively emerged. For a newly sequenced prokaryotic genome, gene functional annotation and metabolism pathway assignment are important foundations for all subsequent research work. However, the assignment rate...
Main authors: | Lu, Yuntao; Li, Qi; Li, Tao |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9013948/ https://www.ncbi.nlm.nih.gov/pubmed/35444686 http://dx.doi.org/10.3389/fgene.2022.839453 |
_version_ | 1784688109746651136 |
---|---|
author | Lu, Yuntao Li, Qi Li, Tao |
author_facet | Lu, Yuntao Li, Qi Li, Tao |
author_sort | Lu, Yuntao |
collection | PubMed |
description | With the rapid development of sequencing technology, completed genomes of microbes have explosively emerged. For a newly sequenced prokaryotic genome, gene functional annotation and metabolism pathway assignment are important foundations for all subsequent research work. However, the assignment rate for gene metabolism pathways is lower than 48% on the whole. It is even lower for newly sequenced prokaryotic genomes, which has become a bottleneck for subsequent research. Thus, the development of a high-precision metabolic pathway assignment framework is urgently needed. Here, we developed PPA-GCN, a prokaryotic pathways assignment framework based on a graph convolutional network, to assist functional pathway assignments using KEGG information and genomic characteristics. In the framework, genomic gene synteny information was used to construct a network, and ideas from self-supervised learning were used to enhance the framework’s learning ability. Our framework is applicable to microbial genera with sufficient whole genome sequences. To evaluate the assignment rate, genomes from three different genera (Flavobacterium (65 genomes), Pseudomonas (100 genomes) and Staphylococcus (500 genomes)) were used. The initial functional pathway assignment rates of the three test genera were 27.7% (Flavobacterium), 49.5% (Pseudomonas) and 30.1% (Staphylococcus). PPA-GCN achieved excellent assignment rates of 84.8% (Flavobacterium), 77.0% (Pseudomonas) and 71.0% (Staphylococcus). At the same time, PPA-GCN was shown to have strong fault tolerance. The framework provides novel insights into the assignment of metabolism pathways, is likely to inform future deep learning applications for interpreting functional annotations, and can be extended to all prokaryotic genera with sufficient genomes. |
format | Online Article Text |
id | pubmed-9013948 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9013948 2022-04-19 PPA-GCN: A Efficient GCN Framework for Prokaryotic Pathways Assignment Lu, Yuntao Li, Qi Li, Tao Front Genet Genetics With the rapid development of sequencing technology, completed genomes of microbes have explosively emerged. For a newly sequenced prokaryotic genome, gene functional annotation and metabolism pathway assignment are important foundations for all subsequent research work. However, the assignment rate for gene metabolism pathways is lower than 48% on the whole. It is even lower for newly sequenced prokaryotic genomes, which has become a bottleneck for subsequent research. Thus, the development of a high-precision metabolic pathway assignment framework is urgently needed. Here, we developed PPA-GCN, a prokaryotic pathways assignment framework based on a graph convolutional network, to assist functional pathway assignments using KEGG information and genomic characteristics. In the framework, genomic gene synteny information was used to construct a network, and ideas from self-supervised learning were used to enhance the framework’s learning ability. Our framework is applicable to microbial genera with sufficient whole genome sequences. To evaluate the assignment rate, genomes from three different genera (Flavobacterium (65 genomes), Pseudomonas (100 genomes) and Staphylococcus (500 genomes)) were used. The initial functional pathway assignment rates of the three test genera were 27.7% (Flavobacterium), 49.5% (Pseudomonas) and 30.1% (Staphylococcus). PPA-GCN achieved excellent assignment rates of 84.8% (Flavobacterium), 77.0% (Pseudomonas) and 71.0% (Staphylococcus). At the same time, PPA-GCN was shown to have strong fault tolerance. The framework provides novel insights into the assignment of metabolism pathways, is likely to inform future deep learning applications for interpreting functional annotations, and can be extended to all prokaryotic genera with sufficient genomes. Frontiers Media S.A.
2022-04-04 /pmc/articles/PMC9013948/ /pubmed/35444686 http://dx.doi.org/10.3389/fgene.2022.839453 Text en Copyright © 2022 Lu, Li and Li. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Lu, Yuntao Li, Qi Li, Tao PPA-GCN: A Efficient GCN Framework for Prokaryotic Pathways Assignment |
title | PPA-GCN: A Efficient GCN Framework for Prokaryotic Pathways Assignment |
title_full | PPA-GCN: A Efficient GCN Framework for Prokaryotic Pathways Assignment |
title_fullStr | PPA-GCN: A Efficient GCN Framework for Prokaryotic Pathways Assignment |
title_full_unstemmed | PPA-GCN: A Efficient GCN Framework for Prokaryotic Pathways Assignment |
title_short | PPA-GCN: A Efficient GCN Framework for Prokaryotic Pathways Assignment |
title_sort | ppa-gcn: a efficient gcn framework for prokaryotic pathways assignment |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9013948/ https://www.ncbi.nlm.nih.gov/pubmed/35444686 http://dx.doi.org/10.3389/fgene.2022.839453 |
work_keys_str_mv | AT luyuntao ppagcnaefficientgcnframeworkforprokaryoticpathwaysassignment AT liqi ppagcnaefficientgcnframeworkforprokaryoticpathwaysassignment AT litao ppagcnaefficientgcnframeworkforprokaryoticpathwaysassignment |