PCycDB: a comprehensive and accurate database for fast analysis of phosphorus cycling genes

Bibliographic Details
Main authors: Zeng, Jiaxiong; Tu, Qichao; Yu, Xiaoli; Qian, Lu; Wang, Cheng; Shu, Longfei; Liu, Fei; Liu, Shengwei; Huang, Zhijian; He, Jianguo; Yan, Qingyun; He, Zhili
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9252087/
https://www.ncbi.nlm.nih.gov/pubmed/35787295
http://dx.doi.org/10.1186/s40168-022-01292-1
Description
Summary:
BACKGROUND: Phosphorus (P) is one of the most essential macronutrients on the planet, and microorganisms (including bacteria and archaea) play a key role in P cycling in all living things and ecosystems. However, a comprehensive understanding of key P cycling genes (PCGs) and P cycling microorganisms (PCMs), as well as their ecological functions, remains elusive even with the rapid advancement of metagenome sequencing technologies. One of the major challenges is the lack of a comprehensive and accurately annotated P cycling functional gene database.
RESULTS: In this study, we constructed a well-curated P cycling database (PCycDB) covering 139 gene families and 10 P metabolic processes, including several previously ignored PCGs such as pafA, which encodes a phosphate-insensitive phosphatase, ptxABCD (phosphite-related genes), and the novel aepXVWPS genes for 2-aminoethylphosphonate transporters. For simulated gene datasets, we achieved an annotation accuracy, positive predictive value (PPV), sensitivity, specificity, and negative predictive value (NPV) of 99.8%, 96.1%, 99.9%, 99.8%, and 99.9%, respectively. Compared with other orthology databases, PCycDB is more accurate, more comprehensive, and faster at profiling PCGs. We used PCycDB to analyze P cycling microbial communities from representative natural and engineered environments and showed that PCycDB is applicable to different environments.
CONCLUSIONS: We demonstrate that PCycDB is a powerful tool for advancing our understanding of microbially driven P cycling in the environment, offering high coverage, high accuracy, and rapid analysis of metagenome sequencing data. PCycDB is available at https://github.com/ZengJiaxiong/Phosphorus-cycling-database.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s40168-022-01292-1.
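
The five annotation metrics named in the summary (accuracy, PPV, sensitivity, specificity, NPV) are the standard confusion-matrix ratios. The following is a minimal sketch of how such values are computed in general, not the authors' evaluation code. The `ConfusionCounts` type, the `annotation_metrics` function, and the example counts are all hypothetical and chosen only for illustration; they are not taken from the PCycDB study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical tallies from comparing predicted annotations with a known truth set."""
    tp: int  # sequences correctly assigned to a P cycling gene family
    fp: int  # sequences wrongly assigned to a P cycling gene family
    tn: int  # non-target sequences correctly left unassigned
    fn: int  # true P cycling sequences that were missed


def annotation_metrics(c: ConfusionCounts) -> dict:
    """Return the five standard ratios as fractions between 0 and 1."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": (c.tp + c.tn) / total,
        "ppv": c.tp / (c.tp + c.fp),          # positive predictive value
        "sensitivity": c.tp / (c.tp + c.fn),  # true positive rate
        "specificity": c.tn / (c.tn + c.fp),  # true negative rate
        "npv": c.tn / (c.tn + c.fn),          # negative predictive value
    }


# Illustrative counts only (assumed, not from the paper).
counts = ConfusionCounts(tp=90, fp=10, tn=880, fn=20)
metrics = annotation_metrics(counts)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Because PPV is the only one of these ratios whose denominator is the set of predicted positives, it is the one most affected by false positives when true P cycling sequences are a small share of the dataset.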