CCPRD: A Novel Analytical Framework for the Comprehensive Proteomic Reference Database Construction of NonModel Organisms
Protein reference databases are a critical part of producing efficient proteomic analyses. However, the method for constructing clean, efficient, and comprehensive protein reference databases of nonmodel organisms is lacking. Existing methods either do not have contamination contro...
Main Authors: | Guo, Qingxiang; Li, Dan; Zhai, Yanhua; Gu, Zemao |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Chemical Society 2020 |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7331046/ https://www.ncbi.nlm.nih.gov/pubmed/32637811 http://dx.doi.org/10.1021/acsomega.0c01278 |
_version_ | 1783553241617793024 |
---|---|
author | Guo, Qingxiang Li, Dan Zhai, Yanhua Gu, Zemao |
author_facet | Guo, Qingxiang Li, Dan Zhai, Yanhua Gu, Zemao |
author_sort | Guo, Qingxiang |
collection | PubMed |
description | Protein reference databases are a critical part of producing efficient proteomic analyses. However, the method for constructing clean, efficient, and comprehensive protein reference databases of nonmodel organisms is lacking. Existing methods either do not have contamination control procedures, or these methods rely on a three-frame and/or six-frame translation that sharply increases the search space and the need for computational resources. Herein, we propose a framework for constructing a customized comprehensive proteomic reference database (CCPRD) from draft genomes and deep sequencing transcriptomes. Its effectiveness is demonstrated by incorporating the proteomes of nematocysts from endoparasitic cnidarian: myxozoans. By applying customized contamination removal procedures, contaminations in omic data were successfully identified and removed. This is an effective method that does not result in overdecontamination. This can be shown by comparing the CCPRD MS results with an artificially contaminated database and another database with removed contaminations in genomes and transcriptomes added back. CCPRD outperformed traditional frame-based methods by identifying 35.2–50.7% more peptides and 35.8–43.8% more proteins, with a maximum of 84.6% in size reduction. A BUSCO analysis showed that the CCPRD maintained a relatively high level of completeness compared to traditional methods. These results confirm the superiority of the CCPRD over existing methods in peptide and protein identification numbers, database size, and completeness. By providing a general framework for generating the reference database, the CCPRD, which does not need a high-quality genome, can potentially be applied to nonmodel organisms and significantly contribute to proteomic research. |
format | Online Article Text |
id | pubmed-7331046 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | American Chemical Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-7331046 2020-07-06 CCPRD: A Novel Analytical Framework for the Comprehensive Proteomic Reference Database Construction of NonModel Organisms Guo, Qingxiang Li, Dan Zhai, Yanhua Gu, Zemao ACS Omega Protein reference databases are a critical part of producing efficient proteomic analyses. However, the method for constructing clean, efficient, and comprehensive protein reference databases of nonmodel organisms is lacking. Existing methods either do not have contamination control procedures, or these methods rely on a three-frame and/or six-frame translation that sharply increases the search space and the need for computational resources. Herein, we propose a framework for constructing a customized comprehensive proteomic reference database (CCPRD) from draft genomes and deep sequencing transcriptomes. Its effectiveness is demonstrated by incorporating the proteomes of nematocysts from endoparasitic cnidarian: myxozoans. By applying customized contamination removal procedures, contaminations in omic data were successfully identified and removed. This is an effective method that does not result in overdecontamination. This can be shown by comparing the CCPRD MS results with an artificially contaminated database and another database with removed contaminations in genomes and transcriptomes added back. CCPRD outperformed traditional frame-based methods by identifying 35.2–50.7% more peptides and 35.8–43.8% more proteins, with a maximum of 84.6% in size reduction. A BUSCO analysis showed that the CCPRD maintained a relatively high level of completeness compared to traditional methods. These results confirm the superiority of the CCPRD over existing methods in peptide and protein identification numbers, database size, and completeness. By providing a general framework for generating the reference database, the CCPRD, which does not need a high-quality genome, can potentially be applied to nonmodel organisms and significantly contribute to proteomic research. American Chemical Society 2020-06-17 /pmc/articles/PMC7331046/ /pubmed/32637811 http://dx.doi.org/10.1021/acsomega.0c01278 Text en Copyright © 2020 American Chemical Society This is an open access article published under an ACS AuthorChoice License (http://pubs.acs.org/page/policy/authorchoice_termsofuse.html), which permits copying and redistribution of the article or any adaptations for non-commercial purposes. |
spellingShingle | Guo, Qingxiang Li, Dan Zhai, Yanhua Gu, Zemao CCPRD: A Novel Analytical Framework for the Comprehensive Proteomic Reference Database Construction of NonModel Organisms |
title | CCPRD: A Novel Analytical Framework for the Comprehensive Proteomic Reference Database Construction of NonModel Organisms |
title_full | CCPRD: A Novel Analytical Framework for the Comprehensive Proteomic Reference Database Construction of NonModel Organisms |
title_fullStr | CCPRD: A Novel Analytical Framework for the Comprehensive Proteomic Reference Database Construction of NonModel Organisms |
title_full_unstemmed | CCPRD: A Novel Analytical Framework for the Comprehensive Proteomic Reference Database Construction of NonModel Organisms |
title_short | CCPRD: A Novel Analytical Framework for the Comprehensive Proteomic Reference Database Construction of NonModel Organisms |
title_sort | ccprd: a novel analytical framework for the comprehensive proteomic reference database construction of nonmodel organisms |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7331046/ https://www.ncbi.nlm.nih.gov/pubmed/32637811 http://dx.doi.org/10.1021/acsomega.0c01278 |
work_keys_str_mv | AT guoqingxiang ccprdanovelanalyticalframeworkforthecomprehensiveproteomicreferencedatabaseconstructionofnonmodelorganisms AT lidan ccprdanovelanalyticalframeworkforthecomprehensiveproteomicreferencedatabaseconstructionofnonmodelorganisms AT zhaiyanhua ccprdanovelanalyticalframeworkforthecomprehensiveproteomicreferencedatabaseconstructionofnonmodelorganisms AT guzemao ccprdanovelanalyticalframeworkforthecomprehensiveproteomicreferencedatabaseconstructionofnonmodelorganisms |
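
The description in this record notes that three-frame and six-frame translation sharply increases the search space. As a rough illustration of why, the sketch below translates one nucleotide sequence in all six reading frames (three offsets on each strand), so every input sequence yields six candidate protein strings. It is a minimal sketch and not the CCPRD pipeline: the function names (`reverse_complement`, `translate`, `six_frame`) are hypothetical, the standard genetic code is assumed, and codons containing unknown bases are rendered as `X`.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from position 0; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame(dna: str) -> list[str]:
    """Translate both strands at offsets 0, 1 and 2 (illustrative helper)."""
    strands = (dna.upper(), reverse_complement(dna))
    return [translate(s[offset:]) for s in strands for offset in range(3)]


if __name__ == "__main__":
    for i, protein in enumerate(six_frame("ATGGCCTAA")):
        print(f"frame {i}: {protein}")
```

Because each contig or transcript contributes six translations rather than one predicted protein, a frame-based database grows roughly sixfold before any filtering, which is the cost the description contrasts with the CCPRD approach.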