The C-Score: A Bayesian Framework to Sharply Improve Proteoform Scoring in High-Throughput Top Down Proteomics
The automated processing of data generated by top down proteomics would benefit from improved scoring for protein identification and characterization of highly related protein forms (proteoforms). Here we propose the “C-score” (short for Characterization Score), a Bayesian approach...
Main authors: | LeDuc, Richard D.; Fellers, Ryan T.; Early, Bryan P.; Greer, Joseph B.; Thomas, Paul M.; Kelleher, Neil L. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Chemical Society 2014 |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4084843/ https://www.ncbi.nlm.nih.gov/pubmed/24922115 http://dx.doi.org/10.1021/pr401277r |
_version_ | 1782324574482857984 |
---|---|
author | LeDuc, Richard D.; Fellers, Ryan T.; Early, Bryan P.; Greer, Joseph B.; Thomas, Paul M.; Kelleher, Neil L. |
author_facet | LeDuc, Richard D. Fellers, Ryan T. Early, Bryan P. Greer, Joseph B. Thomas, Paul M. Kelleher, Neil L. |
author_sort | LeDuc, Richard D. |
collection | PubMed |
description | The automated processing of data generated by top down proteomics would benefit from improved scoring for protein identification and characterization of highly related protein forms (proteoforms). Here we propose the “C-score” (short for Characterization Score), a Bayesian approach to the proteoform identification and characterization problem, implemented within a framework to allow the infusion of expert knowledge into generative models that take advantage of known properties of proteins and top down analytical systems (e.g., fragmentation propensities, “off-by-1 Da” discontinuous errors, and intelligent weighting for site-specific modifications). The performance of the scoring system based on the initial generative models was compared to the current probability-based scoring system used within both ProSightPC and ProSightPTM on a manually curated set of 295 human proteoforms. The current implementation of the C-score framework generated a marked improvement over the existing scoring system as measured by the area under the curve on the resulting ROC chart (AUC of 0.99 versus 0.78). |
format | Online Article Text |
id | pubmed-4084843 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | American Chemical Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-4084843 2015-06-12 The C-Score: A Bayesian Framework to Sharply Improve Proteoform Scoring in High-Throughput Top Down Proteomics LeDuc, Richard D. Fellers, Ryan T. Early, Bryan P. Greer, Joseph B. Thomas, Paul M. Kelleher, Neil L. J Proteome Res The automated processing of data generated by top down proteomics would benefit from improved scoring for protein identification and characterization of highly related protein forms (proteoforms). Here we propose the “C-score” (short for Characterization Score), a Bayesian approach to the proteoform identification and characterization problem, implemented within a framework to allow the infusion of expert knowledge into generative models that take advantage of known properties of proteins and top down analytical systems (e.g., fragmentation propensities, “off-by-1 Da” discontinuous errors, and intelligent weighting for site-specific modifications). The performance of the scoring system based on the initial generative models was compared to the current probability-based scoring system used within both ProSightPC and ProSightPTM on a manually curated set of 295 human proteoforms. The current implementation of the C-score framework generated a marked improvement over the existing scoring system as measured by the area under the curve on the resulting ROC chart (AUC of 0.99 versus 0.78). American Chemical Society 2014-06-12 2014-07-03 /pmc/articles/PMC4084843/ /pubmed/24922115 http://dx.doi.org/10.1021/pr401277r Text en Copyright © 2014 American Chemical Society Terms of Use (http://pubs.acs.org/page/policy/authorchoice_termsofuse.html) |
spellingShingle | LeDuc, Richard D. Fellers, Ryan T. Early, Bryan P. Greer, Joseph B. Thomas, Paul M. Kelleher, Neil L. The C-Score: A Bayesian Framework to Sharply Improve Proteoform Scoring in High-Throughput Top Down Proteomics |
title | The C-Score: A Bayesian Framework to Sharply Improve Proteoform Scoring in High-Throughput Top Down Proteomics |
title_sort | c-score: a bayesian framework to sharply improve proteoform scoring in high-throughput top down proteomics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4084843/ https://www.ncbi.nlm.nih.gov/pubmed/24922115 http://dx.doi.org/10.1021/pr401277r |