Using Deep Learning to Extrapolate Protein Expression Measurements
Mass spectrometry (MS)‐based quantitative proteomics experiments typically assay a subset of up to 60% of the ≈20 000 human protein coding genes. Computational methods for imputing the missing values using RNA expression data usually allow only for imputations of proteins measured in at least some o...
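Main authors: | Barzine, Mitra Parissa; Freivalds, Karlis; Wright, James C.; Opmanis, Mārtiņš; Rituma, Darta; Ghavidel, Fatemeh Zamanzad; Jarnuczak, Andrew F.; Celms, Edgars; Čerāns, Kārlis; Jonassen, Inge; Lace, Lelde; Antonio Vizcaíno, Juan; Choudhary, Jyoti Sharma; Brazma, Alvis; Viksna, Juris |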
---|---|
Format: | Online Article Text |
Language: | English |
Published: | John Wiley and Sons Inc., 2020 |
Subjects: | Research Articles |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7757209/ https://www.ncbi.nlm.nih.gov/pubmed/32937025 http://dx.doi.org/10.1002/pmic.202000009 |
_version_ | 1783626701776879616 |
---|---|
author | Barzine, Mitra Parissa Freivalds, Karlis Wright, James C. Opmanis, Mārtiņš Rituma, Darta Ghavidel, Fatemeh Zamanzad Jarnuczak, Andrew F. Celms, Edgars Čerāns, Kārlis Jonassen, Inge Lace, Lelde Antonio Vizcaíno, Juan Choudhary, Jyoti Sharma Brazma, Alvis Viksna, Juris |
author_facet | Barzine, Mitra Parissa Freivalds, Karlis Wright, James C. Opmanis, Mārtiņš Rituma, Darta Ghavidel, Fatemeh Zamanzad Jarnuczak, Andrew F. Celms, Edgars Čerāns, Kārlis Jonassen, Inge Lace, Lelde Antonio Vizcaíno, Juan Choudhary, Jyoti Sharma Brazma, Alvis Viksna, Juris |
author_sort | Barzine, Mitra Parissa |
collection | PubMed |
description | Mass spectrometry (MS)‐based quantitative proteomics experiments typically assay a subset of up to 60% of the ≈20 000 human protein coding genes. Computational methods for imputing the missing values using RNA expression data usually allow only for imputations of proteins measured in at least some of the samples. In silico methods for comprehensively estimating abundances across all proteins are still missing. Here, a novel method is proposed using deep learning to extrapolate the observed protein expression values in label‐free MS experiments to all proteins, leveraging gene functional annotations and RNA measurements as key predictive attributes. This method is tested on four datasets, including human cell lines and human and mouse tissues. This method predicts the protein expression values with average [Formula: see text] scores between 0.46 and 0.54, which is significantly better than predictions based on correlations using the RNA expression data alone. Moreover, it is demonstrated that the derived models can be “transferred” across experiments and species. For instance, the model derived from human tissues gave a [Formula: see text] when applied to mouse tissue data. It is concluded that protein abundances generated in label‐free MS experiments can be computationally predicted using functional annotated attributes and can be used to highlight aberrant protein abundance values. |
format | Online Article Text |
id | pubmed-7757209 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | John Wiley and Sons Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-77572092020-12-28 Using Deep Learning to Extrapolate Protein Expression Measurements Barzine, Mitra Parissa Freivalds, Karlis Wright, James C. Opmanis, Mārtiņš Rituma, Darta Ghavidel, Fatemeh Zamanzad Jarnuczak, Andrew F. Celms, Edgars Čerāns, Kārlis Jonassen, Inge Lace, Lelde Antonio Vizcaíno, Juan Choudhary, Jyoti Sharma Brazma, Alvis Viksna, Juris Proteomics Research Articles Mass spectrometry (MS)‐based quantitative proteomics experiments typically assay a subset of up to 60% of the ≈20 000 human protein coding genes. Computational methods for imputing the missing values using RNA expression data usually allow only for imputations of proteins measured in at least some of the samples. In silico methods for comprehensively estimating abundances across all proteins are still missing. Here, a novel method is proposed using deep learning to extrapolate the observed protein expression values in label‐free MS experiments to all proteins, leveraging gene functional annotations and RNA measurements as key predictive attributes. This method is tested on four datasets, including human cell lines and human and mouse tissues. This method predicts the protein expression values with average [Formula: see text] scores between 0.46 and 0.54, which is significantly better than predictions based on correlations using the RNA expression data alone. Moreover, it is demonstrated that the derived models can be “transferred” across experiments and species. For instance, the model derived from human tissues gave a [Formula: see text] when applied to mouse tissue data. It is concluded that protein abundances generated in label‐free MS experiments can be computationally predicted using functional annotated attributes and can be used to highlight aberrant protein abundance values. John Wiley and Sons Inc. 2020-10-16 2020-11 /pmc/articles/PMC7757209/ /pubmed/32937025 http://dx.doi.org/10.1002/pmic.202000009 Text en © 2020 The Authors. 
Proteomics published by Wiley‐VCH GmbH This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Articles Barzine, Mitra Parissa Freivalds, Karlis Wright, James C. Opmanis, Mārtiņš Rituma, Darta Ghavidel, Fatemeh Zamanzad Jarnuczak, Andrew F. Celms, Edgars Čerāns, Kārlis Jonassen, Inge Lace, Lelde Antonio Vizcaíno, Juan Choudhary, Jyoti Sharma Brazma, Alvis Viksna, Juris Using Deep Learning to Extrapolate Protein Expression Measurements |
title | Using Deep Learning to Extrapolate Protein Expression Measurements |
title_full | Using Deep Learning to Extrapolate Protein Expression Measurements |
title_fullStr | Using Deep Learning to Extrapolate Protein Expression Measurements |
title_full_unstemmed | Using Deep Learning to Extrapolate Protein Expression Measurements |
title_short | Using Deep Learning to Extrapolate Protein Expression Measurements |
title_sort | using deep learning to extrapolate protein expression measurements |
topic | Research Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7757209/ https://www.ncbi.nlm.nih.gov/pubmed/32937025 http://dx.doi.org/10.1002/pmic.202000009 |