
Comprehensive analysis of human protein N-termini enables assessment of various protein forms

Various forms of protein (proteoforms) are generated by genetic variations, alternative splicing, alternative translation initiation, co- or post-translational modification and proteolysis. Different proteoforms are in part discovered by characterizing their N-terminal sequences. Here, we introduce...


Bibliographic Details
Main Authors: Yeom, Jeonghun; Ju, Shinyeong; Choi, YunJin; Paek, Eunok; Lee, Cheolju
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5529458/
https://www.ncbi.nlm.nih.gov/pubmed/28747677
http://dx.doi.org/10.1038/s41598-017-06314-9
author Yeom, Jeonghun
Ju, Shinyeong
Choi, YunJin
Paek, Eunok
Lee, Cheolju
collection PubMed
description Various forms of protein (proteoforms) are generated by genetic variations, alternative splicing, alternative translation initiation, co- or post-translational modification and proteolysis. Different proteoforms are in part discovered by characterizing their N-terminal sequences. Here, we introduce an N-terminal-peptide-enrichment method, Nrich. Filter-aided negative selection formed the basis for the use of two N-blocking reagents and two endoproteases in this method. We identified 6,525 acetylated (or partially acetylated) and 6,570 free protein N-termini arising from 5,727 proteins in HEK293T human cells. The protein N-termini included translation initiation sites annotated in the UniProtKB database, putative alternative translational initiation sites, and N-terminal sites exposed after signal/transit/pro-peptide removal or unknown processing, revealing various proteoforms in cells. In addition, 46 novel protein N-termini were identified in 5′ untranslated region (UTR) sequence with pseudo start codons. Our data showing the observation of N-terminal sequences of mature proteins constitutes a useful resource that may provide information for a better understanding of various proteoforms in cells.
format Online
Article
Text
id pubmed-5529458
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-5529458 2017-08-02 Comprehensive analysis of human protein N-termini enables assessment of various protein forms Yeom, Jeonghun Ju, Shinyeong Choi, YunJin Paek, Eunok Lee, Cheolju Sci Rep Article Nature Publishing Group UK 2017-07-26 /pmc/articles/PMC5529458/ /pubmed/28747677 http://dx.doi.org/10.1038/s41598-017-06314-9 Text en © The Author(s) 2017 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Comprehensive analysis of human protein N-termini enables assessment of various protein forms
topic Article