
SmProt: A Reliable Repository with Comprehensive Annotation of Small Proteins Identified from Ribosome Profiling

Small proteins are proteins of fewer than 100 amino acids translated from small open reading frames (sORFs), which were usually missed in previous genome annotations. Their significance has become apparent in recent years with the discovery of their d...


Bibliographic Details
Main authors: Li, Yanyan, Zhou, Honghong, Chen, Xiaomin, Zheng, Yu, Kang, Quan, Hao, Di, Zhang, Lili, Song, Tingrui, Luo, Huaxia, Hao, Yajing, Chen, Runsheng, Zhang, Peng, He, Shunmin
Format: Online Article Text
Language: English
Published: Elsevier 2021
Subjects: Database
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9039559/
https://www.ncbi.nlm.nih.gov/pubmed/34536568
http://dx.doi.org/10.1016/j.gpb.2021.09.002
author Li, Yanyan
Zhou, Honghong
Chen, Xiaomin
Zheng, Yu
Kang, Quan
Hao, Di
Zhang, Lili
Song, Tingrui
Luo, Huaxia
Hao, Yajing
Chen, Runsheng
Zhang, Peng
He, Shunmin
author_sort Li, Yanyan
collection PubMed
description Small proteins are proteins of fewer than 100 amino acids translated from small open reading frames (sORFs), which were usually missed in previous genome annotations. Their significance has become apparent in recent years with the discovery of their diverse functions. However, systematic annotation of small proteins is still insufficient. SmProt was developed to provide information on small proteins for the scientific community. Here we present the update of SmProt, which emphasizes the reliability of translated sORFs, genetic variants in translated sORFs, and disease-specific sORF translation events or sequences, and which substantially increases the data volume. More components, such as non-ATG translation initiation, function, and new sources, are also included. SmProt incorporates 638,958 unique small proteins curated from 3,165,229 primary records, which were computationally predicted from 419 ribosome profiling (Ribo-seq) datasets or collected from the literature and other sources, covering 370 cell lines or tissues in 8 species (Homo sapiens, Mus musculus, Rattus norvegicus, Drosophila melanogaster, Danio rerio, Saccharomyces cerevisiae, Caenorhabditis elegans, and Escherichia coli). In addition, small protein families identified from human microbiomes were collected. All datasets in SmProt are freely accessible and available for browsing, searching, and bulk download at http://bigdata.ibp.ac.cn/SmProt/.
format Online
Article
Text
id pubmed-9039559
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-9039559 2022-04-27 SmProt: A Reliable Repository with Comprehensive Annotation of Small Proteins Identified from Ribosome Profiling Genomics Proteomics Bioinformatics Database
Elsevier 2021-08 2021-09-15 /pmc/articles/PMC9039559/ /pubmed/34536568 http://dx.doi.org/10.1016/j.gpb.2021.09.002 Text en © 2021 The Authors. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title SmProt: A Reliable Repository with Comprehensive Annotation of Small Proteins Identified from Ribosome Profiling
topic Database