smORFer: a modular algorithm to detect small ORFs in prokaryotes

Emerging evidence places small proteins (≤50 amino acids) more centrally in physiological processes. Yet, their functional identification and the systematic genome annotation of their cognate small open-reading frames (smORFs) remain challenging both experimentally and computationally. Ribosome pro...

Bibliographic Details
Main authors: Bartholomäus, Alexander; Kolte, Baban; Mustafayeva, Ayten; Goebel, Ingrid; Fuchs, Stephan; Benndorf, Dirk; Engelmann, Susanne; Ignatova, Zoya
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8421149/
https://www.ncbi.nlm.nih.gov/pubmed/34125903
http://dx.doi.org/10.1093/nar/gkab477
_version_ 1783749016666767360
author Bartholomäus, Alexander
Kolte, Baban
Mustafayeva, Ayten
Goebel, Ingrid
Fuchs, Stephan
Benndorf, Dirk
Engelmann, Susanne
Ignatova, Zoya
author_facet Bartholomäus, Alexander
Kolte, Baban
Mustafayeva, Ayten
Goebel, Ingrid
Fuchs, Stephan
Benndorf, Dirk
Engelmann, Susanne
Ignatova, Zoya
author_sort Bartholomäus, Alexander
collection PubMed
description Emerging evidence places small proteins (≤50 amino acids) more centrally in physiological processes. Yet, their functional identification and the systematic genome annotation of their cognate small open-reading frames (smORFs) remain challenging both experimentally and computationally. Ribosome profiling, or Ribo-Seq (deep sequencing of ribosome-protected fragments), enables detection of actively translated open-reading frames (ORFs) and empirical annotation of coding sequences (CDSs) using the in-register translation pattern that is characteristic of genuinely translating ribosomes. Multiple identifiers of ORFs that use the 3-nt periodicity in Ribo-Seq data sets have been successful in eukaryotic smORF annotation. However, they have difficulty evaluating prokaryotic genomes because of their unique architecture (e.g. polycistronic messages, overlapping ORFs, leaderless translation, non-canonical initiation, etc.). Here, we present a new algorithm, smORFer, which detects putative smORFs in prokaryotic organisms with high accuracy. The unique feature of smORFer is its integrated approach: it considers structural features of the genetic sequence along with in-frame translation, and uses a Fourier transform to convert these parameters into a measurable score to faithfully select smORFs. The algorithm is executed in a modular way, and depending on the data available for a particular organism, different modules can be selected for the smORF search.
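The idea of turning 3-nt periodicity into a score with a Fourier transform can be illustrated with a minimal Python sketch. This is not the smORFer implementation: the function name `periodicity_score`, the input (a list of per-nucleotide Ribo-Seq read counts over one candidate ORF) and the normalization (share of non-constant spectral power at period 3) are all assumptions made for illustration.

```python
import cmath
import math


def periodicity_score(coverage):
    """Share of spectral power at a 3-nt period (hypothetical score).

    coverage: per-nucleotide read counts across one candidate ORF.
    Returns a value between 0 and 1; values near 1 mean that almost
    all variation in the signal repeats every three nucleotides.
    """
    n = len(coverage) - len(coverage) % 3  # trim to whole codons
    if n < 6:
        raise ValueError("need at least two codons of coverage")
    signal = coverage[:n]
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the constant component

    def power(k):
        # Squared magnitude of the k-th discrete Fourier coefficient.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        return abs(coeff) ** 2

    # Total power over the positive, non-zero frequencies.
    total = sum(power(k) for k in range(1, n // 2 + 1))
    if total == 0:
        return 0.0  # flat coverage carries no periodic signal
    # Index n // 3 corresponds to one cycle every three nucleotides.
    return power(n // 3) / total


if __name__ == "__main__":
    in_frame = [10, 0, 0] * 10   # reads pile up on one codon position
    alternating = [10, 0] * 15   # period 2, not codon-like
    print(periodicity_score(in_frame))
    print(periodicity_score(alternating))
```

In this sketch, coverage concentrated on one codon position scores close to 1, while flat coverage or coverage with a different period scores close to 0. Any cutoff used to call a candidate translated would be a further assumption.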
format Online
Article
Text
id pubmed-8421149
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8421149 2021-09-09 smORFer: a modular algorithm to detect small ORFs in prokaryotes Bartholomäus, Alexander Kolte, Baban Mustafayeva, Ayten Goebel, Ingrid Fuchs, Stephan Benndorf, Dirk Engelmann, Susanne Ignatova, Zoya Nucleic Acids Res Methods Online Oxford University Press 2021-06-14 /pmc/articles/PMC8421149/ /pubmed/34125903 http://dx.doi.org/10.1093/nar/gkab477 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of Nucleic Acids Research.
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methods Online
Bartholomäus, Alexander
Kolte, Baban
Mustafayeva, Ayten
Goebel, Ingrid
Fuchs, Stephan
Benndorf, Dirk
Engelmann, Susanne
Ignatova, Zoya
smORFer: a modular algorithm to detect small ORFs in prokaryotes
title smORFer: a modular algorithm to detect small ORFs in prokaryotes
title_full smORFer: a modular algorithm to detect small ORFs in prokaryotes
title_fullStr smORFer: a modular algorithm to detect small ORFs in prokaryotes
title_full_unstemmed smORFer: a modular algorithm to detect small ORFs in prokaryotes
title_short smORFer: a modular algorithm to detect small ORFs in prokaryotes
title_sort smorfer: a modular algorithm to detect small orfs in prokaryotes
topic Methods Online
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8421149/
https://www.ncbi.nlm.nih.gov/pubmed/34125903
http://dx.doi.org/10.1093/nar/gkab477
work_keys_str_mv AT bartholomausalexander smorferamodularalgorithmtodetectsmallorfsinprokaryotes
AT koltebaban smorferamodularalgorithmtodetectsmallorfsinprokaryotes
AT mustafayevaayten smorferamodularalgorithmtodetectsmallorfsinprokaryotes
AT goebelingrid smorferamodularalgorithmtodetectsmallorfsinprokaryotes
AT fuchsstephan smorferamodularalgorithmtodetectsmallorfsinprokaryotes
AT benndorfdirk smorferamodularalgorithmtodetectsmallorfsinprokaryotes
AT engelmannsusanne smorferamodularalgorithmtodetectsmallorfsinprokaryotes
AT ignatovazoya smorferamodularalgorithmtodetectsmallorfsinprokaryotes