
Composition bias and the origin of ORFan genes

Motivation: Intriguingly, sequence analysis of genomes reveals that a large number of genes are unique to each organism. The origin of these genes, termed ORFans, is not known. Here, we explore the origin of ORFan genes by defining a simple measure called ‘composition bias’, based on the deviation o...


Bibliographic Details
Main authors: Yomtovian, Inbal; Teerakulkittipong, Nuttinee; Lee, Byungkook; Moult, John; Unger, Ron
Format: Text
Language: English
Published: Oxford University Press 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2853687/
https://www.ncbi.nlm.nih.gov/pubmed/20231229
http://dx.doi.org/10.1093/bioinformatics/btq093
author Yomtovian, Inbal
Teerakulkittipong, Nuttinee
Lee, Byungkook
Moult, John
Unger, Ron
collection PubMed
description Motivation: Intriguingly, sequence analysis of genomes reveals that a large number of genes are unique to each organism. The origin of these genes, termed ORFans, is not known. Here, we explore the origin of ORFan genes by defining a simple measure called ‘composition bias’, based on the deviation of the amino acid composition of a given sequence from the average composition of all proteins of a given genome. Results: For a set of 47 prokaryotic genomes, we show that the amino acid composition bias of real proteins, random ‘proteins’ (created by using the nucleotide frequencies of each genome) and ‘proteins’ translated from intergenic regions are distinct. For ORFans, we observed a correlation between their composition bias and their relative evolutionary age. Recent ORFan proteins have compositions more similar to those of random ‘proteins’, while the compositions of more ancient ORFan proteins are more similar to those of the set of all proteins of the organism. This observation is consistent with an evolutionary scenario wherein ORFan genes emerged and underwent a large number of random mutations and selection, eventually adapting to the composition preference of their organism over time. Contact: ron@biocoml.ls.biu.ac.il Supplementary information: Supplementary data are available at Bioinformatics online.
format Text
id pubmed-2853687
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2853687 2010-04-14 Composition bias and the origin of ORFan genes Yomtovian, Inbal Teerakulkittipong, Nuttinee Lee, Byungkook Moult, John Unger, Ron Bioinformatics Discovery Note Oxford University Press 2010-04-15 2010-03-15 /pmc/articles/PMC2853687/ /pubmed/20231229 http://dx.doi.org/10.1093/bioinformatics/btq093 Text en © The Author(s) 2010. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Composition bias and the origin of ORFan genes
topic Discovery Note
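
Illustrative note: the abstract in this record describes 'composition bias' as the deviation of a sequence's amino acid composition from the average composition of all proteins in a genome, but it does not state the formula. The sketch below is one possible reading, not the authors' method. It assumes a Euclidean distance between frequency vectors and an unweighted mean of per-protein compositions. The function names and the toy sequences are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Return the fraction of each standard amino acid in seq."""
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def average_composition(proteins):
    """Unweighted mean of the per-protein compositions (an assumption)."""
    comps = [composition(p) for p in proteins]
    return {aa: sum(c[aa] for c in comps) / len(comps) for aa in AMINO_ACIDS}


def composition_bias(seq, reference):
    """Euclidean distance from seq's composition to the reference.

    The distance metric is an assumption; the abstract only says 'deviation'.
    """
    comp = composition(seq)
    return sqrt(sum((comp[aa] - reference[aa]) ** 2 for aa in AMINO_ACIDS))


# Toy "genome" of three made-up sequences, for illustration only.
toy_proteome = ["MKVLAAGIVG", "MSTNPKPQRK", "MALWMRLLPL"]
reference = average_composition(toy_proteome)

for name, seq in [("typical", "MKVLAAGIVG"), ("skewed", "WWWWWWWWWW")]:
    print(name, round(composition_bias(seq, reference), 3))
```

Under these assumptions, a sequence drawn from the reference set scores lower than a sequence made of a single residue, which mirrors the abstract's contrast between proteins adapted to the genome's composition and random-like sequences.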