Detailed analysis of putative genes encoding small proteins in legume genomes
Diverse plant genome sequencing projects coupled with powerful bioinformatics tools have facilitated massive data analysis to construct specialized databases classified according to cellular function. However, there are still a considerable number of genes encoding proteins whose function has not ye...
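Main authors: | Guillén, Gabriel; Díaz-Camino, Claudia; Loyola-Torres, Carlos A.; Aparicio-Fabre, Rosaura; Hernández-López, Alejandrina; Díaz-Sánchez, Mauricio; Sanchez, Federico |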
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2013 |
Subjects: | Plant Science |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3687714/ https://www.ncbi.nlm.nih.gov/pubmed/23802007 http://dx.doi.org/10.3389/fpls.2013.00208 |
_version_ | 1782273974962487296 |
---|---|
author | Guillén, Gabriel Díaz-Camino, Claudia Loyola-Torres, Carlos A. Aparicio-Fabre, Rosaura Hernández-López, Alejandrina Díaz-Sánchez, Mauricio Sanchez, Federico |
author_facet | Guillén, Gabriel Díaz-Camino, Claudia Loyola-Torres, Carlos A. Aparicio-Fabre, Rosaura Hernández-López, Alejandrina Díaz-Sánchez, Mauricio Sanchez, Federico |
author_sort | Guillén, Gabriel |
collection | PubMed |
description | Diverse plant genome sequencing projects coupled with powerful bioinformatics tools have facilitated massive data analysis to construct specialized databases classified according to cellular function. However, there are still a considerable number of genes encoding proteins whose function has not yet been characterized. Included in this category are small proteins (SPs, 30–150 amino acids) encoded by short open reading frames (sORFs). SPs play important roles in plant physiology, growth, and development. Unfortunately, protocols focused on the genome-wide identification and characterization of sORFs are scarce or remain poorly implemented. As a result, these genes are underrepresented in many genome annotations. In this work, we exploited publicly available genome sequences of Phaseolus vulgaris, Medicago truncatula, Glycine max, and Lotus japonicus to analyze the abundance of annotated SPs in plant legumes. Our strategy to uncover bona fide sORFs at the genome level was centered in bioinformatics analysis of characteristics such as evidence of expression (transcription), presence of known protein regions or domains, and identification of orthologous genes in the genomes explored. We collected 6170, 10,461, 30,521, and 23,599 putative sORFs from P. vulgaris, G. max, M. truncatula, and L. japonicus genomes, respectively. Expressed sequence tags (ESTs) available in the DFCI Gene Index database provided evidence that ~one-third of the predicted legume sORFs are expressed. Most potential SPs have a counterpart in a different plant species and counterpart regions or domains in larger proteins. Potential functional sORFs were also classified according to a reduced set of GO categories, and the expression of 13 of them during P. vulgaris nodule ontogeny was confirmed by qPCR. This analysis provides a collection of sORFs that potentially encode for meaningful SPs, and offers the possibility of their further functional evaluation. |
format | Online Article Text |
id | pubmed-3687714 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-3687714 2013-06-25 Detailed analysis of putative genes encoding small proteins in legume genomes Guillén, Gabriel Díaz-Camino, Claudia Loyola-Torres, Carlos A. Aparicio-Fabre, Rosaura Hernández-López, Alejandrina Díaz-Sánchez, Mauricio Sanchez, Federico Front Plant Sci Plant Science Diverse plant genome sequencing projects coupled with powerful bioinformatics tools have facilitated massive data analysis to construct specialized databases classified according to cellular function. However, there are still a considerable number of genes encoding proteins whose function has not yet been characterized. Included in this category are small proteins (SPs, 30–150 amino acids) encoded by short open reading frames (sORFs). SPs play important roles in plant physiology, growth, and development. Unfortunately, protocols focused on the genome-wide identification and characterization of sORFs are scarce or remain poorly implemented. As a result, these genes are underrepresented in many genome annotations. In this work, we exploited publicly available genome sequences of Phaseolus vulgaris, Medicago truncatula, Glycine max, and Lotus japonicus to analyze the abundance of annotated SPs in plant legumes. Our strategy to uncover bona fide sORFs at the genome level was centered in bioinformatics analysis of characteristics such as evidence of expression (transcription), presence of known protein regions or domains, and identification of orthologous genes in the genomes explored. We collected 6170, 10,461, 30,521, and 23,599 putative sORFs from P. vulgaris, G. max, M. truncatula, and L. japonicus genomes, respectively. Expressed sequence tags (ESTs) available in the DFCI Gene Index database provided evidence that ~one-third of the predicted legume sORFs are expressed. Most potential SPs have a counterpart in a different plant species and counterpart regions or domains in larger proteins. Potential functional sORFs were also classified according to a reduced set of GO categories, and the expression of 13 of them during P. vulgaris nodule ontogeny was confirmed by qPCR. This analysis provides a collection of sORFs that potentially encode for meaningful SPs, and offers the possibility of their further functional evaluation. Frontiers Media S.A. 2013-06-20 /pmc/articles/PMC3687714/ /pubmed/23802007 http://dx.doi.org/10.3389/fpls.2013.00208 Text en Copyright © 2013 Guillén, Díaz-Camino, Loyola-Torres, Aparicio-Fabre, Hernández-López, Díaz-Sánchez and Sanchez. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc. |
spellingShingle | Plant Science Guillén, Gabriel Díaz-Camino, Claudia Loyola-Torres, Carlos A. Aparicio-Fabre, Rosaura Hernández-López, Alejandrina Díaz-Sánchez, Mauricio Sanchez, Federico Detailed analysis of putative genes encoding small proteins in legume genomes |
title | Detailed analysis of putative genes encoding small proteins in legume genomes |
title_full | Detailed analysis of putative genes encoding small proteins in legume genomes |
title_fullStr | Detailed analysis of putative genes encoding small proteins in legume genomes |
title_full_unstemmed | Detailed analysis of putative genes encoding small proteins in legume genomes |
title_short | Detailed analysis of putative genes encoding small proteins in legume genomes |
title_sort | detailed analysis of putative genes encoding small proteins in legume genomes |
topic | Plant Science |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3687714/ https://www.ncbi.nlm.nih.gov/pubmed/23802007 http://dx.doi.org/10.3389/fpls.2013.00208 |
work_keys_str_mv | AT guillengabriel detailedanalysisofputativegenesencodingsmallproteinsinlegumegenomes AT diazcaminoclaudia detailedanalysisofputativegenesencodingsmallproteinsinlegumegenomes AT loyolatorrescarlosa detailedanalysisofputativegenesencodingsmallproteinsinlegumegenomes AT apariciofabrerosaura detailedanalysisofputativegenesencodingsmallproteinsinlegumegenomes AT hernandezlopezalejandrina detailedanalysisofputativegenesencodingsmallproteinsinlegumegenomes AT diazsanchezmauricio detailedanalysisofputativegenesencodingsmallproteinsinlegumegenomes AT sanchezfederico detailedanalysisofputativegenesencodingsmallproteinsinlegumegenomes |