
Prediction and classification of aminoacyl tRNA synthetases using PROSITE domains

BACKGROUND: Aminoacyl tRNA synthetases (aaRSs) catalyse the first step of protein synthesis in all organisms. They are responsible for the precise attachment of amino acids to their cognate transfer RNAs. There are twenty different types of aaRSs, unique for each amino acid. These aaRSs have been di...


Bibliographic Details
Main authors: Panwar, Bharat, Raghava, Gajendra PS
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2997003/
https://www.ncbi.nlm.nih.gov/pubmed/20860794
http://dx.doi.org/10.1186/1471-2164-11-507
_version_ 1782193255187742720
author Panwar, Bharat
Raghava, Gajendra PS
author_facet Panwar, Bharat
Raghava, Gajendra PS
author_sort Panwar, Bharat
collection PubMed
description BACKGROUND: Aminoacyl tRNA synthetases (aaRSs) catalyse the first step of protein synthesis in all organisms. They are responsible for the precise attachment of amino acids to their cognate transfer RNAs. There are twenty different types of aaRSs, unique for each amino acid. These aaRSs have been divided into two classes, each comprising ten enzymes. It is important to predict and classify aaRSs in order to understand protein synthesis. RESULTS: In this study, all models were developed on a non-redundant dataset containing 117 aaRSs and an equal number of non-aaRSs, in which no two sequences have more than 30% similarity. First, we applied the similarity search technique, BLAST, and achieved a maximum accuracy of 67.52%. We observed that 62% of tRNA synthetases contain one or more domains from amongst the following four PROSITE domains: PS50862, PS00178, PS50860 and PS50861. An SVM-based model was developed to discriminate between aaRSs and non-aaRSs and achieved a maximum MCC of 0.68 with an accuracy of 83.73%, using selected dipeptide composition. We developed a hybrid approach, in which the SVM model was developed using selected dipeptide composition and information on the four PROSITE domains, and achieved a maximum MCC of 0.72 with an accuracy of 85.49%. We further developed an SVM-based model for classifying the aaRSs into class-1 and class-2, using selected dipeptide composition, and achieved an MCC of 0.79. We also observed that two domains (PS00178, PS50889) in class-1 and three domains (PS50862, PS50860, PS50861) in class-2 were preferred. A hybrid method was developed using these domains as descriptors, along with selected dipeptide composition, and achieved an MCC of 0.87 with a sensitivity of 94.55% and an accuracy of 93.19%. All models were evaluated using a five-fold cross-validation technique. CONCLUSIONS: We have analyzed protein sequences of aaRSs (class-1 and class-2) and non-aaRSs and identified interesting patterns. The high accuracy achieved by our SVM models using selected dipeptide composition demonstrates that certain types of dipeptides are preferred in aaRSs. We were able to identify PROSITE domains that are preferred in aaRSs and their classes, providing interesting insights into tRNA synthetases. The method developed in this study will be useful for researchers studying aaRS enzymes and tRNA biology. The web server based on this study is available at http://www.imtech.res.in/raghava/icaars/.
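The abstract names three ingredients without giving formulas: dipeptide composition, PROSITE domain information used as descriptors, and the Matthews correlation coefficient (MCC). The following is a minimal Python sketch of how such quantities are commonly computed, not the authors' implementation. Several details are assumptions: the percentage-based dipeptide composition, the encoding of domain hits as 0/1 flags, and the function names `dipeptide_composition`, `hybrid_features` and `mcc`. The feature selection step ("selected" dipeptides) and the SVM itself are not shown.

```python
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 20 x 20 = 400 ordered pairs of standard amino acids.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(seq):
    """Return a 400-element vector of dipeptide percentages for a sequence.

    Assumption: composition = 100 * count(dipeptide) / total valid dipeptides.
    Pairs containing non-standard residues (e.g. 'X') are skipped.
    """
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [100.0 * counts[d] / total for d in DIPEPTIDES]


def hybrid_features(seq, domain_hits,
                    domains=("PS50862", "PS00178", "PS50860", "PS50861")):
    """Append one 0/1 flag per PROSITE domain to the dipeptide vector.

    Assumption: the abstract does not say how domain information was
    encoded; binary presence flags are used here for illustration only.
    `domain_hits` is the set of PROSITE accessions found in the sequence.
    """
    flags = [1.0 if d in domain_hits else 0.0 for d in domains]
    return dipeptide_composition(seq) + flags


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column is empty
    return (tp * tn - fp * fn) / denom
```

A feature vector built this way could be passed to any SVM implementation and scored with `mcc` over five cross-validation folds; the kernel and its parameters are not stated in the abstract and are left open here.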
format Text
id pubmed-2997003
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2997003 2011-05-03 Prediction and classification of aminoacyl tRNA synthetases using PROSITE domains Panwar, Bharat Raghava, Gajendra PS BMC Genomics Research Article BioMed Central 2010-09-22 /pmc/articles/PMC2997003/ /pubmed/20860794 http://dx.doi.org/10.1186/1471-2164-11-507 Text en Copyright ©2010 Panwar and Raghava; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Panwar, Bharat
Raghava, Gajendra PS
Prediction and classification of aminoacyl tRNA synthetases using PROSITE domains
title Prediction and classification of aminoacyl tRNA synthetases using PROSITE domains
title_full Prediction and classification of aminoacyl tRNA synthetases using PROSITE domains
title_fullStr Prediction and classification of aminoacyl tRNA synthetases using PROSITE domains
title_full_unstemmed Prediction and classification of aminoacyl tRNA synthetases using PROSITE domains
title_short Prediction and classification of aminoacyl tRNA synthetases using PROSITE domains
title_sort prediction and classification of aminoacyl trna synthetases using prosite domains
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2997003/
https://www.ncbi.nlm.nih.gov/pubmed/20860794
http://dx.doi.org/10.1186/1471-2164-11-507
work_keys_str_mv AT panwarbharat predictionandclassificationofaminoacyltrnasynthetasesusingprositedomains
AT raghavagajendraps predictionandclassificationofaminoacyltrnasynthetasesusingprositedomains