
Prediction and classification of aminoacyl tRNA synthetases using PROSITE domains

BACKGROUND: Aminoacyl tRNA synthetases (aaRSs) catalyse the first step of protein synthesis in all organisms. They are responsible for the precise attachment of amino acids to their cognate transfer RNAs. There are twenty different types of aaRSs, unique for each amino acid. These aaRSs have been di...


Bibliographic Details
Main authors:	Panwar, Bharat; Raghava, Gajendra PS
Format:	Text
Language:	English
Published:	BioMed Central 2010
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2997003/ https://www.ncbi.nlm.nih.gov/pubmed/20860794 http://dx.doi.org/10.1186/1471-2164-11-507

_version_	1782193255187742720
author	Panwar, Bharat; Raghava, Gajendra PS
author_facet	Panwar, Bharat; Raghava, Gajendra PS
author_sort	Panwar, Bharat
collection	PubMed
description	BACKGROUND: Aminoacyl tRNA synthetases (aaRSs) catalyse the first step of protein synthesis in all organisms. They are responsible for the precise attachment of amino acids to their cognate transfer RNAs. There are twenty different types of aaRSs, one for each amino acid. These aaRSs have been divided into two classes, each comprising ten enzymes. It is important to predict and classify aaRSs in order to understand protein synthesis. RESULTS: In this study, all models were developed on a non-redundant dataset containing 117 aaRSs and an equal number of non-aaRSs, in which no two sequences have more than 30% similarity. First, we applied the similarity search technique BLAST and achieved a maximum accuracy of 67.52%. We observed that 62% of tRNA synthetases contain one or more of the following four PROSITE domains: PS50862, PS00178, PS50860 and PS50861. An SVM-based model was developed to discriminate between aaRSs and non-aaRSs; using selected dipeptide composition, it achieved a maximum MCC of 0.68 with an accuracy of 83.73%. We developed a hybrid approach, in which the SVM model uses selected dipeptide composition together with information on the four PROSITE domains, and achieved a maximum MCC of 0.72 with an accuracy of 85.49%. We further developed an SVM-based model for classifying aaRSs into class-1 and class-2 using selected dipeptide composition and achieved an MCC of 0.79. We also observed that two domains (PS00178, PS50889) were preferred in class-1 and three domains (PS50862, PS50860, PS50861) in class-2. A hybrid method was developed using these domains as descriptors, along with selected dipeptide composition, and achieved an MCC of 0.87 with a sensitivity of 94.55% and an accuracy of 93.19%. All models were evaluated using five-fold cross-validation. CONCLUSIONS: We have analyzed protein sequences of aaRSs (class-1 and class-2) and non-aaRSs and identified interesting patterns. The high accuracy achieved by our SVM models using selected dipeptide composition demonstrates that certain types of dipeptide are preferred in aaRSs. We were able to identify PROSITE domains that are preferred in aaRSs and their classes, providing interesting insights into tRNA synthetases. The method developed in this study will be useful for researchers studying aaRS enzymes and tRNA biology. The web server based on this study is available at http://www.imtech.res.in/raghava/icaars/.
format	Text
id	pubmed-2997003
institution	National Center for Biotechnology Information
language	English
publishDate	2010
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-2997003 2011-05-03 Prediction and classification of aminoacyl tRNA synthetases using PROSITE domains Panwar, Bharat; Raghava, Gajendra PS BMC Genomics Research Article BioMed Central 2010-09-22 /pmc/articles/PMC2997003/ /pubmed/20860794 http://dx.doi.org/10.1186/1471-2164-11-507 Text en Copyright ©2010 Panwar and Raghava; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Research Article Panwar, Bharat Raghava, Gajendra PS Prediction and classification of aminoacyl tRNA synthetases using PROSITE domains
title	Prediction and classification of aminoacyl tRNA synthetases using PROSITE domains
title_full	Prediction and classification of aminoacyl tRNA synthetases using PROSITE domains
title_fullStr	Prediction and classification of aminoacyl tRNA synthetases using PROSITE domains
title_full_unstemmed	Prediction and classification of aminoacyl tRNA synthetases using PROSITE domains
title_short	Prediction and classification of aminoacyl tRNA synthetases using PROSITE domains
title_sort	prediction and classification of aminoacyl trna synthetases using prosite domains
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2997003/ https://www.ncbi.nlm.nih.gov/pubmed/20860794 http://dx.doi.org/10.1186/1471-2164-11-507
work_keys_str_mv	AT panwarbharat predictionandclassificationofaminoacyltrnasynthetasesusingprositedomains AT raghavagajendraps predictionandclassificationofaminoacyltrnasynthetasesusingprositedomains
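
The description field above refers to dipeptide composition as the SVM input feature and to the Matthews correlation coefficient (MCC) as the main performance measure. The following is a minimal sketch of how those two quantities are commonly computed, not the authors' implementation: the record does not say how dipeptides were selected or normalised, so the percentage normalisation, the handling of non-standard residues, and the function names `dipeptide_composition` and `mcc` are all assumptions made for illustration.

```python
from itertools import product
from math import sqrt

# The 20 standard amino acids and all 400 ordered pairs (assumed feature order).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-element list: percentage of each dipeptide in the sequence.

    Assumption: overlapping pairs are counted, and pairs containing a
    non-standard residue (e.g. X) are skipped.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [100.0 * counts[d] / total for d in DIPEPTIDES]


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom
```

A feature vector produced this way could be passed to any SVM library; the selection of a dipeptide subset and the addition of PROSITE domain indicators described in the abstract are not reproduced here.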


Similar items