Computational models with thermodynamic and composition features improve siRNA design
BACKGROUND: Small interfering RNAs (siRNAs) have become an important tool in cell and molecular biology. Reliable design of siRNA molecules is essential for the needs of large functional genomics projects. RESULTS: To improve the design of efficient siRNA molecules, we performed a comparative, therm...
Main Authors: | Shabalina, Svetlana A, Spiridonov, Alexey N, Ogurtsov, Aleksey Y |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2006 |
Subjects: | Methodology Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1431570/ https://www.ncbi.nlm.nih.gov/pubmed/16472402 http://dx.doi.org/10.1186/1471-2105-7-65 |
_version_ | 1782127207777304576 |
---|---|
author | Shabalina, Svetlana A Spiridonov, Alexey N Ogurtsov, Aleksey Y |
author_facet | Shabalina, Svetlana A Spiridonov, Alexey N Ogurtsov, Aleksey Y |
author_sort | Shabalina, Svetlana A |
collection | PubMed |
description | BACKGROUND: Small interfering RNAs (siRNAs) have become an important tool in cell and molecular biology. Reliable design of siRNA molecules is essential for the needs of large functional genomics projects. RESULTS: To improve the design of efficient siRNA molecules, we performed a comparative, thermodynamic and correlation analysis on a heterogeneous set of 653 siRNAs collected from the literature. We used this training set to select siRNA features and optimize computational models. We identified 18 parameters that correlate significantly with silencing efficiency. Some of these parameters characterize only the siRNA sequence, while others involve the whole mRNA. Most importantly, we derived an siRNA position-dependent consensus, and optimized the free-energy difference of the 5' and 3' terminal dinucleotides of the siRNA antisense strand. The position-dependent consensus is based on correlation and t-test analyses of the training set, and accounts for both significantly preferred and avoided nucleotides in all sequence positions. On the training set, the two parameters' correlation with silencing efficiency was 0.5 and 0.36, respectively. Among other features, a dinucleotide content index and the frequency of potential targets for siRNA in the mRNA added predictive power to our model (R = 0.55). We showed that our model is effective for predicting the efficiency of siRNAs at different concentrations. We optimized a neural network model on our training set using three parameters characterizing the siRNA sequence, and predicted efficiencies for the test siRNA dataset recently published by Novartis. On this validation set, the correlation coefficient between predicted and observed efficiency was 0.75. Using the same model, we performed a transcriptome-wide analysis of optimal siRNA targets for 22,600 human mRNAs. CONCLUSION: We demonstrated that the properties of the siRNAs themselves are essential for efficient RNA interference. The 5' ends of antisense strands of efficient siRNAs are U-rich and possess a content similarity to the pyrimidine-rich oligonucleotides interacting with the polypurine RNA tracks that are recognized by RNase H. The advantage of our method over similar methods is the small number of parameters. As a result, our method requires a much smaller training set to produce consistent results. Other mRNA features, though expensive to compute, can slightly improve our model. |
format | Text |
id | pubmed-1431570 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2006 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-1431570 2006-04-06 Computational models with thermodynamic and composition features improve siRNA design Shabalina, Svetlana A Spiridonov, Alexey N Ogurtsov, Aleksey Y BMC Bioinformatics Methodology Article BACKGROUND: Small interfering RNAs (siRNAs) have become an important tool in cell and molecular biology. Reliable design of siRNA molecules is essential for the needs of large functional genomics projects. RESULTS: To improve the design of efficient siRNA molecules, we performed a comparative, thermodynamic and correlation analysis on a heterogeneous set of 653 siRNAs collected from the literature. We used this training set to select siRNA features and optimize computational models. We identified 18 parameters that correlate significantly with silencing efficiency. Some of these parameters characterize only the siRNA sequence, while others involve the whole mRNA. Most importantly, we derived an siRNA position-dependent consensus, and optimized the free-energy difference of the 5' and 3' terminal dinucleotides of the siRNA antisense strand. The position-dependent consensus is based on correlation and t-test analyses of the training set, and accounts for both significantly preferred and avoided nucleotides in all sequence positions. On the training set, the two parameters' correlation with silencing efficiency was 0.5 and 0.36, respectively. Among other features, a dinucleotide content index and the frequency of potential targets for siRNA in the mRNA added predictive power to our model (R = 0.55). We showed that our model is effective for predicting the efficiency of siRNAs at different concentrations. We optimized a neural network model on our training set using three parameters characterizing the siRNA sequence, and predicted efficiencies for the test siRNA dataset recently published by Novartis. On this validation set, the correlation coefficient between predicted and observed efficiency was 0.75. Using the same model, we performed a transcriptome-wide analysis of optimal siRNA targets for 22,600 human mRNAs. CONCLUSION: We demonstrated that the properties of the siRNAs themselves are essential for efficient RNA interference. The 5' ends of antisense strands of efficient siRNAs are U-rich and possess a content similarity to the pyrimidine-rich oligonucleotides interacting with the polypurine RNA tracks that are recognized by RNase H. The advantage of our method over similar methods is the small number of parameters. As a result, our method requires a much smaller training set to produce consistent results. Other mRNA features, though expensive to compute, can slightly improve our model. BioMed Central 2006-02-12 /pmc/articles/PMC1431570/ /pubmed/16472402 http://dx.doi.org/10.1186/1471-2105-7-65 Text en Copyright © 2006 Shabalina et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Article Shabalina, Svetlana A Spiridonov, Alexey N Ogurtsov, Aleksey Y Computational models with thermodynamic and composition features improve siRNA design |
title | Computational models with thermodynamic and composition features improve siRNA design |
title_full | Computational models with thermodynamic and composition features improve siRNA design |
title_fullStr | Computational models with thermodynamic and composition features improve siRNA design |
title_full_unstemmed | Computational models with thermodynamic and composition features improve siRNA design |
title_short | Computational models with thermodynamic and composition features improve siRNA design |
title_sort | computational models with thermodynamic and composition features improve sirna design |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1431570/ https://www.ncbi.nlm.nih.gov/pubmed/16472402 http://dx.doi.org/10.1186/1471-2105-7-65 |
work_keys_str_mv | AT shabalinasvetlanaa computationalmodelswiththermodynamicandcompositionfeaturesimprovesirnadesign AT spiridonovalexeyn computationalmodelswiththermodynamicandcompositionfeaturesimprovesirnadesign AT ogurtsovalekseyy computationalmodelswiththermodynamicandcompositionfeaturesimprovesirnadesign |
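
The abstract in this record reports results as correlation coefficients between a sequence feature (or a model prediction) and silencing efficiency, and notes that the 5' ends of efficient antisense strands are U-rich. The sketch below only illustrates that kind of calculation and is not the authors' method: the function names, the 5-nucleotide window, and the toy sequences and efficiency values are all assumptions made up for this example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def five_prime_u_fraction(antisense, window=5):
    """Fraction of U in the first `window` nucleotides (5' end) of a strand.

    The window size is a hypothetical choice, not a value from the paper.
    """
    seq = antisense.upper().replace("T", "U")  # accept DNA-style input
    head = seq[:window]
    if not head:
        raise ValueError("empty sequence")
    return head.count("U") / len(head)


# Toy data invented purely for illustration (not taken from the paper).
toy_antisense = [
    "UUAUCGGCAUGCAUGCAGC",
    "GCGCAUUAGCAUCGAUGCA",
    "UAUUAGCGCAUGCAUGCAU",
    "CCGGAUCGAUCGUAGCUAG",
]
toy_efficiency = [0.9, 0.3, 0.8, 0.2]

features = [five_prime_u_fraction(s) for s in toy_antisense]
r = pearson_r(features, toy_efficiency)
print(f"toy correlation between 5' U fraction and efficiency: {r:.2f}")
```

With real data, `toy_antisense` and `toy_efficiency` would be replaced by measured siRNA sequences and their observed silencing efficiencies; the same `pearson_r` call could also compare a model's predicted efficiencies against observed ones.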