A generative model for constructing nucleic acid sequences binding to a protein
BACKGROUND: Interactions between protein and nucleic acid molecules are essential to a variety of cellular processes. A large amount of interaction data generated by high-throughput technologies have triggered the development of several computational methods either to predict binding sites in a sequ...
Main Authors: | Im, Jinho, Park, Byungkyu, Han, Kyungsook |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | Research |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6933682/ https://www.ncbi.nlm.nih.gov/pubmed/31881936 http://dx.doi.org/10.1186/s12864-019-6299-4 |
_version_ | 1783483257292062720 |
---|---|
author | Im, Jinho Park, Byungkyu Han, Kyungsook |
author_facet | Im, Jinho Park, Byungkyu Han, Kyungsook |
author_sort | Im, Jinho |
collection | PubMed |
description | BACKGROUND: Interactions between protein and nucleic acid molecules are essential to a variety of cellular processes. A large amount of interaction data generated by high-throughput technologies have triggered the development of several computational methods either to predict binding sites in a sequence or to determine whether a pair of sequences interacts or not. Most of these methods treat the problem of the interaction of nucleic acids with proteins as a classification problem rather than a generation problem. RESULTS: We developed a generative model for constructing single-stranded nucleic acids binding to a target protein using a long short-term memory (LSTM) neural network. Experimental results of the generative model are promising in the sense that DNA and RNA sequences generated by the model for several target proteins show high specificity and that motifs present in the generated sequences are similar to known protein-binding motifs. CONCLUSIONS: Although these are preliminary results of our ongoing research, our approach can be used to generate nucleic acid sequences binding to a target protein. In particular, it will help design efficient in vitro experiments by constructing an initial pool of potential aptamers that bind to a target protein with high affinity and specificity. |
format | Online Article Text |
id | pubmed-6933682 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6933682 2019-12-30 A generative model for constructing nucleic acid sequences binding to a protein Im, Jinho Park, Byungkyu Han, Kyungsook BMC Genomics Research BACKGROUND: Interactions between protein and nucleic acid molecules are essential to a variety of cellular processes. A large amount of interaction data generated by high-throughput technologies have triggered the development of several computational methods either to predict binding sites in a sequence or to determine whether a pair of sequences interacts or not. Most of these methods treat the problem of the interaction of nucleic acids with proteins as a classification problem rather than a generation problem. RESULTS: We developed a generative model for constructing single-stranded nucleic acids binding to a target protein using a long short-term memory (LSTM) neural network. Experimental results of the generative model are promising in the sense that DNA and RNA sequences generated by the model for several target proteins show high specificity and that motifs present in the generated sequences are similar to known protein-binding motifs. CONCLUSIONS: Although these are preliminary results of our ongoing research, our approach can be used to generate nucleic acid sequences binding to a target protein. In particular, it will help design efficient in vitro experiments by constructing an initial pool of potential aptamers that bind to a target protein with high affinity and specificity.
BioMed Central 2019-12-27 /pmc/articles/PMC6933682/ /pubmed/31881936 http://dx.doi.org/10.1186/s12864-019-6299-4 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Im, Jinho Park, Byungkyu Han, Kyungsook A generative model for constructing nucleic acid sequences binding to a protein |
title | A generative model for constructing nucleic acid sequences binding to a protein |
title_full | A generative model for constructing nucleic acid sequences binding to a protein |
title_fullStr | A generative model for constructing nucleic acid sequences binding to a protein |
title_full_unstemmed | A generative model for constructing nucleic acid sequences binding to a protein |
title_short | A generative model for constructing nucleic acid sequences binding to a protein |
title_sort | generative model for constructing nucleic acid sequences binding to a protein |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6933682/ https://www.ncbi.nlm.nih.gov/pubmed/31881936 http://dx.doi.org/10.1186/s12864-019-6299-4 |
work_keys_str_mv | AT imjinho agenerativemodelforconstructingnucleicacidsequencesbindingtoaprotein AT parkbyungkyu agenerativemodelforconstructingnucleicacidsequencesbindingtoaprotein AT hankyungsook agenerativemodelforconstructingnucleicacidsequencesbindingtoaprotein AT imjinho generativemodelforconstructingnucleicacidsequencesbindingtoaprotein AT parkbyungkyu generativemodelforconstructingnucleicacidsequencesbindingtoaprotein AT hankyungsook generativemodelforconstructingnucleicacidsequencesbindingtoaprotein |