RNA-NRD: a non-redundant RNA structural dataset for benchmarking and functional analysis
The significance of RNA functions and their role in evolution and disease control have remarkably increased the research scope in the field of RNA science. Though the availability of RNA structure data in PDB has been growing tremendously, maintaining their quality and integrity has become the great...
Main authors: | Khan, Nabila Shahnaz; Rahaman, Md Mahfuzur; Islam, Shahidul; Zhang, Shaojie |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10132383/ https://www.ncbi.nlm.nih.gov/pubmed/37123530 http://dx.doi.org/10.1093/nargab/lqad040 |
_version_ | 1785031377361567744 |
---|---|
author | Khan, Nabila Shahnaz; Rahaman, Md Mahfuzur; Islam, Shahidul; Zhang, Shaojie |
author_sort | Khan, Nabila Shahnaz |
collection | PubMed |
description | The significance of RNA functions and their role in evolution and disease control have remarkably increased the research scope in the field of RNA science. Though the availability of RNA structure data in PDB has been growing tremendously, maintaining their quality and integrity has become the greater challenge. Since the data available in PDB are results of different independent research, they might contain redundancy. As a result, there remains a possibility of data bias for both protein and RNA chains. Quite a few studies have been conducted to remove the redundancy of protein structures by introducing high-quality representatives. However, the amount of research done to remove the redundancy of RNA structures is still very low. To remove RNA chain redundancy in PDB, we have introduced RNA-NRD, a non-redundant dataset of RNA chains based on sequence and 3D structural similarity. We compared RNA-NRD with the existing non-redundant RNA structure dataset RS-RNA and showed that it has better-formed clusters of redundant RNA chains with lower average RMSD and higher average PSI, thus improving the overall quality of the dataset. |
format | Online Article Text |
id | pubmed-10132383 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-101323832023-04-27 RNA-NRD: a non-redundant RNA structural dataset for benchmarking and functional analysis Khan, Nabila Shahnaz Rahaman, Md Mahfuzur Islam, Shahidul Zhang, Shaojie NAR Genom Bioinform Standard Article The significance of RNA functions and their role in evolution and disease control have remarkably increased the research scope in the field of RNA science. Though the availability of RNA structure data in PDB has been growing tremendously, maintaining their quality and integrity has become the greater challenge. Since the data available in PDB are results of different independent research, they might contain redundancy. As a result, there remains a possibility of data bias for both protein and RNA chains. Quite a few studies have been conducted to remove the redundancy of protein structures by introducing high-quality representatives. However, the amount of research done to remove the redundancy of RNA structures is still very low. To remove RNA chain redundancy in PDB, we have introduced RNA-NRD, a non-redundant dataset of RNA chains based on sequence and 3D structural similarity. We compared RNA-NRD with the existing non-redundant RNA structure dataset RS-RNA and showed that it has better-formed clusters of redundant RNA chains with lower average RMSD and higher average PSI, thus improving the overall quality of the dataset. Oxford University Press 2023-04-26 /pmc/articles/PMC10132383/ /pubmed/37123530 http://dx.doi.org/10.1093/nargab/lqad040 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Standard Article Khan, Nabila Shahnaz Rahaman, Md Mahfuzur Islam, Shahidul Zhang, Shaojie RNA-NRD: a non-redundant RNA structural dataset for benchmarking and functional analysis |
title | RNA-NRD: a non-redundant RNA structural dataset for benchmarking and functional analysis |
topic | Standard Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10132383/ https://www.ncbi.nlm.nih.gov/pubmed/37123530 http://dx.doi.org/10.1093/nargab/lqad040 |