
RNA-NRD: a non-redundant RNA structural dataset for benchmarking and functional analysis

The significance of RNA functions and their role in evolution and disease control have remarkably increased the research scope in the field of RNA science. Though the availability of RNA structure data in PDB has been growing tremendously, maintaining their quality and integrity has become the great...


Bibliographic Details
Main authors: Khan, Nabila Shahnaz, Rahaman, Md Mahfuzur, Islam, Shahidul, Zhang, Shaojie
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10132383/
https://www.ncbi.nlm.nih.gov/pubmed/37123530
http://dx.doi.org/10.1093/nargab/lqad040
_version_ 1785031377361567744
author Khan, Nabila Shahnaz
Rahaman, Md Mahfuzur
Islam, Shahidul
Zhang, Shaojie
author_facet Khan, Nabila Shahnaz
Rahaman, Md Mahfuzur
Islam, Shahidul
Zhang, Shaojie
author_sort Khan, Nabila Shahnaz
collection PubMed
description The significance of RNA functions and their role in evolution and disease control have remarkably increased the research scope in the field of RNA science. Though the availability of RNA structure data in PDB has been growing tremendously, maintaining their quality and integrity has become the greater challenge. Since the data available in PDB are the results of independent research efforts, they might contain redundancy. As a result, there remains a possibility of data bias for both protein and RNA chains. Quite a few studies have been conducted to remove the redundancy of protein structures by introducing high-quality representatives. However, the amount of research done to remove the redundancy of RNA structures is still very low. To remove RNA chain redundancy in PDB, we have introduced RNA-NRD, a non-redundant dataset of RNA chains based on sequence and 3D structural similarity. We compared RNA-NRD with the existing non-redundant RNA structure dataset RS-RNA and showed that it has better-formed clusters of redundant RNA chains with lower average RMSD and higher average PSI, thus improving the overall quality of the dataset.
format Online
Article
Text
id pubmed-10132383
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10132383 2023-04-27 RNA-NRD: a non-redundant RNA structural dataset for benchmarking and functional analysis Khan, Nabila Shahnaz Rahaman, Md Mahfuzur Islam, Shahidul Zhang, Shaojie NAR Genom Bioinform Standard Article The significance of RNA functions and their role in evolution and disease control have remarkably increased the research scope in the field of RNA science. Though the availability of RNA structure data in PDB has been growing tremendously, maintaining their quality and integrity has become the greater challenge. Since the data available in PDB are the results of independent research efforts, they might contain redundancy. As a result, there remains a possibility of data bias for both protein and RNA chains. Quite a few studies have been conducted to remove the redundancy of protein structures by introducing high-quality representatives. However, the amount of research done to remove the redundancy of RNA structures is still very low. To remove RNA chain redundancy in PDB, we have introduced RNA-NRD, a non-redundant dataset of RNA chains based on sequence and 3D structural similarity. We compared RNA-NRD with the existing non-redundant RNA structure dataset RS-RNA and showed that it has better-formed clusters of redundant RNA chains with lower average RMSD and higher average PSI, thus improving the overall quality of the dataset. Oxford University Press 2023-04-26 /pmc/articles/PMC10132383/ /pubmed/37123530 http://dx.doi.org/10.1093/nargab/lqad040 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Standard Article
Khan, Nabila Shahnaz
Rahaman, Md Mahfuzur
Islam, Shahidul
Zhang, Shaojie
RNA-NRD: a non-redundant RNA structural dataset for benchmarking and functional analysis
title RNA-NRD: a non-redundant RNA structural dataset for benchmarking and functional analysis
title_full RNA-NRD: a non-redundant RNA structural dataset for benchmarking and functional analysis
title_fullStr RNA-NRD: a non-redundant RNA structural dataset for benchmarking and functional analysis
title_full_unstemmed RNA-NRD: a non-redundant RNA structural dataset for benchmarking and functional analysis
title_short RNA-NRD: a non-redundant RNA structural dataset for benchmarking and functional analysis
title_sort rna-nrd: a non-redundant rna structural dataset for benchmarking and functional analysis
topic Standard Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10132383/
https://www.ncbi.nlm.nih.gov/pubmed/37123530
http://dx.doi.org/10.1093/nargab/lqad040
work_keys_str_mv AT khannabilashahnaz rnanrdanonredundantrnastructuraldatasetforbenchmarkingandfunctionalanalysis
AT rahamanmdmahfuzur rnanrdanonredundantrnastructuraldatasetforbenchmarkingandfunctionalanalysis
AT islamshahidul rnanrdanonredundantrnastructuraldatasetforbenchmarkingandfunctionalanalysis
AT zhangshaojie rnanrdanonredundantrnastructuraldatasetforbenchmarkingandfunctionalanalysis