A unified approach to sequential and non-sequential structure alignment of proteins, RNAs, and DNAs

Bibliographic Details
Main authors: Zhang, Chengxin; Pyle, Anna Marie
Format: Online Article Text
Language: English
Published: Elsevier 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9557024/
https://www.ncbi.nlm.nih.gov/pubmed/36248743
http://dx.doi.org/10.1016/j.isci.2022.105218
author Zhang, Chengxin
Pyle, Anna Marie
author_facet Zhang, Chengxin
Pyle, Anna Marie
author_sort Zhang, Chengxin
collection PubMed
description Many distantly related structure pairs exhibit structural similarities that can only be fully captured by a non-sequential alignment program. We present US-align2, a unified protocol for both sequential and non-sequential alignment of proteins and nucleic acids. On manually curated reference alignments for protein structural pairs with non-sequential relations, US-align2 achieves ≥13% higher agreement with reference alignments than existing sequential and non-sequential alignment methods. Non-sequential alignments also enabled US-align2 to have higher sensitivities in detecting RNA pairs from the same family with sequence identities <40%, obtaining ≥9% higher area under the receiver operating characteristic curve than third-party programs. The unique ability of US-align2 to parse both proteins and nucleic acids allows the method to detect protein-RNA and protein-DNA mimicries. Additionally, US-align2 performs full and semi-non-sequential alignments with at least 48% and 14% faster speed than existing programs for the same tasks, making it particularly useful for large-scale structural similarity detection.
format Online
Article
Text
id pubmed-9557024
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-9557024 2022-10-14 A unified approach to sequential and non-sequential structure alignment of proteins, RNAs, and DNAs. Zhang, Chengxin; Pyle, Anna Marie. iScience, Article. Elsevier 2022-09-28. /pmc/articles/PMC9557024/ /pubmed/36248743 http://dx.doi.org/10.1016/j.isci.2022.105218 Text en © 2022 The Authors. https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
topic Article