Top-Down Crawl: a method for the ultra-rapid and motif-free alignment of sequences with associated binding metrics
SUMMARY: Several high-throughput protein–DNA binding methods currently available produce highly reproducible measurements of binding affinity at the level of the k-mer. However, understanding where a k-mer is positioned along a binding site sequence depends on alignment. Here, we present Top-Down Cr...
Main Authors: | Cooper, Brendon H; Chiu, Tsu-Pei; Rohs, Remo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2022 |
Subjects: | Applications Note |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9665867/ https://www.ncbi.nlm.nih.gov/pubmed/36179084 http://dx.doi.org/10.1093/bioinformatics/btac653 |
_version_ | 1784831380179386368 |
---|---|
author | Cooper, Brendon H; Chiu, Tsu-Pei; Rohs, Remo |
author_facet | Cooper, Brendon H; Chiu, Tsu-Pei; Rohs, Remo |
author_sort | Cooper, Brendon H |
collection | PubMed |
description | SUMMARY: Several high-throughput protein–DNA binding methods currently available produce highly reproducible measurements of binding affinity at the level of the k-mer. However, understanding where a k-mer is positioned along a binding site sequence depends on alignment. Here, we present Top-Down Crawl (TDC), an ultra-rapid tool designed for the alignment of k-mer level data in a rank-dependent and position weight matrix (PWM)-independent manner. As the framework only depends on the rank of the input, the method can accept input from many types of experiments (protein binding microarray, SELEX-seq, SMiLE-seq, etc.) without the need for specialized parameterization. Measuring the performance of the alignment using multiple linear regression with 5-fold cross-validation, we find TDC to perform as well as or better than computationally expensive PWM-based methods. AVAILABILITY AND IMPLEMENTATION: TDC can be run online at https://topdowncrawl.usc.edu or locally as a Python package available through pip at https://pypi.org/project/TopDownCrawl. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-9665867 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9665867 2022-11-16 Top-Down Crawl: a method for the ultra-rapid and motif-free alignment of sequences with associated binding metrics Cooper, Brendon H; Chiu, Tsu-Pei; Rohs, Remo Bioinformatics Applications Note SUMMARY: Several high-throughput protein–DNA binding methods currently available produce highly reproducible measurements of binding affinity at the level of the k-mer. However, understanding where a k-mer is positioned along a binding site sequence depends on alignment. Here, we present Top-Down Crawl (TDC), an ultra-rapid tool designed for the alignment of k-mer level data in a rank-dependent and position weight matrix (PWM)-independent manner. As the framework only depends on the rank of the input, the method can accept input from many types of experiments (protein binding microarray, SELEX-seq, SMiLE-seq, etc.) without the need for specialized parameterization. Measuring the performance of the alignment using multiple linear regression with 5-fold cross-validation, we find TDC to perform as well as or better than computationally expensive PWM-based methods. AVAILABILITY AND IMPLEMENTATION: TDC can be run online at https://topdowncrawl.usc.edu or locally as a Python package available through pip at https://pypi.org/project/TopDownCrawl. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2022-09-30 /pmc/articles/PMC9665867/ /pubmed/36179084 http://dx.doi.org/10.1093/bioinformatics/btac653 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Applications Note Cooper, Brendon H Chiu, Tsu-Pei Rohs, Remo Top-Down Crawl: a method for the ultra-rapid and motif-free alignment of sequences with associated binding metrics |
title | Top-Down Crawl: a method for the ultra-rapid and motif-free alignment of sequences with associated binding metrics |
topic | Applications Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9665867/ https://www.ncbi.nlm.nih.gov/pubmed/36179084 http://dx.doi.org/10.1093/bioinformatics/btac653 |
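
The description above states that alignment quality was measured using multiple linear regression with 5-fold cross-validation. The following is a minimal sketch of that kind of evaluation, not the TopDownCrawl package's own code or API. All function names are hypothetical, and the feature encoding (an intercept plus three indicator variables per position, with A as the reference base) and the pooled out-of-fold R² are assumptions chosen for illustration. The input is synthetic: every 3-mer with a purely additive score.

```python
import random
from itertools import product

BASES = "ACGT"

def one_hot(seq):
    """Encode an aligned sequence as [intercept, C/G/T indicators per position].
    Assumption: A is the reference base, so the design matrix is not collinear."""
    row = [1.0]
    for base in seq:
        row.extend(1.0 if base == b else 0.0 for b in "CGT")
    return row

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit(X, y, ridge=1e-8):
    """Least squares via the normal equations; the tiny ridge term only guards
    against a singular matrix when a base is missing from a training fold."""
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) + (ridge if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)

def predict(X, w):
    return [sum(a * b for a, b in zip(row, w)) for row in X]

def r_squared(y, y_hat):
    mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def cross_validated_r2(seqs, scores, k=5, seed=0):
    """Pooled out-of-fold R^2 from k-fold cross-validation."""
    idx = list(range(len(seqs)))
    random.Random(seed).shuffle(idx)
    X = [one_hot(s) for s in seqs]
    preds = [0.0] * len(seqs)
    for fold in range(k):
        test = idx[fold::k]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        w = fit([X[i] for i in train], [scores[i] for i in train])
        for i, p in zip(test, predict([X[i] for i in test], w)):
            preds[i] = p
    return r_squared(scores, preds)

# Synthetic demonstration data (not from the paper): additive per-position effects.
seqs = ["".join(p) for p in product(BASES, repeat=3)]
effect = {"A": 0.0, "C": 0.5, "G": 1.0, "T": 1.5}
scores = [sum((pos + 1) * effect[b] for pos, b in enumerate(s)) for s in seqs]
r2 = cross_validated_r2(seqs, scores)
print(f"5-fold cross-validated R^2: {r2:.4f}")
```

Because the synthetic scores are exactly additive, the cross-validated R² here is close to 1. With real binding data, a poorer alignment would be expected to place equivalent bases in different columns and lower this value, which is the intuition behind using it as an alignment quality measure.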