Seqpare: a novel metric of similarity between genomic interval sets
Searching genomic interval sets produced by sequencing methods has been widely and routinely performed; however, existing metrics for quantifying similarities among interval sets are inconsistent. Here we introduce Seqpare, a self-consistent and effective metric of similarity and tool for comparing...
Main authors: | Feng, Selena C.; Sheffield, Nathan C.; Feng, Jianglin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | F1000 Research Limited 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7808057/ https://www.ncbi.nlm.nih.gov/pubmed/33500773 http://dx.doi.org/10.12688/f1000research.23390.2 |
_version_ | 1783636841995436032 |
---|---|
author | Feng, Selena C. Sheffield, Nathan C. Feng, Jianglin |
author_facet | Feng, Selena C. Sheffield, Nathan C. Feng, Jianglin |
author_sort | Feng, Selena C. |
collection | PubMed |
description | Searching genomic interval sets produced by sequencing methods has been widely and routinely performed; however, existing metrics for quantifying similarities among interval sets are inconsistent. Here we introduce Seqpare, a self-consistent and effective metric of similarity and tool for comparing sequences based on their interval sets. With this metric, the similarity of two interval sets is quantified by a single index, the ratio of their effective overlap over the union: an index of zero indicates unrelated interval sets, and an index of one means that the interval sets are identical. Analysis and tests confirm the effectiveness and self-consistency of the Seqpare metric. |
format | Online Article Text |
id | pubmed-7808057 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | F1000 Research Limited |
record_format | MEDLINE/PubMed |
spelling | pubmed-78080572021-01-25 Seqpare: a novel metric of similarity between genomic interval sets Feng, Selena C. Sheffield, Nathan C. Feng, Jianglin F1000Res Software Tool Article Searching genomic interval sets produced by sequencing methods has been widely and routinely performed; however, existing metrics for quantifying similarities among interval sets are inconsistent. Here we introduce Seqpare, a self-consistent and effective metric of similarity and tool for comparing sequences based on their interval sets. With this metric, the similarity of two interval sets is quantified by a single index, the ratio of their effective overlap over the union: an index of zero indicates unrelated interval sets, and an index of one means that the interval sets are identical. Analysis and tests confirm the effectiveness and self-consistency of the Seqpare metric. F1000 Research Limited 2021-01-04 /pmc/articles/PMC7808057/ /pubmed/33500773 http://dx.doi.org/10.12688/f1000research.23390.2 Text en Copyright: © 2021 Feng SC et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Software Tool Article Feng, Selena C. Sheffield, Nathan C. Feng, Jianglin Seqpare: a novel metric of similarity between genomic interval sets |
title | Seqpare: a novel metric of similarity between genomic interval sets |
title_full | Seqpare: a novel metric of similarity between genomic interval sets |
title_fullStr | Seqpare: a novel metric of similarity between genomic interval sets |
title_full_unstemmed | Seqpare: a novel metric of similarity between genomic interval sets |
title_short | Seqpare: a novel metric of similarity between genomic interval sets |
title_sort | seqpare: a novel metric of similarity between genomic interval sets |
topic | Software Tool Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7808057/ https://www.ncbi.nlm.nih.gov/pubmed/33500773 http://dx.doi.org/10.12688/f1000research.23390.2 |
work_keys_str_mv | AT fengselenac seqpareanovelmetricofsimilaritybetweengenomicintervalsets AT sheffieldnathanc seqpareanovelmetricofsimilaritybetweengenomicintervalsets AT fengjianglin seqpareanovelmetricofsimilaritybetweengenomicintervalsets |
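
The description field above defines the index as the ratio of effective overlap to union, ranging from zero (unrelated sets) to one (identical sets). The record does not spell out how "effective overlap" is computed, so the sketch below is only one possible reading and should not be taken as the published tool's implementation. It assumes that each pair of intervals is scored by overlap length divided by union length, that only mutually best-matching pairs contribute, and that the union is the total interval count minus the effective overlap. The function names, the half-open coordinates, and the single-chromosome restriction are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

# Assumption: intervals are half-open [start, end) on a single chromosome.
Interval = Tuple[int, int]


def pair_score(a: Interval, b: Interval) -> float:
    """Overlap length divided by union length for two intervals (0.0 if disjoint)."""
    overlap = min(a[1], b[1]) - max(a[0], b[0])
    if overlap <= 0:
        return 0.0
    union = max(a[1], b[1]) - min(a[0], b[0])
    return overlap / union


def best_match(x: Interval, candidates: List[Interval]) -> Tuple[int, float]:
    """Index and score of the best-scoring candidate, or (-1, 0.0) if none overlaps."""
    best_idx, best = -1, 0.0
    for j, c in enumerate(candidates):
        s = pair_score(x, c)
        if s > best:
            best_idx, best = j, s
    return best_idx, best


def similarity_index(set_a: List[Interval], set_b: List[Interval]) -> float:
    """Assumed index: effective overlap / (len(A) + len(B) - effective overlap)."""
    if not set_a and not set_b:
        return 0.0  # illustrative choice to avoid dividing by zero
    effective = 0.0
    for i, a in enumerate(set_a):
        j, s = best_match(a, set_b)
        # Count a pair only when each interval is the other's best match.
        if j >= 0 and best_match(set_b[j], set_a)[0] == i:
            effective += s
    return effective / (len(set_a) + len(set_b) - effective)


if __name__ == "__main__":
    same = [(0, 10), (20, 30)]
    print(similarity_index(same, same))               # identical sets
    print(similarity_index([(0, 10)], [(50, 60)]))    # unrelated sets
    print(similarity_index([(0, 10)], [(5, 15)]))     # partial overlap
```

This brute-force version compares every interval against every other, which is quadratic in the set sizes; it is meant to make the zero-to-one behaviour of the index concrete, not to scale to real sequencing data.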