Seqpare: a novel metric of similarity between genomic interval sets

Searching genomic interval sets produced by sequencing methods has been widely and routinely performed; however, existing metrics for quantifying similarities among interval sets are inconsistent. Here we introduce Seqpare, a self-consistent and effective metric of similarity and tool for comparing...

Bibliographic Details
Main authors: Feng, Selena C., Sheffield, Nathan C., Feng, Jianglin
Format: Online Article Text
Language: English
Published: F1000 Research Limited 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7808057/
https://www.ncbi.nlm.nih.gov/pubmed/33500773
http://dx.doi.org/10.12688/f1000research.23390.2
author Feng, Selena C.
Sheffield, Nathan C.
Feng, Jianglin
collection PubMed
description Searching genomic interval sets produced by sequencing methods has been widely and routinely performed; however, existing metrics for quantifying similarities among interval sets are inconsistent. Here we introduce Seqpare, a self-consistent and effective metric of similarity and tool for comparing sequences based on their interval sets. With this metric, the similarity of two interval sets is quantified by a single index, the ratio of their effective overlap over the union: an index of zero indicates unrelated interval sets, and an index of one means that the interval sets are identical. Analysis and tests confirm the effectiveness and self-consistency of the Seqpare metric.
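The index described above (effective overlap divided by union, ranging from zero to one) can be illustrated with a minimal Python sketch. This record does not define "effective overlap", so the sketch assumes one plausible reading: each interval pair is scored by its Jaccard index, pairs are matched one-to-one greedily by score, the matched scores are summed, and the union is counted as the number of intervals in both sets minus that sum. The function names `pair_jaccard` and `similarity_index` are hypothetical and are not taken from the Seqpare tool.

```python
from typing import List, Tuple

# Half-open interval [start, end) on a single chromosome (assumed convention).
Interval = Tuple[int, int]


def pair_jaccard(a: Interval, b: Interval) -> float:
    """Jaccard index of two intervals: overlap length over union length."""
    inter = min(a[1], b[1]) - max(a[0], b[0])
    if inter <= 0:
        return 0.0
    # For overlapping intervals the union is one contiguous span.
    union = max(a[1], b[1]) - min(a[0], b[0])
    return inter / union


def similarity_index(set1: List[Interval], set2: List[Interval]) -> float:
    """Assumed index: summed one-to-one pair scores over a count-based union."""
    # Score every pair, best scores first (quadratic; fine for illustration).
    scored = sorted(
        ((pair_jaccard(a, b), i, j)
         for i, a in enumerate(set1)
         for j, b in enumerate(set2)),
        reverse=True,
    )
    used1, used2 = set(), set()
    effective_overlap = 0.0
    for score, i, j in scored:
        if score == 0.0:
            break  # remaining pairs do not overlap
        if i in used1 or j in used2:
            continue  # each interval may be matched at most once
        used1.add(i)
        used2.add(j)
        effective_overlap += score
    union = len(set1) + len(set2) - effective_overlap
    # Two empty sets have no union; return 0.0 by convention in this sketch.
    return effective_overlap / union if union else 0.0


if __name__ == "__main__":
    a = [(0, 10), (20, 30)]
    print(similarity_index(a, a))                    # identical sets
    print(similarity_index(a, [(40, 50)]))           # no overlap
    print(similarity_index([(0, 10)], [(5, 15)]))    # partial overlap
```

Under these assumptions, identical sets score one and non-overlapping sets score zero, matching the endpoints stated in the description; the intermediate values depend on the matching rule chosen here.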
format Online
Article
Text
id pubmed-7808057
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher F1000 Research Limited
record_format MEDLINE/PubMed
spelling pubmed-7808057 2021-01-25 Seqpare: a novel metric of similarity between genomic interval sets Feng, Selena C. Sheffield, Nathan C. Feng, Jianglin F1000Res Software Tool Article Searching genomic interval sets produced by sequencing methods has been widely and routinely performed; however, existing metrics for quantifying similarities among interval sets are inconsistent. Here we introduce Seqpare, a self-consistent and effective metric of similarity and tool for comparing sequences based on their interval sets. With this metric, the similarity of two interval sets is quantified by a single index, the ratio of their effective overlap over the union: an index of zero indicates unrelated interval sets, and an index of one means that the interval sets are identical. Analysis and tests confirm the effectiveness and self-consistency of the Seqpare metric. F1000 Research Limited 2021-01-04 /pmc/articles/PMC7808057/ /pubmed/33500773 http://dx.doi.org/10.12688/f1000research.23390.2 Text en Copyright: © 2021 Feng SC et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Software Tool Article
Feng, Selena C.
Sheffield, Nathan C.
Feng, Jianglin
Seqpare: a novel metric of similarity between genomic interval sets
title Seqpare: a novel metric of similarity between genomic interval sets
topic Software Tool Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7808057/
https://www.ncbi.nlm.nih.gov/pubmed/33500773
http://dx.doi.org/10.12688/f1000research.23390.2