Efficient algorithms for analyzing segmental duplications with deletions and inversions in genomes

BACKGROUND: Segmental duplications, or low-copy repeats, are common in mammalian genomes. In the human genome, most segmental duplications are mosaics comprised of multiple duplicated fragments. This complex genomic organization complicates analysis of the evolutionary history of these sequences. On...

Bibliographic Details
Main Authors: Kahn, Crystal L, Mozes, Shay, Raphael, Benjamin J
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2820476/
https://www.ncbi.nlm.nih.gov/pubmed/20047668
http://dx.doi.org/10.1186/1748-7188-5-11
author Kahn, Crystal L
Mozes, Shay
Raphael, Benjamin J
collection PubMed
description BACKGROUND: Segmental duplications, or low-copy repeats, are common in mammalian genomes. In the human genome, most segmental duplications are mosaics comprised of multiple duplicated fragments. This complex genomic organization complicates analysis of the evolutionary history of these sequences. One model proposed to explain these mosaic patterns is a model of repeated aggregation and subsequent duplication of genomic sequences. RESULTS: We describe a polynomial-time exact algorithm to compute duplication distance, a genomic distance defined as the most parsimonious way to build a target string by repeatedly copying substrings of a fixed source string. This distance models the process of repeated aggregation and duplication. We also describe extensions of this distance to include certain types of substring deletions and inversions. Finally, we provide a description of a sequence of duplication events as a context-free grammar (CFG). CONCLUSION: These new genomic distances will permit more biologically realistic analyses of segmental duplications in genomes.
format Text
id pubmed-2820476
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2820476 2010-02-12 Efficient algorithms for analyzing segmental duplications with deletions and inversions in genomes Kahn, Crystal L Mozes, Shay Raphael, Benjamin J Algorithms Mol Biol Research BioMed Central 2010-01-04 /pmc/articles/PMC2820476/ /pubmed/20047668 http://dx.doi.org/10.1186/1748-7188-5-11 Text en Copyright ©2010 Kahn et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Efficient algorithms for analyzing segmental duplications with deletions and inversions in genomes
topic Research